Saturday, March 17, 2012

Frequency of Choice

I have talked about this before, but the topic is so close to my heart that I wanted to give it a dedicated post.

As far as I understand it, the crux of capitalism is choice. In economics the word choice is substituted by market. What is a market? A market is where consumers exercise choice. If consumers don't have a choice, it is not a market. The assumption is that capitalism thrives on competition. Competition creates choice. Consumers will choose the best products at the lowest prices, forcing companies to innovate and reduce prices. The best will survive.

This is all true and then not quite true. Two main problems:

  • Most people don't like to think. Even if they can, the complexity of the world is sufficiently high that figuring out what is best for them is close to impossible. Eventually it comes down to either brand or price, because they make the decision simple.
  • Frequency of choice. Since this is all I want to talk about, I will use the next paragraph.

We are good at the stuff we do often, the old "practice makes a man perfect" thing. We buy petrol, vegetables, groceries, etc. almost every day. Price changes are felt, and a drop in quality is noticed. But then there are things that we don't do often. Things like joining a new job, getting married, buying a car or a home, taking a loan, choosing a college, getting the home painted, buying a TV or refrigerator or AC, casting our vote, choosing a laptop or OS, choosing an email client, signing up on a social network, etc. Many of these choices are irreversible, or if not irreversible, then choosing to alter our choice is very expensive. This is where capitalism fails miserably, because it is no longer about choosing from alternatives but about the choice of altering our choice. For products with short life spans, like vegetables or toothpaste, altering a choice is not expensive. Vegetables will last a few days, toothpaste a few weeks, and you can choose a better product next time. But with products that last years or decades, or in some cases lifetimes, it is the ability to alter the choice that is required, not the choice among products.

Specifics:
 
Consider the home loan business in India. Floating rates have been around for a long time now. What they float on is unknown, and once you take the loan you realize that the "unknown" is the whim of the bank. Usually your floating home loan interest rate will increase by 20%-40% within a few months of taking the loan, and now there is no choice. Well, there is a choice to switch to another home loan, but only if you pay 2%-4% of your home loan value as switching charges. This is as monopolistic as it gets, and we call it capitalism, the mecca of markets and choice. Even banks don't know if they are giving a good or bad interest rate to the customer, so how can the customer decide if he is getting a good deal, and whether that deal is good enough for the next 20 years? No one can. The only way I can know if I am getting a good deal is if I can switch my home loan at any moment I desire. That is what will make it a market.
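To put rough numbers on this, here is a minimal sketch using the standard equated monthly installment (EMI) formula. Everything specific in it is my assumption for illustration: the function names, the loan amount, the 10% starting rate, reading the "20%-40%" increase as relative to the starting rate (so 10% becomes 13% for a 30% rise), and the simplification that switching gets you back to the original rate.

```python
def monthly_payment(principal, annual_rate_pct, years):
    """Standard EMI formula: P * r * (1 + r)^n / ((1 + r)^n - 1)."""
    r = annual_rate_pct / 100 / 12   # monthly interest rate
    n = years * 12                   # number of monthly installments
    if r == 0:
        return principal / n
    growth = (1 + r) ** n
    return principal * r * growth / (growth - 1)


def switching_summary(principal, start_rate_pct, rate_rise_pct, fee_pct, years):
    """Compare staying on the raised rate with paying the fee to switch.

    Assumption (hypothetical): rate_rise_pct is relative to the starting
    rate, and the new lender offers the original starting rate.
    """
    raised_rate_pct = start_rate_pct * (1 + rate_rise_pct / 100)
    old_payment = monthly_payment(principal, start_rate_pct, years)
    new_payment = monthly_payment(principal, raised_rate_pct, years)
    switching_fee = principal * fee_pct / 100
    extra_per_month = new_payment - old_payment
    return {
        "raised_rate_pct": raised_rate_pct,
        "old_payment": old_payment,
        "new_payment": new_payment,
        "switching_fee": switching_fee,
        # Months of the extra payment it takes to equal the switching fee.
        "months_to_recover_fee": switching_fee / extra_per_month,
    }


if __name__ == "__main__":
    # Hypothetical loan: 3,000,000 at 10% for 20 years, 30% rise, 3% fee.
    for name, value in switching_summary(3_000_000, 10.0, 30.0, 3.0, 20).items():
        print(f"{name}: {value:,.2f}")
```

The point is not the exact figures but the shape of the trade: the fee is paid up front, while the benefit of switching arrives slowly and only if the new rate stays put.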

The same happens when switching a job (notice period), casting a vote (5-year gap), buying a car (10% value drop when it gets out of the showroom) and in many other places. In computers, the advent of SaaS-based companies has started filling this gap by giving customers a monthly choice to continue using them, where once there was a difficult one-time choice of finding the best product. Amazon EC2 gives the choice to use machines by the hour and the OS by the hour. I think governments which call themselves followers of capitalism have missed a point. It is not choice alone that matters; it is the frequency of choice that is at the core of efficient markets.

Tuesday, March 06, 2012

Threadpool and the task queue

Every architecture makes way for a threadpool and a task queue. Multiple threads wait on the queue, always ready to pick up the next task and execute it. Once it is implemented, the next task is tuning it. How many threads? What is the size of the queue? Does the queue block or throw an error when full? Is there a retry handler?

Before you start worrying about this, ask a simple question. How much time does it take to execute the task? If it is not at least a couple of orders of magnitude greater than the time it takes to do a context switch, don't bother with the threadpool/queue; just execute it right there, on your current thread.

Here is why:
  • The task queue has a lock. The more threads there are and the more often the queue is accessed, the more contention there is and the more time it takes to submit a task. There is an extra context switch just to acquire the lock. Basically you are doing serialization before getting to parallelism here. More threads + more tasks => more time per insert. Think of it like talking to a customer care executive (CCE). You do lots of IO using the IVR and finally reach the CCE, and the guy, instead of answering your questions, connects you to another guy and you need to explain the problem once again. That is pretty much how a context switch works. If you need to talk for 10-20 minutes, it might be worth it, but if all it takes is a few seconds of conversation, it just wastes time.
  • Once the task is submitted, it needs to wake up some thread. That is a context switch, which costs time.
  • By the time this new thread wakes up, because of the lock and the time elapsed, most of the variables it needs are out of cache, which costs more time. Read up on lock semantics for the JVM.
  • How do you do error handling from the task? Extra code, extra states.

You can avoid all this by executing the task inline, as a normal function call. It will run faster. It is easy to write and debug. The assumption here is that the task really takes a short time to execute and is mostly CPU intensive. A webserver using a threadpool is understandable. A single request might need to do file IO, access some locked resources, and possibly make multiple database queries. These are the kind of things that make sense in a threadpool: things that are complex enough to be simplified by using a new/dedicated "thread of execution". For other things, a function call is the most efficient.
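The hand-off cost is easy to measure for yourself. Below is a minimal sketch in Python (my choice of language for illustration; `tiny_task`, `run_inline` and `run_pooled` are hypothetical names) that runs the same trivial task inline and through `concurrent.futures.ThreadPoolExecutor`. Note one assumption loudly: in CPython the GIL prevents CPU-bound threads from running in parallel anyway, so this only illustrates the queue and wake-up overhead per task, not what a JVM threadpool would do.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def tiny_task(x):
    """A task far cheaper than a context switch."""
    return x * x + 1


def run_inline(n):
    """Execute every task as a plain function call on the current thread."""
    start = time.perf_counter()
    results = [tiny_task(i) for i in range(n)]
    return results, time.perf_counter() - start


def run_pooled(n, workers=4):
    """Submit every task to a thread pool and wait for each result."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each submit() takes the queue lock and may wake a worker thread.
        futures = [pool.submit(tiny_task, i) for i in range(n)]
        results = [f.result() for f in futures]
    return results, time.perf_counter() - start


if __name__ == "__main__":
    n = 20_000
    inline_results, inline_seconds = run_inline(n)
    pooled_results, pooled_seconds = run_pooled(n)
    assert inline_results == pooled_results
    print(f"inline: {inline_seconds:.4f}s")
    print(f"pooled: {pooled_seconds:.4f}s")
```

Both versions produce the same results; the only difference is how much time is spent getting each task to a thread.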

Monday, March 05, 2012

Out of select

Some non-blocking architectures use separate threads for IO and others as worker threads. One problem with this design is extra latency: even though the worker is done with its work, the IO thread is still blocked on select/epoll. One simple way to wake up the selector under such conditions is to open a pipe per selector thread. The read end is made part of the selector's fd set, and the worker thread writes one byte to the write end, which wakes up the selector. This ensures that whenever a new fd needs to be registered with select, the selector wakes up as soon as possible instead of waiting out the select timeout.
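The pipe trick can be sketched in a few lines. This is a minimal illustration in Python using the standard `selectors` and `os` modules, assuming a POSIX system (select on pipes does not work on Windows). The names `run_selector_once` and `demo`, the 5 second timeout and the simulated worker delay are all hypothetical choices of mine.

```python
import os
import selectors
import threading
import time


def run_selector_once(sel, wake_r, timeout):
    """Block in select() until some fd is ready or the timeout expires.

    Returns True if the wake-up pipe was the reason we returned.
    """
    woke = False
    for key, _events in sel.select(timeout):
        if key.fileobj == wake_r:
            os.read(wake_r, 4096)  # drain the wake-up byte(s)
            woke = True
    return woke


def demo(timeout=5.0, worker_delay=0.05):
    # One pipe per selector thread: read end goes into the fd set.
    wake_r, wake_w = os.pipe()
    os.set_blocking(wake_r, False)
    sel = selectors.DefaultSelector()
    sel.register(wake_r, selectors.EVENT_READ)

    def worker():
        time.sleep(worker_delay)  # pretend to do some work
        os.write(wake_w, b"x")    # one byte wakes up the selector

    thread = threading.Thread(target=worker)
    start = time.monotonic()
    thread.start()
    woke = run_selector_once(sel, wake_r, timeout)
    elapsed = time.monotonic() - start

    thread.join()
    sel.close()
    os.close(wake_r)
    os.close(wake_w)
    return woke, elapsed


if __name__ == "__main__":
    woke, elapsed = demo()
    print(f"woken by pipe: {woke}, waited {elapsed:.3f}s of a 5s timeout")
```

In a real server the selector loop would, after draining the pipe, pick up whatever the worker queued for it (a new fd to register, a response to write) before going back into select.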

Wednesday, February 08, 2012

AdSense for TV ads

TV ads are a $64 billion industry in the USA alone. And it works on TRP rating systems, which are nothing but statistical formulas applied to the TV viewing timings of a few thousand homes per country. The simplest way to disrupt this market is by controlling the remote control. A wifi-enabled remote running Android/iOS can give real-time information about who is watching which channel at what point in time. And once you have that, you already have AdSense for TV.

And that is what I think Apple and Google are going to fight for over the next few years.

What is the mystery “entertainment device” Google is testing?

Apple patents new touchscreen remote control for a future Apple TV

IntoNow, the iPad app purchased by Yahoo, was a nice step in this direction using audio matching technology. I guess this whole set of social TV apps is mostly about finding out who is watching what at any point in time.

The same objective can be fulfilled by making TVs more intelligent (Samsung TV apps) and also by network-connected set-top boxes. But I feel a universal remote is a much cleaner and simpler way to go about doing this. If Google were to crack this, they would not only be the entry point to anything we do in the browser but also the entry point to all the devices we use in our houses: when we switch lights on or off, how many times the microwave is used, how much TV we watch, etc.

It will be interesting to see how they market this and at what price points.
 

Thursday, February 02, 2012

Cacheismo learnings

I already knew Lua, memory management, writing servers and the other technical bits. I did learn the automake stuff to make sure people can compile it.

But the best part was something else. Marketing. I guess I failed miserably at that one. I don't know if anyone has downloaded the cacheismo code and tried to compile it, or if anyone is using it. I think it is one of my best works and it is free, and I don't know what more I need to do to convince people that it is better than memcached.

I tried the following:

  • I wrote a mail to the memcached group explaining cacheismo.
  • I wrote to the author of the highscalability.com blog. He was kind enough to include a link.
  • I created a cacheismo Google group. Only my friends joined. (Thanks!) No questions so far.
  • I tried to answer some of the questions on stackoverflow about memcached. I looked at problems which people face but can't solve with memcached. I tried answering the questions to the best of my knowledge and also provided information about how they can be solved using cacheismo. Someone removed all my posts :( from stackoverflow.

So I guess even if there exist people who might find cacheismo useful, it is kind of impossible for them to find it, unless of course they magically search for cacheismo on Google. So the question is, what is the plan? And the answer is, nothing much. I am not actively working on cacheismo. I will be more than happy to help anyone who wants to use it. I need to solve the discovery problem, and the plan there is to keep posting to stackoverflow...until the person who deletes my posts gets tired of it. Quora is another option. And maybe some videos on YouTube.

Maybe caching is not such a big problem for people and memcached is good enough. Well, in that case I will write some more servers. An HTTP proxy, something like haproxy but configurable in Lua, might be fun. Or maybe a websocket server for HTML5 applications.