Monday, September 26, 2011

Cacheismo Next Steps

The most important limitation of the current approach is that cacheismo cannot be used in clustered mode. The reason is that hashing the virtual key might choose a server different from the one where the actual key is stored. In general:
hash("set:new:myKey") != hash("set$myKey") != hash("set:count:myKey")
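To make the problem concrete, here is a minimal sketch in Python (not cacheismo code). It assumes virtual keys look like object:method:key, uses MD5 with a plain modulo to pick a server, and the server names are made up. The point is only that routing on the extracted key, instead of the whole virtual key, sends every operation on myKey to the same server.

```python
import hashlib

# Hypothetical node names, for illustration only.
SERVERS = ["cache-a", "cache-b", "cache-c"]


def server_for(key: str) -> str:
    """Pick a server by hashing the given string (simple modulo hashing)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:4], "big") % len(SERVERS)]


def actual_key(virtual_key: str) -> str:
    """Assumed layout: <object>:<method>:<key>. Return the <key> part."""
    parts = virtual_key.split(":", 2)
    if len(parts) != 3:
        raise ValueError("not a virtual key: " + virtual_key)
    return parts[2]


def route(virtual_key: str) -> str:
    """Route on the actual key so every method lands on the same server."""
    return server_for(actual_key(virtual_key))


if __name__ == "__main__":
    for vk in ("set:new:myKey", "set:count:myKey"):
        # Naive routing hashes the whole string; key-based routing does not.
        print(vk, "naive:", server_for(vk), "by key:", route(vk))
```

A client that hashes the whole string may send "set:new:myKey" and "set:count:myKey" to different servers, while route always agrees with server_for("myKey").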

I did some experiments with Lua threads and they look pretty cheap to create. The current plan is to make cacheismo nodes aware of each other and add client-side support to the cacheismo server. Instead of just looking up the key in local server memory, cacheismo will also hash the actual key on the server side and fetch the result from the cacheismo server responsible for it. This would double the latency, but it ensures that any virtual key can be thrown at any server and that server will fetch the results.
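A rough sketch of that plan, again in Python and with invented names (Node, owner_of, get_local): every node knows the full node list, hashes the actual key, and either answers from local memory or forwards the request to the owner. The direct method call stands in for what would be a network request, which is where the extra latency comes from.

```python
import hashlib


class Node:
    """Hypothetical cache node that knows about every other node."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster  # shared list of all nodes in the cluster
        self.store = {}         # local in-memory key/value store
        cluster.append(self)

    def owner_of(self, key):
        """Hash the actual key to find the node responsible for it."""
        digest = hashlib.md5(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.cluster)
        return self.cluster[index]

    def get_local(self, key):
        return self.store.get(key)

    def get(self, key):
        """Serve locally when possible, otherwise act as a client."""
        owner = self.owner_of(key)
        if owner is self:
            return self.get_local(key)
        # Stand-in for a network request to the responsible node.
        return owner.get_local(key)

    def put(self, key, value):
        self.owner_of(key).store[key] = value


if __name__ == "__main__":
    cluster = []
    nodes = [Node(name, cluster) for name in ("n1", "n2", "n3")]
    nodes[0].put("myKey", "hello")
    # Any node can be asked, whichever one actually holds the key.
    print([node.get("myKey") for node in nodes])
```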

For example, consider something like "set:intersection:mySet1:mySet2:mySet3".
Let's assume that mySet1, mySet2 and mySet3 are stored on different servers.
A trivial implementation would be to fetch mySet1, mySet2 and mySet3 to the server executing the request and do the processing there. Another possibility is to get the sizes of mySet1, mySet2 and mySet3, execute two parallel requests, "set:intersection:smallest:other1" and "set:intersection:smallest:other2", and then intersect the two results locally.
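The second strategy could look roughly like the sketch below. It is Python rather than Lua, the servers are plain dictionaries, and remote_size and remote_intersect are hypothetical stand-ins for requests to other nodes. I am also assuming each partial intersection runs on the server that owns the larger set, which is one possible choice and not something decided yet.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cluster: each set lives on a different server.
SERVERS = {
    "server1": {"mySet1": {1, 2, 3, 4, 5, 6}},
    "server2": {"mySet2": {2, 4, 6}},
    "server3": {"mySet3": {4, 5, 6, 7, 8}},
}


def owner(key):
    """Find the server holding the key (stands in for hashing the key)."""
    return next(name for name, store in SERVERS.items() if key in store)


def remote_size(key):
    """Stand-in for asking the owning server for the size of a set."""
    return len(SERVERS[owner(key)][key])


def remote_intersect(smallest_key, other_key):
    """Stand-in for "set:intersection:smallest:other" run on a remote node."""
    smallest = SERVERS[owner(smallest_key)][smallest_key]
    return smallest & SERVERS[owner(other_key)][other_key]


def cluster_intersection(keys):
    # Step 1: get the sizes and pick the smallest set.
    smallest = min(keys, key=remote_size)
    others = [k for k in keys if k != smallest]
    if not others:
        return set(SERVERS[owner(smallest)][smallest])
    # Step 2: run the partial intersections in parallel.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda k: remote_intersect(smallest, k), others))
    # Step 3: intersect the partial results locally.
    return set.intersection(*partials)


if __name__ == "__main__":
    print(sorted(cluster_intersection(["mySet1", "mySet2", "mySet3"])))  # [4, 6]
```

Only the smallest set and the two partial results cross the network here, instead of all three full sets.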

Since cacheismo is scriptable, it is always possible to change this logic. I will hide the non-blocking details using Lua coroutines, so writing such objects will not require "knowing" the asynchronous IO aspects, just some friendly function call like getValueForKey.
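The idea can be illustrated with Python generators, which are a rough analogue of Lua coroutines. Everything here is a sketch: get_value_for_key mirrors the getValueForKey name above, while run, STORE and intersection_script are invented for the example. The script reads top to bottom like blocking code, and the small event loop suspends and resumes it around each fetch. In Lua the helper could call coroutine.yield itself, so the script would not even need the "yield from".

```python
from collections import deque

# Hypothetical data that would normally live on other servers.
STORE = {"mySet1": {1, 2, 3}, "mySet2": {2, 3, 4}}


def get_value_for_key(key):
    """Friendly call: suspends the script until the value is available."""
    value = yield ("get", key)  # hand the request to the event loop
    return value


def intersection_script(key_a, key_b):
    """Script code: looks sequential, contains no callbacks."""
    a = yield from get_value_for_key(key_a)
    b = yield from get_value_for_key(key_b)
    return a & b


def run(scripts):
    """Tiny event loop: resumes each script once its request is answered."""
    results = {}
    ready = deque((name, gen, None) for name, gen in scripts.items())
    while ready:
        name, gen, value = ready.popleft()
        try:
            _op, key = gen.send(value)  # resume until the next request
        except StopIteration as done:
            results[name] = done.value  # script finished
            continue
        # Pretend the non-blocking IO completes, then requeue the script.
        ready.append((name, gen, STORE.get(key)))
    return results


if __name__ == "__main__":
    out = run({
        "q1": intersection_script("mySet1", "mySet2"),
        "q2": intersection_script("mySet2", "mySet1"),
    })
    print(sorted(out["q1"]), sorted(out["q2"]))  # [2, 3] [2, 3]
```

Because the loop requeues a script after each request, the two scripts interleave, which is the behaviour the server needs while waiting on other nodes.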

This might also make it possible to do some in-memory map-reduce: virtual keys that recursively call other virtual keys would create parallel call stacks spread over the network, all executing at the same time.

UPDATE: Cacheismo Cluster Update
