
Hazelcast @ NY Java SIG June 19, 2009

Posted by shawngandhi in Technology.

This past Wednesday was the last NY Java SIG to be hosted in Google’s Manhattan location. What caught my eye was a six-month-old project known as Hazelcast.

Hazelcast is simply a distributed List, Queue, Set, Map, and Executor Service. If you’re thinking to yourself that you’ve seen stuff like this before in Gigaspaces, ObjectGrid, etc., you’d be right. The difference here is that it’s open source and under the Apache License. So if you’re trying to build your own poor man’s distributed cache, you don’t have to fumble around with JGroups anymore.

Wanting to compare it to Coherence, I was curious how the map worked under the hood. I guess I shouldn’t have been surprised to find some strong similarities. Just one example: when data is put into a distributed map, the key is hashed and the hash is taken modulo the number of blocks, 271 by default. Since each node is assigned a certain number of blocks, the block count is also the theoretical limit on the number of nodes you can have. The block count is configurable.
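
To make the hash-then-modulo idea concrete, here is a minimal sketch in plain Java. It is not Hazelcast's code: the class name `BlockPartitioner`, the use of `hashCode()`, and the round-robin assignment of blocks to nodes are all assumptions made for illustration. Only the figure of 271 blocks comes from the talk.

```java
// Hypothetical sketch of hash-based block partitioning.
// Not Hazelcast's implementation; names and the assignment rule are assumptions.
final class BlockPartitioner {
    // Block count mentioned in the talk; treated here as a fixed constant.
    static final int BLOCK_COUNT = 271;

    // Map a key to a block id in the range [0, BLOCK_COUNT).
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    static int blockFor(Object key) {
        return Math.floorMod(key.hashCode(), BLOCK_COUNT);
    }

    // Assumed round-robin rule: block b belongs to node (b mod nodeCount).
    static int ownerOf(int blockId, int nodeCount) {
        return blockId % nodeCount;
    }

    // Number of nodes that own at least one block. It can never exceed
    // BLOCK_COUNT, which is why the block count caps the useful cluster size.
    static int nodesWithBlocks(int nodeCount) {
        return Math.min(nodeCount, BLOCK_COUNT);
    }

    public static void main(String[] args) {
        int nodes = 3;
        for (String key : new String[] {"alice", "bob", "carol"}) {
            int block = blockFor(key);
            System.out.println(key + " -> block " + block
                    + " -> node " + ownerOf(block, nodes));
        }
        // With more nodes than blocks, the extra nodes would hold no data.
        System.out.println("nodes holding data out of 500: " + nodesWithBlocks(500));
    }
}
```

Any node can compute a key's block locally, so finding the owner of an entry needs no lookup beyond knowing which node holds which block.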

On the back end, Hazelcast avoids synchronization by using a single-threaded model. When performing adds/updates on entries, one Service Thread does all the work, with each task taking on the order of microseconds.
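
The single-threaded model can be illustrated with nothing more than `java.util.concurrent`. The sketch below is my own, not Hazelcast's Service Thread: the class `SingleThreadedCounterStore` and its methods are hypothetical. The point it shows is that when every read and write is funneled through one thread, a plain `HashMap` needs no locks, even with many callers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration of a single-threaded service model.
// None of these names come from Hazelcast.
final class SingleThreadedCounterStore {
    // Plain HashMap with no locks: only the service thread ever touches it.
    private final Map<String, Integer> counts = new HashMap<>();
    private final ExecutorService serviceThread = Executors.newSingleThreadExecutor();

    // Hand a task to the service thread and block until it has run.
    private <T> T runOnServiceThread(Callable<T> task) {
        try {
            return serviceThread.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
    }

    // Read-modify-write with no synchronization; returns the new count.
    int increment(String key) {
        Callable<Integer> task = () -> counts.merge(key, 1, Integer::sum);
        return runOnServiceThread(task);
    }

    int get(String key) {
        Callable<Integer> task = () -> counts.getOrDefault(key, 0);
        return runOnServiceThread(task);
    }

    void shutdown() {
        serviceThread.shutdown();
    }

    // Several caller threads hammer one counter; returns the final count.
    static int demo(int callers, int incrementsPerCaller) {
        SingleThreadedCounterStore store = new SingleThreadedCounterStore();
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerCaller; j++) {
                    store.increment("hits");
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
            return store.get("hits");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        } finally {
            store.shutdown();
        }
    }

    public static void main(String[] args) {
        // No updates are lost, because they are applied one at a time.
        System.out.println(demo(4, 1000)); // prints 4000
    }
}
```

The trade-off is that one slow task stalls everything queued behind it, which is presumably why the talk stressed that each task is measured in microseconds.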

Being so new and so free, Hazelcast obviously has a long way to go. Coming up in its next release is a data persistence layer. I’m sure this is one of their most requested features. Also in the pipeline are a distributed Semaphore, DelayQueue, and CountDownLatch. Very exciting!

It seems they’re after making the entire java.util.concurrent package distributed, and I have no complaints about that. You can find their code on Google Code. Check them out!


Comments»

1. Trevor Hinson - June 29, 2009

Hi Gotham,

Good post. Thanks for making me aware of this. Hazelcast could become an interesting alternative to the many variations along this same theme. The concept itself is very good and makes perfect sense. I advocated and implemented a “distributed cache” based on Memcached in 2005 for a client. Back then though the thought process was perhaps a little too fresh for other organisations as I struggled to get the concept understood within a more conservative organisation (which to be honest needed it the most).

More recently though there have been a number of alternatives (Hadoop + all the other Java-specific implementations) and the demand for this type of facility became more obvious when Oracle purchased Coherence. (Databases are too slow, so to speak.) The approach is not without its problems though.

The design pattern for their use however should, at last, start the industry questioning a number of other “advanced” solutions which are somewhat proved wrong by these approaches which in itself could be classified as common sense I suppose.

(NB: I wouldn’t touch the Java implementations).

Keep up the good posts.

Cheers

T

:)

shawngandhi - June 30, 2009

Hi Trevor — I’m glad you found it useful.

I haven’t made it over to play with Hadoop yet, but I usually find that I’m a fan of Apache projects. If you ever get around to playing with it, I’d love to hear your feedback.

Curious, how did you decide on Danga when implementing your distributed cache?

Cheers,
Shawn

2. Trevor Hinson - July 21, 2009

Hi Shawn,

Sorry I called you the wrong name in my earlier post. Apologies for that; I picked it up from the blog reference at the top of the page.

Quite a while back some people finally started to realise that databases were completely wrong for particular architectural patterns (the web being the main one). That doesn’t mean that you do not use databases for the web – it is all about how you use a tool rather than the tool itself. Anyway, I evaluated a number of caches, techniques and design patterns for using them. I was actually made aware of Memcached by an excellent network specialist that I was lucky enough to briefly work with at the time. Initially I favored a particular Java implementation. Upon evaluation it very quickly became obvious that Java caches were wrong for the job. After evaluating all that was available and placing the options into the context of the environment my client, and those since, would be dealing with in the future, it became more obvious … Java caches are simply not the way to go. They are perfect for old-school approaches which I still see being implemented at the moment (typical J2EE stacks) but are way out of context within an SOA.

Danga did all that it said it would do. No more, no less. Right tool for the job basically.

Cheers

T :)

3. Roger - August 18, 2009

Hello Shawn,

good post! I was looking at Hazelcast (now on version 1.7) and it seems like a more mature product with persistence too. Definitely a better fit for my current project than the other (costly) alternatives.
Actually we had tried a while back to extend Ehcache to support all Java Collections and here comes Hazelcast! Got my work cut out!

Roger R

shawngandhi - August 19, 2009

Thanks Roger! If you end up using it, drop me a line and tell me what you think.

Nothing beats good+free software!