May 2008 Archives

Hadoop & EC2


Hadoop 0.17.0 is now generally available. This also means there are new scripts for managing EC2 clusters that use the newer EC2 features: 'availability zones', the new optimized kernels, 32- and 64-bit images, and Ganglia. It also looks like Tom has already packaged new public AMIs. You can read about the changes on the Hadoop Wiki EC2 page. The patches are in the corresponding JIRA issue.

Thought I would quickly post this link to the Hadoop wiki comparing GridGain to Hadoop. In summary, Hadoop was designed for large data applications. GridGain is simply a re-imagining of tuple-spaces, constrained by available JVM memory (as implied by the comparison). Hopefully I'll post my own opinions at a later date.

[Update] A reaction to the comparison has been posted.

[Update Sept 2008] GridGain isn't even a data-grid, but a means to distribute apps into running remote kernels, with a bit of Spring-like pluggability. A comparison is disingenuous.

After much poking around and experimentation, we just packaged and released 0.1.0 of Cascading.groovy, our Groovy language interpreter extension. Read more on the Cascading site.

Version 0.5.0 of Cascading is now available for download. For details, check out the announcement.