Hadoop 0.17.0 is now generally available. With it come new scripts for managing EC2 clusters that use the newer EC2 features such as availability zones, the new optimized kernels, and 32- and 64-bit images, plus Ganglia support. It also looks like Tom has already packaged new public AMIs. You can read about the changes on the Hadoop Wiki EC2 page. Here also is the JIRA issue with the patches.
Quick note on Ganglia: some of the metrics don't seem to be emitted from Hadoop, and there is a patch pending to fix this. Even so, Ganglia still provides plenty of useful information about your cluster instances. See the corresponding JIRA issue.
[Update]
I forgot to mention that the Hadoop team stated, last night at the Hadoop User Group meeting, that 0.17.0 performed ~30% faster on the "grid mix" test (a very long-running test, run on a few hundred machines, if I remember correctly).