December 2008 Archives

Version 0.10.0 of Cascading is now available for download. For details, check out the announcement. This release represents v1.0 Release Candidate 1.

I just completed a proof of concept with a client where we wanted to see if Cascading could match Pig's performance. For an equivalent workload, Pig rendered 580 MapReduce jobs and Cascading planned 75. Running in local mode on a small dataset, Pig completed in 31 minutes and Cascading in 7 minutes.

I've been doing a fair bit with DocBook recently. Both the Cascading User Guide and a section on Cascading, which I hope will be included in the upcoming Hadoop: The Definitive Guide, were written in DocBook. Unfortunately, finding a reasonable DocBook toolchain was difficult, so I had to adopt the Velocity DocBook Framework and make some modifications. I've published a draft of my efforts on GitHub: DocBook Framework and DocBook Template.

It looks like I will be presenting on Cascading on December 17 in New York at the monthly Hadoop User Group.