Dioscuri: MapReduce Made Simple

Wednesday, May 14, 2008

Invitation to the GridGain gang

This is an invitation to the GridGain gang: would you be willing to write all the code needed to build the same map/reduce system used in the example from the article, and post it here?

I'm interested in whether the same thing can be built in as few lines of code as the Dioscuri sample implementation, and without GridGain-specific API calls in the application code. Your comparison system should be built with only J2SE classes, allowing for one call to the GridGain API for setup (the Dioscuri sample likewise confines its setup to one class). Don't include the generation of the final result -- that's a discussion for the next article, and the Dioscuri sample uses Terracotta for aggregation instead of the file system. Please do include the same number of processing stages, using the same algorithms and heuristics as the sample app.

Once the system is built, I'd like to measure the completion time for the same stages in both systems, using the same number of nodes in any topology that we choose.

This way we can all put the arguments to rest and learn something in the process. GridGain may outperform Dioscuri -- but is programming the app with it easier than just deploying POJOs?

Thanks!

Monday, May 12, 2008

Welcome to Dioscuri!

The Dioscuri Project is an experiment in implementing a MapReduce system from off-the-shelf, open-source components. The system is based on my real-world work and on two articles I published on TheServerSide.com:

The target system configuration looks something like:


Here is a list of all the files discussed in the MapReduce II article:

Where do we go from here? The project will be included in the MuleForge soon. Stay tuned for more information about that...