Got Ads?
1/18/2008
  Why You Should Learn Map Reduce & BigTable
Rich Skrenta points to a web article by database experts Michael Stonebraker and David DeWitt, in which they unleash a fusillade of hate on Google's Map / Reduce computing model.

Skrenta dubs their screed: "The sound of disruption".

Basically they misunderstand the purpose of the thing. One thing Map / Reduce is great at is processing log files. Databases aren't so hot when you have 100M things a day or more to look at.
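To make the log-file case concrete, here is a minimal sketch in Python of the map / reduce pattern, counting hits per URL. It runs on a single machine and is not Google's implementation. The log format (client address, then URL) and the names map_line, shuffle, reduce_counts and count_hits are assumptions made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_line(line):
    """Map step: emit (url, 1) for one log line; skip malformed lines."""
    parts = line.split()
    # Assumed log format: "<client-address> <url>"
    if len(parts) >= 2:
        yield parts[1], 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def count_hits(lines):
    """Run map, shuffle and reduce in sequence over an iterable of log lines."""
    pairs = chain.from_iterable(map_line(line) for line in lines)
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample data, not taken from any real log.
sample_log = [
    "10.0.0.1 /home",
    "10.0.0.2 /ads",
    "10.0.0.1 /ads",
    "malformed",
]
hits = count_hits(sample_log)
print(hits)
```

On a real cluster the map calls run in parallel across many machines and the framework handles the shuffle. The point here is only that the programmer writes two small functions, and neither one needs to know how big the log is.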

As I wrote earlier, Google is making efforts to get college kids to learn to think in map / reduce ways. Now they are offering certain universities free access to scientific datasets on map / reduce clusters.

The upshot is that the web requires parallel processing. No one has really extracted a lot of knowledge out of the terabytes of web usage data that flow by every day.

But the data is out there, and paradigms like map / reduce are how it's gonna be dissected. So if you want to work in the consumer web, with billions of users doing stuff every day, leaving data tracks, you should spend some time learning map / reduce.

Not to get too geeky, but to collect these in a single place, here are some key papers to read if you want to understand the Google architecture:

And a bit of bonus inspiration - the story of how a New York Times blogger converted 70 years of archives (over 11 million articles) to PDF in under a day using Hadoop on Amazon EC2.
