[nycbug-talk] mapreduce, hadoop, and UNIX
Isaac Levy
ike at lesmuug.org
Sat Aug 25 11:47:11 EDT 2007
On Aug 25, 2007, at 11:22 AM, Alex Pilosov wrote:
> On Sat, 25 Aug 2007, Isaac Levy wrote:
>
>> can anyone shed some light on similar prior works in distributed
>> computing and RPC systems which are 'old classics' in UNIX? These
>> distributed computing problems simply can't be new.
>>
>> To be really straight, what I'm getting at, is why is this more or less
>> useful than intelligently piping commands through ssh? What about older
>> UNIX rpc mechanisms? Aren't there patterns in even kernel source code
>> which match this work, or are even computationally more sophisticated
>> and advanced?
> mapreduce is, most of all, an API. Unix is contrary to the idea of APIs
> (everything is a stream of bytes).
Damn good observation.
Guess that's why Pike wrote the 'Sawzall' utility on top of it :)
>
> mapreduce isn't really rocket science by any means, see below.
>
>> From kernel to userland to network, I'm dying to find similar works,
>> any help is much appreciated!
> Similar things to look at: PVM and MPI -
AWESOME, exactly what I was wanting to grok- Thanks Alex!
--
Links for this thread, for the record:
PVM (created 1989, currently actively maintained):
http://www.csm.ornl.gov/pvm/
http://en.wikipedia.org/wiki/Parallel_Virtual_Machine
MPI (created 1990s, many implementations in various contexts/languages):
http://en.wikipedia.org/wiki/Message_Passing_Interface
http://www.mpi-forum.org/
> these are APIs for non-shared
> memory, message passing, distributed computation. They are an order of
> magnitude more involved than mapreduce - they are much more generic.
> mapreduce can be easily implemented using PVM but not vice versa.
>
> mapreduce is optimal for 'embarrassingly parallel' jobs - ones that are
> very easy to parallelize. There hasn't been much research into that -
> it has been a solved problem for 40 years.
Not surprised. :)
However powerful the simple idea of MapReduce is, there seems to be
far too much hype over it all IMHO, and lots of confusion about
applying it in discussions online (when all you have is a hammer,
everything is a nail...)
Looking at it in historical context is very useful here.
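For the record, here is a rough sketch of the map / shuffle / reduce shape being discussed, as a word count in Python. This is only an illustration under my own assumptions: the function names (map_phase, shuffle, reduce_phase, map_reduce) are made up for this example and are not Hadoop's or Google's API, and a thread pool on one machine stands in for what would really be many nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(line):
    """Emit (word, 1) pairs for a single input record."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key."""
    return key, sum(values)


def map_reduce(records, workers=4):
    # Each record is mapped independently of the others; this is the
    # 'embarrassingly parallel' part. Threads stand in for remote workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pairs = [pair for chunk in pool.map(map_phase, records) for pair in chunk]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    lines = ["unix is a stream of bytes", "mapreduce is an API"]
    print(map_reduce(lines))
```

The point of the sketch is how little the caller supplies: two small functions, with the grouping and scheduling left to the framework. That matches Alex's remark that it is mostly an API.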
Rocket- and thanks Alex!
.ike