[nycbug-talk] mapreduce, hadoop, and UNIX
Alex Pilosov
alex at pilosoft.com
Sat Aug 25 11:34:20 EDT 2007
On Sat, 25 Aug 2007, Isaac Levy wrote:
> Afterthought addition,
>
> On Aug 25, 2007, at 10:48 AM, Isaac Levy wrote:
>
> > From kernel to userland to network, I'm dying to find similar works,
> > any help is much appreciated!
>
> E.G.:
>
> Distributed computing implementations:
> - Plan 9?
> - DragonflyBSD Clustering?
We are all still hoping for clusters like what VMS had 25 years ago:
fully transparent, non-shared-memory clustering, aka "single system
image". You don't know, and you don't care, which node in the cluster the
job is running on, and jobs can be migrated to and from nodes depending on
the load.
For proper clustering, you need a distributed filesystem, distributed lock
manager, and job distribution engine.
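To make the "job distribution engine" part concrete, here is a toy sketch in Python of load-based placement and migration. It is only an illustration of the idea, not how VMS or any real cluster does it: the Cluster class, its submit and rebalance methods, and the "load = number of jobs" metric are all made up for this example, and nothing here touches the filesystem or lock manager pieces.

```python
# Toy model of load-based job placement and migration.
# All names here (Cluster, submit, rebalance) are hypothetical.

class Cluster:
    def __init__(self, node_names):
        # Map each node name to the list of jobs currently placed on it.
        self.nodes = {name: [] for name in node_names}

    def load(self, name):
        # Assumption: load is simply the number of jobs on the node.
        return len(self.nodes[name])

    def submit(self, job):
        # The caller never picks a node; the cluster chooses the
        # least loaded one and reports where the job landed.
        target = min(self.nodes, key=self.load)
        self.nodes[target].append(job)
        return target

    def rebalance(self):
        # Migrate one job at a time from the busiest node to the idlest
        # node until their loads differ by at most one job.
        moved = 0
        while True:
            busiest = max(self.nodes, key=self.load)
            idlest = min(self.nodes, key=self.load)
            if self.load(busiest) - self.load(idlest) <= 1:
                return moved
            self.nodes[idlest].append(self.nodes[busiest].pop())
            moved += 1
```

The hard part in a real cluster is everything this sketch skips: moving a live process with its open files, locks, and sockets, which is exactly why you need the distributed filesystem and lock manager underneath.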
On the Linux front, the closest thing would be MOSIX, which is *almost* that.
Unfortunately, MOSIX is first and foremost a research project, with
restrictive licensing and a fragmented community (see openMosix). Today,
the project aiming for properly working clusters is openssi.org - I believe
it is based on openmosix and opengfs.
Clustering is hard compared to writing an OS - even Linus can do that
one.
> Data implementations:
> - Sun ZFS?
> - AFS and the like?
> - RH GFS and the like?
If you are talking about proper distributed filesystems, they are few and
far between:
GFS/OpenGFS
Oracle OCFS
InterMezzo/Lustre
PVFS
Veritas DFS
SGI CXFS (distributed XFS)
Distributed filesystems are hard compared to writing an OS.
-alex