[nycbug-talk] hadoop - sharing a server?
Charles Sprickman
spork at bway.net
Tue May 11 23:55:24 EDT 2010
Hi all,
I recently went back and listened to the Hadoop presentation from a
few months ago. The timing was great, as I've been tasked with setting up
a basic Hadoop environment for pulling some stats out of a ton of mail
logs. We'll likely be using HBase, but will be looking at Pig as well.
I have a 3-node test setup running on FreeBSD 8.0 in VMware. I was
pleasantly surprised that Java was not a real pain to get going. In
short, this all looks good, and it seems easy enough to copy one of
these nodes into a jail, archive that jail, and then deploy a bunch of
these things all over the place.
So my question... What we're looking to do with Hadoop does not yet
justify going out and buying a half dozen or so servers. I'd like to jail
it on a bunch of our existing servers. These machines have widely varying
workloads with plenty of lulls during the day, and the jobs we want to
run on the Hadoop cluster can basically wait as long as they take, for
now. So is anyone running Hadoop nodes on servers not dedicated to this
task? Does it respond to being niced down? Are there some resource
utilization knobs I've missed in all the quickie howtos I've read?
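For context, the sort of job-level tuning I had in mind is roughly the
sketch below. It's against the stock 0.20-era JobConf API, and the class
name, property values, and choice of knobs are just my guesses at how to
keep a job's footprint small on a shared box, not anything I'd call a
recommendation:

import org.apache.hadoop.mapred.JobConf;

// Illustrative helper; the class name and the specific values are mine.
public class LowImpactJobConf {
    public static JobConf configure(JobConf conf) {
        // Keep each child JVM small so a single task can't balloon
        // on a box that's also doing real work.
        conf.set("mapred.child.java.opts", "-Xmx256m");
        // One reduce is plenty; we don't need to fan out.
        conf.setNumReduceTasks(1);
        // Turn off speculative execution; no point running duplicate
        // attempts when we don't care how long the job takes.
        conf.setBoolean("mapred.map.tasks.speculative.execution", false);
        conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
        return conf;
    }
}

The per-node limits (things like mapred.tasktracker.map.tasks.maximum in
mapred-site.xml) appear to be daemon config rather than something a job
can set for itself, which is partly why I'm wondering what else is out
there.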
Thanks,
Charles