[nycbug-talk] hadoop - sharing a server?

Edward Capriolo edlinuxguru at gmail.com
Tue May 18 00:27:05 EDT 2010

On Mon, May 17, 2010 at 10:37 PM, Charles Sprickman <spork at bway.net> wrote:

> On Mon, 17 May 2010, Edward Capriolo wrote:
>> You can jail anything, of course. The issue with jailing hadoop is that it
>> is very IO heavy because data is constantly being spilled to disk. Even if
>> your jail can limit memory or processor ticks, the real problem is that
>> jails do not protect your disk. Now if your system is only being used for
>> background batch processing, that is fine. However, if you are trying to
>> run a "real time"-ish mysql instance and hadoop on the same machine, they
>> may not play together well if they fight for the disk. The same is true
>> with any jail/vm solution, but hadoop batching likes to saturate things
>> with load.
>
> Thanks for the excellent feedback...  Right now I just need to get
> something up for various reasons:
> -Evaluate Hadoop/HBase/Pig running on multiple hosts
> -Get myself up to speed on Hadoop and to some extent, Java from a sysadmin
> perspective
> -Give the folks who will be using this an environment to evaluate it and
> see if this is the proper set of tools for the type of data analysis they
> want to do
> -Shake out any BSD-specific issues
> If this all goes well, we'd likely just bring up a few cheap servers as a
> standalone cluster.
> Until then, the idea of jailing it on servers that have very sporadic usage
> patterns and don't have to really do stuff in "real time" seems like it
> might be a good compromise.  I'll be throwing this onto a few boxes in the
> next few days, so I'll report back with any interesting issues.
> I'm going to do two things to try and keep hadoop from being a total pig -
> its jail will be on its own zfs filesystem with a quota to prevent it from
> chewing up too much space, and when I put together an rc.d script for it,
> I'll nice down hadoop.
> For the future, there's some disk scheduling stuff coming into 8.1:
> http://wiki.freebsd.org/Releng/8.1TODO
> http://info.iet.unipi.it/~luigi/papers/20090508-geom_sched-slides.pdf
> Charles
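
For anyone following along, the quota-plus-nice plan quoted above might look
roughly like the sketch below. The dataset name tank/jails/hadoop, the 50G
quota, the nice level of 15, and the /usr/local/hadoop path are all
hypothetical placeholders, not values from this thread. It defaults to a dry
run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: every value below is an assumed placeholder.
HADOOP_DATASET="${HADOOP_DATASET:-tank/jails/hadoop}"  # hypothetical dataset
HADOOP_QUOTA="${HADOOP_QUOTA:-50G}"                    # hypothetical size cap
HADOOP_NICE="${HADOOP_NICE:-15}"                       # hypothetical nice level
DRY_RUN="${DRY_RUN:-1}"                                # 1 = print, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Cap how much space the jail's dataset may consume.
set_quota() {
    run zfs set "quota=${HADOOP_QUOTA}" "$HADOOP_DATASET"
}

# Start a command at reduced CPU priority, as an rc.d script might.
start_niced() {
    run nice -n "$HADOOP_NICE" "$@"
}

# Example (dry run):
#   set_quota
#   start_niced /usr/local/hadoop/bin/start-all.sh
```

Setting DRY_RUN=0 makes the same functions run the commands for real. Keep in
mind that nice only lowers CPU scheduling priority, so it does nothing for
the disk contention discussed earlier in the thread.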

Shameless plug here:

I am going to do another hadoop talk.

It is going to be very low level (no powerpoint slides!). I hope some of you
guys can make it.
