[nycbug-talk] Jail Performance
Wed Jan 5 10:13:29 EST 2005
Hey Pete, All,
On Jan 5, 2005, at 12:00 AM, George Georgalis wrote:
> On Tue, Jan 04, 2005 at 11:10:19PM -0500, Pete Wright wrote:
>> On Tue, Jan 04, 2005 at 11:01:22PM -0500, Louis Bertrand wrote:
>>> On Tue, 4 Jan 2005, Pete Wright wrote:
>>>> Hey nycbugers,
>>>> I've been kicking around some ideas regarding jailing
>>>> in an "enterprise" environment. While jails do have the obvious
>>>> benefit of added security, one thing that interests me is the
>>>> possibility of using jails to assist with server and app
>>>> management in distributed environments. The basic idea I am
>>>> thinking of is creating jails for specific applications that
>>>> get loaded to a farm of servers via PXE-TFTP. One would netboot
>>>> a server, and then dist a jail to that system after boot. Seems
>>>> simple enough... but what about performance? Has anyone noticed
>>>> any significant performance bottlenecks within jails? I would not
>>>> expect any, and have not seen any either. But maybe there is
>>>> something I'm missing?
>>> Just a quick thought, and note that I really have no idea what I'm
>>> talking about, but didn't you just describe IBM's VM operating
>>> system for mainframes? I think they run multiple independent
>>> instances of Linux, each in its own virtual machine (hence the name).
Yes, it's very similar in concept (and a lot of these base concepts
go back to the earliest time-sharing systems), but as I understand it
from the jail side of things, the idea is implemented in a somewhat
different manner. I believe the IBM VM systems work at a lower
level that's tied closer to the hardware (à la the hardware memory
partitioning that IBM has seemed obsessed with for years, etc.).
It's pretty cool stuff, I must admit, but I'm fonder of the jailing
model due to the scale of operations I'm involved with (small). Since
jail(2) is such a simple system call, and jail(8) is such a simple
userland utility, a lot of the application brawn that IBM puts into
hardware (and oodles of low-level software to make use of it) lives
in much higher-level stuff for jailing, i.e. an app developer can
write something in any language to manage jails, even just simple and
solid shell scripts to run things. Therefore I see the uses for
jailing as much more malleable: fewer developers can create more
diverse and more flexible systems, on cheaper hardware, something I see
as more in line with the actual needs of many 'enterprise' operations.
All about the right tool for the job IMO.
>> yes it is sorta similar to partitioning hardware on IBM or Sun gear,
>> altho what I was thinking about was having a central repository of
>> system images, bundled with a specific app (say an apache tomcat
>> server) that can be distributed to a group of machines. The idea is
>> to make administration easier and allow more flexibility on how one
>> can provision a group of servers.
HECK YEAH. That's what I'm talkin' about! You could conceivably run
the images following practices similar to those for diskless systems
that run from read-only drives (applying principles passed around
in the micro-Soekris world lately, etc.).
The only thing that's necessary is a solid, fast, and redundant data
storage backend for it all, which for the moment, in the BSDs, seems
somewhat limited to my knowledge (NetApp and the like currently rule
the mass storage scene, right?).
> sounds like a good idea, less the ramp up time which will no doubt be
> recoverable after a few image mods.
> Speaking from second hand info, and I've been paying a lot of attention
> to these things, I don't think you'll see a performance hit.
Speaking from first-hand and battle-tested experience, George is
absolutely correct here,
> There is
> another layer of abstraction with a jail but the "cpu" doesn't really
> go through it, device IO does. I expect you'll see well under 1% cpu
> degrade, probably closer to 0.1%, and maybe 1% IO degrade. +/- 3% on
> that. ;-)
benchmarks are tough in any context, but I'll nod to these hypothetical
> but seriously, I think any performance hit you'll see with a
> jail will be squelched by the reality of HW cost and Moore's Law, for
> that last 1% you need to buy new hardware every 6 months and if you're
> doing that, you'll have a nice, actual, cluster in no time. :)
> // George
Yep, yep, and yep.
IMO, some of the real caveats and places to focus on would be in how
the systems that manage the jails are set up and run. For example, weak
performance points in many contexts can be:
- starting jails: properly forked or multithreaded applications to
start and manage jails would be really appropriate here (I did a few
childish experiments with multithreading in Python, using jail
startup as the subject, with a view to replacing shell scripts which
started jails in a linear queue). The jail mechanism itself takes
hardly any time or resources when starting, but starting the tree of
processes for the services the jail is running can get a bit time
consuming en masse.
- management systems for jails: some nifty, and currently indispensable,
tools for saner jail management are in the ports collection, jtop, jps,
and jkill being most valuable to me personally. BUT, these are all
basically Perl wrappers on top of the utilities they mimic, and
therefore aren't really that efficient when managing large numbers of
jails. In the context of what you're talking about, ports to
something faster (perhaps even hacking the source of top, ps, and
kill [kill(1), not kill(2)]) could be totally appropriate.
- network or other centralized filesystem/repository speed and
accessibility (NFS gives me the creeps in this context for various
reasons; would love to see other ways for separate, abstracted, 'disks'
- the ever-present resource-based attack/failure scenarios:
  - Memory Hogs and Fork Bombs, whether malicious or app bugs
  - Disk Resource Restrictions
(The stuff that the lower-level rigidity of the IBM approach aims to
solve in some ways.)
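
To make the jail-startup point above concrete, here is a minimal Python sketch of starting several jails concurrently instead of in a linear queue. It assumes each jail can be brought up by a single command; the names start_one and start_all are made up for illustration, and the command lines you pass in are whatever starts a jail on your system (no real jail(8) invocation is shown here).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def start_one(name, argv):
    """Run one start command and return (name, exit code)."""
    # Assumption: argv is the full command that starts this jail.
    # Output is discarded to keep the sketch short.
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return name, result.returncode


def start_all(commands, max_workers=4):
    """Start every jail in `commands` concurrently.

    `commands` maps a jail name to its start command (a list of strings).
    Returns a dict of jail name -> exit code.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(start_one, name, argv)
            for name, argv in commands.items()
        ]
        # Collect results; each thread mostly waits on its child process.
        return dict(future.result() for future in futures)
```

Threads are enough here because each worker spends its time waiting on a child process. The max_workers cap matters: the slow part is each jail's tree of service processes, so starting too many at once just moves the bottleneck to disk and CPU.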
The above problems can be solved in various ways, but as it is with
problems of this ilk, it always comes down to a balance between
restriction and rigidity on one side and security and stability on the
other. For example, quotas or fixed partitions for jailed systems, or
even disk images for the jails, can help mitigate disk-based resource
vulnerabilities, yet they create new problems in complexity and
rigidity of management (i.e. when jail x needs more disk space to carry
out its intended function, this stuff can all become quite cumbersome).
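
As one illustration of that trade-off, here is a rough Python sketch of a softer alternative to hard quotas: measure each jail's directory tree and flag the ones approaching a budget. This is an assumption on my part, not something any of the tools mentioned here do; tree_size, over_budget, and the 90% warning ratio are all hypothetical, and it only reports usage, it enforces nothing.

```python
import os
import stat


def tree_size(root):
    """Sum the apparent size in bytes of regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def over_budget(jail_roots, budgets, warn_ratio=0.9):
    """Return {name: (used, budget)} for jails at or above warn_ratio.

    `jail_roots` maps a jail name to its root directory and `budgets`
    maps the same names to a byte budget (both hypothetical inputs).
    """
    flagged = {}
    for name, root in jail_roots.items():
        used = tree_size(root)
        budget = budgets[name]
        if used >= warn_ratio * budget:
            flagged[name] = (used, budget)
    return flagged
```

Walking every tree is slow for large jails, so something like this would run periodically rather than on every write. It trades the hard guarantee of a quota for the flexibility of raising a budget by editing a number.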
I blabbed a bit there, but in the end, all of it to me is less about
actual resource consumption. Pete and George are both right in
assuming that jailing performance itself is really a moot point; the
real keys to jailing performance lie in strategies for management,
in balancing the increased complexity that comes with service
requirements and usage contexts changing over time. If that complexity
is not thoughtfully designed and managed, and not anticipated at the
jail management application level, performance will simply go down the
tubes as systems are jockeyed to cope with the life-cycle of their
use. (But in the end, it is this way with