[nycbug-talk] shared hosting

Thu Jan 27 23:44:27 EST 2005

On Thu, 27 Jan 2005, G. Rosamond wrote:

> >> It seems to me, as Sunny and I had discussed long ago, that Linux VM
> >> systems really need a Kernel available to the instances more than the
> >> BSD's, because so much is done in Linux through the Kernel- whereas
> >> the converse is true across the BSD's, where much more is
> >> accomplished through userland tools and subsystems (if not userland,
> >> at least subsystems which aren't in the kernel).
> > What in the world are you talking about now? Both Linux and bsd's have
> > about the same separation of kernelspace and userspace. Name one
> > example.
> 
> Woah. . . we can start with the fact that userland for the GNU/Linuxes
> are providing some 200 or whatever interpretations of the kernelspace of
> Linus. . . and the respective BSDs are dealing with the kernel and user
> land in a much more unified manner. . .
What is it with you people? The original statement was that BSD does less
in kernelspace than Linux does. That has nothing to do with how many
"interpretations" Linux userspace has (and again, what exactly are you
talking about? Interpretations of what?).

Oh, and go count how many completely *distinct* BSD userspaces are out
there compared to Linux. (Hint: fbsd, obsd, netbsd at least. I don't know
how many other splinters appeared last year, picobsd, dragonfly, etc.,
all of which have *different* userspaces.) At least with Linux, everyone
sticks to the original package source.

> >>> I beg to differ. Jail-related code is *all over* the OS. Every time a
> >>> root privilege is checked, a code-writer needs to think whether one
> >>> should also check for 'root but not superuser'.
> >>
> >> Are you absolutely certain that the code you are referring to is not
> >> the foundation for chroot?  In BSD's other than FreeBSD, jail is
> >> often vocabulary used to describe chrooted processes- (esp. in the
> >> OpenBSD scene).
> >>
> >> I'll be totally happy to stand down corrected, but at this point, I'm
> >> going to need to see some examples in the kernel source- and have
> >> started poking around...
> >>
> >> To my knowledge, after looking up/browsing the source for regular
> >> jail, it's even smaller than I'd thought,
> > Jail is still chroot on steroids, no matter how much BSD people try to
> > claim it isn't.
> 
> ???  To paraphrase your earlier comments, Alex, it's all apples and
> oranges.
> 
> chroot and jails may have started with the same vague goal, but that's a
> bit of a stretch.
> 
> A provider might use chroot to provide services to a provider, but
> providing jails is a whole other level of control that a chroot'd
> program won't provide. . .
What exactly do you mean? Jail is chroot PLUS some stuff. The "stuff" is
some control over network capabilities, plus separating the concept of
superuser from userid 0. That's all.
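
To spell out what I mean by "chroot PLUS some stuff", here is a toy model
in Python. This is a sketch only, not FreeBSD code: Cred, JAIL_SAFE,
is_superuser and bind_addr are names I made up, and the set of privileges
a jailed root keeps is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical privilege names; which ones a jailed root keeps is assumed here.
JAIL_SAFE = {"chown", "setuid", "bind_low_port"}


@dataclass(frozen=True)
class Cred:
    uid: int
    jail_ip: Optional[str] = None  # None means "not jailed"


def is_superuser(cred: Cred, priv: str) -> bool:
    """uid 0 is necessary, but inside a jail it is no longer sufficient."""
    if cred.uid != 0:
        return False
    if cred.jail_ip is None:
        return True               # real root on the host
    return priv in JAIL_SAFE      # "root but not superuser"


def bind_addr(cred: Cred, requested: str) -> str:
    """Network control: a jailed process only gets its jail's address."""
    if cred.jail_ip is None:
        return requested
    if requested not in ("0.0.0.0", cred.jail_ip):
        raise PermissionError("address not available in this jail")
    return cred.jail_ip
```

In this model, Cred(0, "10.0.0.5") may chown but may not mount, and a
wildcard bind is narrowed to the jail's own address.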

> > Also, to me, the fact that jailed user can still see 'ps' for *every*
> > process, not just owned by same user is a big information leak. And if
> > you say that its possible to add code to prevent that leak by having
> > /proc-like-filesystem present different views - well, that's even more
> > jail-related-code that doesn't need to exist.
> 
> 
<snippity>

> This might clear that question up. . . and note this is 4.10, not 5.x.  
> . .
<snippity>

>  From the master box. . .
Even more proof you can't win - now the process-list code checks for
jailness. You lose on simplicity. Where else do you need to check for
jailness? Does netstat -a show sockets that don't belong to the current
jail? How about IPC between a process in a jail and one outside it? Should
a jailed process be able to access shared memory (shm)? Should a jailed
process be able to talk to sockets in the jail's namespace that were
opened by a non-jailed process?
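
A toy illustration of why this multiplies (Python, not kernel code; Proc,
Sock, can_see and the list_* functions are all hypothetical names). Every
query over a shared table needs its own jail check, and the one you forget
is the one that leaks:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Proc:
    pid: int
    jail_id: Optional[int]  # None means a host (non-jailed) process


@dataclass(frozen=True)
class Sock:
    port: int
    owner: Proc


def can_see(viewer: Proc, target_jail: Optional[int]) -> bool:
    # Host processes see everything; jailed ones see only their own jail.
    return viewer.jail_id is None or viewer.jail_id == target_jail


def list_procs(viewer: Proc, table: List[Proc]) -> List[int]:
    # The "ps" path: has the jail check.
    return [p.pid for p in table if can_see(viewer, p.jail_id)]


def list_socks_unchecked(viewer: Proc, socks: List[Sock]) -> List[int]:
    # The "netstat" path where somebody forgot the check: leaks everything.
    return [s.port for s in socks]


def list_socks(viewer: Proc, socks: List[Sock]) -> List[int]:
    # The same path with the check added: one more piece of jail code.
    return [s.port for s in socks if can_see(viewer, s.owner.jail_id)]
```

Both tables are shared by every process on the box, so the filtering has
to be repeated at each place they are read.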

You see, there are a *lot* of places where information could leak out of
a jail. All those places need to be protected, meaning more code. All
thanks to the fundamental fact that there is only one kernel for all
processes, jailed or not. That kernel has only one set of data structures.
Any attempt to 'virtualize' certain data structures (like virtualizing the
"/" filesystem location) is fraught with peril, because you need to track
down every place where that data structure is referenced to make sure you
can't accidentally "get out". Like making sure ".." from "/" won't get you
any higher, making sure you can only see the unix-domain sockets you
created, and so on; the list goes on. One mistake and you are screwed:
http://www.securiteam.com/unixfocus/5SP0120CAO.html
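
The ".." case can be sketched in a few lines of Python. This is a
simplified model of path lookup under a virtual root, not real kernel
code (resolve and its arguments are made-up names), and it is not meant
to reproduce the advisory above. It shows how narrow the check is: ".."
is only stopped when the walk is standing exactly on the virtual root.

```python
def resolve(root: str, cwd: str, path: str) -> str:
    """Map `path` to a real path for a process whose virtual "/" is `root`.

    `cwd` is the process's real current directory. Toy model: no symlinks,
    no mount points, just component-by-component lookup.
    """
    root_parts = [p for p in root.split("/") if p]
    if path.startswith("/"):
        parts = list(root_parts)                  # absolute: start at virtual root
    else:
        parts = [p for p in cwd.split("/") if p]  # relative: start at cwd
    for comp in path.split("/"):
        if comp in ("", "."):
            continue
        if comp == "..":
            if parts == root_parts:
                continue          # at the virtual root: ".." goes nowhere
            if parts:
                parts.pop()       # anywhere else: go up one level
            continue
        parts.append(comp)
    return "/" + "/".join(parts)
```

With root "/jail/a" and a cwd inside it, "/../../etc/passwd" stays at
"/jail/a/etc/passwd". But if the cwd was ever left outside the root (say
"/jail"), then "../etc/passwd" resolves to the host's "/etc/passwd",
because the walk never lands on the virtual root and the check never
fires.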

You are virtualizing *parts* of the kernel, instead of virtualizing the
entire kernel. Again - that may be what you want; jail provides a
reasonable degree of separation for a "well-behaved and not malicious"
user.

The bottom line is, with FreeBSD, in order to allow an untrusted user in a
jail, you have to *trust* all of the FreeBSD kernel. With a VM solution, I
only have to trust the VM monitor.

Thus, to compare complexity, you have to compare the complexity of a VM
monitor (like Xen) with the complexity of a full-blown OS.

-alex