[nycbug-talk] shared hosting

Thu Jan 27 17:29:24 EST 2005

On Thu, 27 Jan 2005, Isaac Levy wrote:

> Alex- I don't want to make this a confrontational thread, because I find
> this new tech all really interesting- so I'll just disregard the stuff
> that feels disrespectful to me from your last post, and I'm sorry if I
> came off as disrespectful in my last post to you.
Likewise, I usually come off as brash. Don't take it personally. 

<snip>

> On Jan 26, 2005, at 10:05 PM, alex at pilosoft.com wrote:
> >> Jail(2) and Jail(8) combined are much smaller than all the VM systems
> >> I've seen.
> > Of course, because they aren't VM systems.
> 
> I'd argue, for the technical record online with the archive of this
> thread, that Jailing is a kernel/OS-based VM, and that Xen is an
> emulator-based VM.  I say this because various VM schemes are all based
> around 1 idea- providing full OS-level access to mutually untrusted
> users/processes. Virtual Machine != Emulator, IMHO.
Jailing is not a VM by any definition of that word. Xen is not an
emulator. Bochs (original Bochs) is an emulator. IBM VM is not an
emulator. It is a VM. You are using the wrong terms.

> > Xen *is* a virtual machine whose intention is to completely separate
> > mutually untrusted *operating systems!*
> 
> Understood now, and in the context of web services, (something you,
> Sunny, and I all do for a living in various contexts), don't we all meet
> the same end goal with both Xen and jail(8)?
Kinda sorta, but not really. In the same sense, you meet the same goal by
buying two machines and giving each machine to a separate user. So, no,
you don't. You can't compare.

> >> I really wasn't referring to vmware or bochs, but was alluding to
> >> Jail(8)'ing, which has more to do with the overall history of FreeBSD
> >> itself, (not too shabby a rep for stability/security/performance <g>).
> > Apples to oranges, again. You can't compare xen to jail. You just
> > can't. Different things. Designed to solve different problems.
> 
> Seems like- but then what I'm still wondering is, what are the Xen
> advantages when applied to hosting common internet applications?
Complete separation of control. A Xen-based user can upgrade his kernel,
run his own init(8), have his own routing table, etc.

> > If your problem *can* be solved by jail, and you trust jail enough,
> > use jail. Xen is designed to solve a different problem.
> 
> What different problem?!
> 
> I can discuss the implementation for the problem Jailing was designed to
> solve, perhaps somebody could contrast it with Xen's purpose:
See above.

<snip>

> 2. Allow each jail to have its own superuser administrator whose
> activities are limited to the processes, files, and network associated
> with its jail.
> 
> The implementation scope is actually so small it's really only got these
> two stated design goals, (which themselves are
> dissected/discussed/scrubbed fairly extensively).
That's right. Processes and files don't make a complete system. And
network control is only partial with jail.

> >> Do other OS's running unmodified, run in a manner which they will
> >> meet production-level expectations for use (especially in the context
> >> of internet applications)?
> > They *are* modified. Modifications are necessary to gain the highest
> > performance under the virtual machine environment.
> 
> This all makes sense now- but it still doesn't answer my question,
> (though Sunny basically did),
> 
> If I have Xen basics straight, I'm still wondering if the modifications
> are simply a performance-gain issue; do they affect production-level
> expectations (perhaps expectations for use in common internet
> services/applications)?
Well, yes, they do have reliability implications. It is a VM environment. 
There could be bugs in the implementation of VM-related code. How many? We 
don't know.

> It seems to me, as Sunny and I had discussed long ago, that Linux VM
> systems really need a Kernel available to the instances more than the
> BSD's, because so much is done in Linux through the Kernel- whereas the
> converse is true across the BSD's, where much more is accomplished
> through userland tools and subsystems (if not userland, at least
> subsystems which aren't in the kernel).
What in the world are you talking about now? Both Linux and the BSDs have
about the same separation of kernelspace and userspace. Name one example.

<snip>
> > I beg to differ. Jail-related code is *all over* the OS. Every time a
> > root privilege is checked, a code-writer needs to think whether one
> > should also check for 'root but not superuser'.
> 
> Are you absolutely certain that the code you are referring to is not the
> foundation for chroot?  In BSD's other than FreeBSD, jail is often
> vocabulary used to describe chrooted processes- (esp. in the OpenBSD
> scene).
> 
> I'll be totally happy to stand down corrected, but at this point, I'm
> going to need to see some examples in the kernel source- and have
> started poking around...
> 
> To my knowledge, after looking up/browsing the source for regular jail, 
> it's even smaller than I'd thought,
Jail is still chroot on steroids, no matter how much BSD people try to 
claim it isn't.

<snip>
> To my knowledge, that's all there is to jailing, so that's under 16 kb
> in source code- (again, vs. the 9.2 mb of Xen- so we're actually at the
> magnitude of about 580 times the code, give or take, for the record.)
Jail-related code is all over the kernel. Everywhere you check for
superuser, you need to check for jailed superuser.

> Now, diving deeper into the source, I can't find the jail-related code
> which you state is *all-over*.
> 
> I found about 550 references to jail, mostly contained in the files
> mentioned above, and some strewn about in the chroot code, in the source
> for PS, expected places- but haven't taken the time to investigate *too*
> deeply (but did find a cute EPS diagram which grep ripped 'jail' out
> of).
There you go. It's all over. Everywhere the kernel needs to check for
superuser, it needs to check for a jailed superuser.

<snip>

> Well in PS, this makes a heck of a lot of sense, but not from the
> perspective of a hole as you describe, but in the way which jails are
> restricted from particular low-level calls, like the calls ps would make
> to page memory/cpu for processes.
ps(1) does not page memory or CPU. The kernel does.

Try to get your language correct, like 'system call'.

Yes, in a syscall, the kernel needs to verify whether root is the real
superuser. That means at every place where you check for rootness, you
also need to check for jailed root-ness.
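
To make that concrete, here is a toy model in Python, not kernel code. The
names (Cred, jail_id, may_load_kernel_module, may_bind_low_port) are made up
for illustration, and which operations a jailed root is allowed is an
assumption of the sketch. The point is only that each privileged operation
has to decide which of the two checks applies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cred:
    """Hypothetical credential: a uid plus an optional jail identifier."""
    uid: int
    jail_id: Optional[int] = None  # None means "not jailed"


def is_root(cred: Cred) -> bool:
    # Plain rootness: true for root both inside and outside a jail.
    return cred.uid == 0


def is_host_superuser(cred: Cred) -> bool:
    # The stricter check: root AND not confined to a jail.
    return cred.uid == 0 and cred.jail_id is None


def may_load_kernel_module(cred: Cred) -> bool:
    # Affects the whole machine, so jailed root must be refused (assumed).
    return is_host_superuser(cred)


def may_bind_low_port(cred: Cred) -> bool:
    # Confined to the jail's own address, so plain rootness is enough (assumed).
    return is_root(cred)
```

Every new privileged code path has to pick one of the two predicates, and
picking the wrong one is a hole.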

> So this patch is to make ps function within the contextual confines of a
> jail, for practical purposes, and trying to modify this would just break
> ps for the jail (an inconvenient problem, for sure)..
No, that patch was just cosmetic, to show whether a process is jailed or
not.

There's a bigger issue: ps used to work through direct access to kernel
virtual memory. If you allow that access to a jailed user, you open
yourself up to information-leakage attacks. So you don't, and instead make
a separate system call (or a /proc-like filesystem, as on Linux) to export
that information. That's code that might not have been needed if not for
jail.

Also, to me, the fact that a jailed user can still see 'ps' output for
*every* process, not just those owned by the same user, is a big
information leak. And if you say that it's possible to add code to prevent
that leak by having a /proc-like filesystem present different views - well,
that's even more jail-related code that wouldn't otherwise need to exist.
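
A rough sketch of what "present different views" means, as a toy Python
model rather than anything from a real kernel. Proc, visible_processes and
the same_uid_only switch are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Proc:
    """Hypothetical process-table entry."""
    pid: int
    uid: int
    jail_id: Optional[int]  # None means the process runs on the host


def visible_processes(table: List[Proc],
                      caller_uid: int,
                      caller_jail_id: Optional[int],
                      same_uid_only: bool = False) -> List[Proc]:
    """Return the slice of the process table the caller may see."""
    if caller_jail_id is None:
        view = list(table)  # unjailed caller: sees everything
    else:
        # Jailed caller: only processes in the same jail.
        view = [p for p in table if p.jail_id == caller_jail_id]
    if same_uid_only:
        # Optional stricter view: only the caller's own processes.
        view = [p for p in view if p.uid == caller_uid]
    return view
```

Each such filter is extra code on the export path, which is exactly the
maintenance cost being argued about here.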

There are similar things that also need to work by directly accessing 
kernel memory. In jail, they won't work. That's not cool.

> <Total sidenote- top does not work in jails out of the box, for this
> very reason, (paging memory) and this gives me a nice lead to actually
> get around to hacking a patch to make top work in jails- what a
> convenience that'd be... (but talk is cheap, so I digress)...>
Yes, because top works by directly accessing kernel virtual memory.

> Alex- show me the code.  I'm willing to step down on this issue, but
> currently believe I am correct, and it's worth stating it here to not do
> any disservice to jailing.
Which code? I'm just showing you examples. Just because it doesn't say 
"jail" in the source it doesn't mean it isn't necessitated by jail.

> > Yes, but in Xen, you have to hax0r two environments before you can
> > even *get* to the Host environment!
> 
> That point not understood on this end of the wire, here's why:  If a 
> cracker gets through the Xen environment and into the host, how is the
> host *not* then compromised?  I mean, it's still all the same hardware, 
> still the same system.
Again, you are either not getting it or intentionally misunderstanding my
words. Assume you have a bug in a certain kernel syscall that causes the OS
to crash when it is executed with certain parameters. With jail, you are
immediately screwed.

> I would additionally argue that a hole *could* be burrowed right through
> the OS instance, (I'd start digging at the kernel mods for Xen modified
> system calls)- and burrow straight into the host OS, precisely the same
> threat model that jail(2) faces.  Layers in between don't matter, big
> picture of this complexity, it's still the same hardware.
> 
> (/me opens the door for Sunny here hoping he'll pipe in with some words 
> on yummy hardware separation tech?)
No it isn't. Xen runs code in a different security ring. (Ring 0, ring 1,
ring 2, ring 3). Read up on Intel privilege rings. When you are running
inside a guest OS on Xen, your syscalls are handled by the "guest OS"
kernel. The guest OS can make a syscall into Xen. Xen can make calls into
the Host kernel. There is no direct way for a guest OS to make a syscall
into the Host kernel. Period.
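
The layering can be pictured with a small Python model. This is only an
illustration of the call chain described above, not Xen's real interface;
the class names and the two allowed call names are invented for the sketch.

```python
class Hypervisor:
    """Hypothetical hypervisor exposing a small, fixed set of calls."""

    ALLOWED = {"yield_cpu", "console_write"}  # made-up call names

    def hypercall(self, name: str) -> str:
        # Anything outside the fixed interface is refused outright.
        if name not in self.ALLOWED:
            raise PermissionError(f"hypercall {name!r} refused")
        return f"hypervisor handled {name}"


class GuestKernel:
    """Hypothetical guest kernel: it holds a hypervisor handle only."""

    def __init__(self, hypervisor: Hypervisor) -> None:
        self._hv = hypervisor  # note: no reference to any host kernel

    def syscall(self, name: str) -> str:
        # Syscalls from guest userland stop here, inside the guest.
        return f"guest kernel handled {name}"

    def request(self, name: str) -> str:
        # The only way out of the guest is through the hypervisor.
        return self._hv.hypercall(name)
```

In this model a buggy syscall takes down the guest kernel object and
nothing else; to reach further, an attacker has to get through the
hypervisor's narrow interface as well.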

<snip>
> Can you tell me *why* my abstract opinion presented, is not correct?
> 
> With both Xen or Jail(8), restricting resources, is, restricting
> resources. Or is there something I'm missing here?
Yes, you are. You can't restrict global kernel resources with jail,
because the kernel doesn't account for those (and accounting for them
would take a hit on performance and be "more code to maintain"). You cannot
say that a jailed user "cannot use more than 512M of *ANY KIND OF MEMORY*".
Not possible.
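
A quick bit of arithmetic shows why per-process limits (the setrlimit(2)
kind) do not add up to a cap on a whole jail. The sketch below is Python,
the function names are invented for the example, and it only counts user
memory; kernel memory allocated on behalf of those processes is not in the
sum at all.

```python
MIB = 1024 * 1024


def worst_case_total(per_process_cap: int, n_processes: int) -> int:
    """Upper bound on combined usage when each process has its own cap."""
    return per_process_cap * n_processes


def group_cap_holds(per_process_cap: int, n_processes: int,
                    group_cap: int) -> bool:
    """True only if the per-process caps cannot exceed the group cap."""
    return worst_case_total(per_process_cap, n_processes) <= group_cap
```

With a 512M cap per process, four processes may together reach 2048M, so a
512M cap on the jail as a whole only holds if you also bound the number of
processes and shrink each cap accordingly.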

> >> It is not however necessary in the context of my managed virtualized
> >> servers, that my jails have a kernel- and actually is something I
> >> prefer, in the context of type of app/web development I do.  It's
> >> that simple- and there really isn't anything you've stated here Alex
> >> which constructively changes my preference- (though I'm open to
> >> change my mind if there's something that I'm missing here??)
> > When one of your users starts to continuously locally-ddos your
> > machine, you might change your mind on VMs.
> 
> Do you now mean network-based DDoS over the local network or localhost?
Local DDoS.

>   Would this not be mitigated by running various forms of bandwidth
> shaping and packet filtering, thoughtfully taking into consideration
> communications between abstracted interfaces?  (I've worked in big jail
> clusters which were NAT'd on the host server, routing was very modular,
> controllable, predictable... well worth the added overhead and
> complexity...)
No.

> Ok, so to totally put a halt to the FUD about features here, and after
> reading up on Xen, I don't see any features which are not covered when
> running jails.  Here's why:
> 
> > You are able to use Xen to control damage from resource-based attacks
> > far more effectively than jail - for example, by telling Xen not to
> > allocate more than 256M to a given OS, no matter what. You just can't
> > do that with jail.
> 
> Actually, yes- you can do this with jailed systems, though it's done
> differently, insomuch as the scope of jailing as a whole is totally
> different, it leverages basic, time-tested, expendable, replaceable unix
> utilities to cover the features which Xen provides for itself.  There is
> no 'can't do' anything here, basic examples would be:
Jesus christ. I feel that I'm arguing with a Gentoo user who has just been
told that FreeBSD is much cooler. You have no understanding of how the
kernel works.

<snip>
> - dummynet or other for traffic shaping (first implemented in '96)
>    (I don't know if PF is capable of traffic-shaping on FreeBSD?)
> 
> - Quotas, disk partition schemes, or Image-based disk schemes
>    for resource-based controls (The quota command appeared in 4.2BSD)
> 
> And anything else in the OS can be added to this list to control 
> resources in a common manner.
No, you cannot. You can't even reliably control *user* memory allocated by
jailed processes. In addition to that, the kernel will allocate memory
based on userspace requests. Kernel memory is not accounted to a specific
user. It is possible to starve the kernel by having it allocate too much
memory. Each little thing that you do will allocate a kernel resource. I
don't know if the kernel accounts for each socket you open and can place
limits on that. Does it account for each file that you open?

> ** SERIOUSLY ** What can Xen do that cannot be done with fundamental
> tools on ANY good UNIX?  I am truly interested here.
Here's one: let each user have their own init and inittab. Let each user
choose their own freaking kernel.

> --
> Questions/interests I still have about Xen- but perhaps in the next few 
> weeks of reading, I'll learn more about it here:
> 
> >> I really wasn't referring to vmware or bochs, but was alluding to
> >> Jail(8)'ing, which has more to do with the overall history of FreeBSD
> >> itself, (not too shabby a rep for stability/security/performance <g>).
> > Apples to oranges, again. You can't compare xen to jail. You just
> > can't. Different things. Designed to solve different problems.
> 
> Seems like- but then what I'm still wondering is, what are the Xen
> advantages when applied to hosting common internet applications?  What
> does it have over jail(8), feature-wise, or in the fundamental
> differences in use?
If you don't get it by now, you won't. I give up.

> > If your problem *can* be solved by jail, and you trust jail enough,
> > use jail. Xen is designed to solve a different problem.
> 
> What different problem?!  I don't see the different problem, aside from
> kernel-dev abilities? Jail() 'was designed to solve particular SECURITY
> problems in ways chroot did not address', which ended up manifest as a
> Virtual Machine.  It's that simple. What was Xen intended for in its
> design?
Jail is not a virtual machine. Xen is. I can't talk any more if we don't 
agree on basic terminology.

> So at the end of this email, all I have found as a major difference is
> the *approach* to confining the VM system, not the features, and
> certainly better/worse type arguments don't REALLY apply here.  Period.
One is a VM, the other isn't.

> In the BSD's, much less is done in the kernel (than is my understanding
> of what is done in Linux kernel)- so from a paradigm perspective, I
> believe we are simply solving similar problem, (Virtualizing an OS to
> provide services to mutually untrusted users), from totally different
> paradigms in UNIX.  That's all.  And I think Xen's side of the solution
> is fascinating- and hope to learn more...
I give up. 

-alex