[nycbug-talk] Memory sizing

Sun Apr 23 14:46:25 EDT 2006

Francisco,

I reread some of McKusick's book, and I think I found the answer for
buffer vs. cache in top.  Cache is just another memory pool, like
inactive or free.  It contains pages that can still be referenced by
something (a process, a file, etc.) but that hold no changes still
waiting to be written to disk (i.e. they are clean).  Buffer is a
layer that sits between the file system and cache.  When they merged
filesystem buffering into the vm, they kept a sort of emulation layer
so they didn't have to rewrite all the filesystems.  This buffer cache
basically behaves the same way it used to on the filesystem side, but
it references the vm system instead of ram directly.

On Apr 23, 2006, at 1:18 PM, Francisco Reyes wrote:

> The 'r' column in vmstat from what I have read is number of  
> processes waiting for CPU and the 'b' column is number of pending  
> transactions waiting to get done.

That is my understanding too.  More precisely, r is also referred to
as the run queue, and b is the number of processes that are blocked
because they are waiting for a resource to free up.

> Also I believe the vmstat numbers depend on some variables that are  
> updated every 5 seconds so any number below 5 will not be all that
> meaningful. Best explanation of vmstat I have found is Absolute BSD
> by Michael Lucas (page 432 onward).

"These are averaged each five seconds, and given in units per  
second." (man vmstat)  I am assuming this means that every second it
reports the average over the past 5 seconds.  This also applies only
to the page columns; I believe the rest is based on the interval you
set for the tool.  You may not see the full extent of a spike, but you
should see some movement at least.
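
To illustrate why a spike should still show up, here is a minimal
sketch of a trailing 5-sample average.  It is only a model of my
assumption above, not how vmstat is actually implemented, and the
per-second counts are made up for the example.

```python
from collections import deque


def trailing_average(samples, window=5):
    """Return, for each sample, the mean of the last `window` samples seen."""
    recent = deque(maxlen=window)
    averages = []
    for value in samples:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical per-second page-fault counts with a single one-second spike.
per_second = [0, 0, 0, 0, 100, 0, 0, 0, 0, 0]
smoothed = trailing_average(per_second)

# The spike is flattened to a fifth of its height but stays visible
# for as long as it remains inside the 5-sample window.
print(smoothed)
```

With a window like this, a one-second burst is spread over five
reports, so the peak is understated but the movement is not lost.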

> Ok I am with you so far.
> Specially I think I mostly get/understand/agree with the meaning of  
> active, inactive and wired.
> If we tally up those numbers:
> 92  Active
> 111 Inact
> 56  Wired
> 13  Cache
> 38  Buf
> 1  Free (rounded)
> ---
> 311

Looking at these numbers, it looks like you are in the same boat as
me :)  When I reread some parts of the McKusick book, I found that
Inactive, Cache, and Free each have a target percentage that the vm
tries to maintain.  So I guess it's not enough to look at your free
list and assume you need more ram because it's empty.
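
As a rough sketch of that point, the tally above can be redone with
the share of each pool.  Lumping Inact, Cache, and Free together as
"easy to reuse" is my simplification, and the tally follows the quoted
message in counting Buf as its own pool even though it may overlap
with the others.

```python
# Pool sizes in MB, copied from the quoted top output.
pools_mb = {
    "Active": 92,
    "Inact": 111,
    "Wired": 56,
    "Cache": 13,
    "Buf": 38,
    "Free": 1,
}

total_mb = sum(pools_mb.values())

# Assumption: these three pools are the ones the vm can hand out cheaply.
reusable_mb = pools_mb["Inact"] + pools_mb["Cache"] + pools_mb["Free"]

for name, size in pools_mb.items():
    print(f"{name:<7}{size:>5} MB {100 * size / total_mb:5.1f}%")
print(f"Total  {total_mb:>5} MB")
print(f"Inact + Cache + Free: {reusable_mb} MB "
      f"({100 * reusable_mb / total_mb:.1f}% of the tally)")
```

So even with Free at about 1M, a large part of the tally sits in pools
that can be reused without going to swap.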

>> this is pretty much due to vm swapping:
>> bjorn at host=>ps -axo inblock,oublock,comm | sort -n -k 2 | tail -3
>>    325  1152 ntpd
>>      0  8695 bufdaemon
>> 69360 5017287 syncer
>
> Can you explain that a little more please?
> Inblock and oublock is what? The blocks read and written by an app?
> Man page has:
> inblk      total blocks read (alias inblock)
> oublk      total blocks written (alias oublock)

That is my understanding of it.  I am not sure if this is specific to
disk though; it might also count network, or anything else that can be
opened using an open syscall.  Also, I believe the unit for this is
your block size, 4096 for instance.  Someone on the list chime in if
they know the right unit for this (is it 1024?)
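
Since the unit is not settled, here is a small sketch that converts
the syncer's oublock count from my ps output under a few candidate
block sizes.  The sizes are guesses for comparison only; none of them
is confirmed as the unit ps actually uses.

```python
# Candidate block sizes in bytes. All three are assumptions.
CANDIDATE_BLOCK_SIZES = (512, 1024, 4096)


def blocks_to_bytes(blocks, block_size):
    """Convert a block count to bytes for a given block size."""
    return blocks * block_size


# oublock value for syncer, taken from the ps output quoted above.
syncer_oublock = 5017287

estimates = {
    size: blocks_to_bytes(syncer_oublock, size)
    for size in CANDIDATE_BLOCK_SIZES
}

for size, nbytes in estimates.items():
    print(f"{size:>5} bytes/block -> {nbytes / 2**30:.1f} GiB written")
```

The spread between the smallest and largest guess is a factor of 8, so
the unit matters before drawing conclusions from the absolute numbers.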

> I will have to readup on those two keywords.
> From man page...
> majflt     total page faults
> minflt     total page reclaims
>
> Is a page fault basically reading from swap?
> Re-reading the vmstat section on absolute BSD as I type this. :)
> I think I am for the most part clear on majflt.
> minflt seems similar to 're' in vmstat..
> "shows how many pages have been reclaimed or reused from cache"
> (Absolute BSD page 434).

Minor faults are reclaims from inactive and, I believe, cache as well.

> What is that "cache" referred to by the book?

I believe Michael is referring to inactive and cache.

>> bjorn at host=>ps -axo majflt,minflt,comm | sort -n -k 1 | tail -5
>>      18    394 sshd
>>      19   1746 named
>>      24  18078 python
>>      41    220 httpd
>>     128  22273 httpd
>
> You say your server is mostly "paging in". Again from Absolute BSD
> page 434
> pi "Short for pages in, it shows how many pages are moving from  
> physical memory to swap"..

Which flies in the face of McKusick:
pagein: An operation done by the virtual-memory system in which the  
contents of a page are read from secondary storage.
pageout: An operation done by the virtual-memory system in which the
contents of a page are written to secondary storage. (McKusick,  
Neville-Neil.  The Design and Implementation of the FreeBSD Operating  
System. Pearson. 2005. p 635)

> So page in is going into swap.. and major fault is coming out from  
> swap?
> Hm..
> Look at this output from one of my machines:
> ps -axo majflt,minflt,comm | grep -v "   0      0 " | sort -n -k
> .... lines of interest ....
> MAJFLT MINFLT COMMAND
> 106    1717796  mysqld
> 1    142441891 bacula-fd

Pagein is grabbing from swap, the same as a major fault, although I
don't think a pagein will necessarily end up in the resident set.  So
it looks like bacula-fd is active enough to keep from having to
reclaim from disk, but not active enough to hold on to its pages.
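
A minimal sketch of that comparison, parsing the two lines of
interest from the quoted ps output and printing minor faults per major
fault.  The helper name parse_faults is mine, and the ratio is only a
rough indicator, not a standard metric.

```python
def parse_faults(ps_output):
    """Parse `ps -axo majflt,minflt,comm` style text into tuples."""
    rows = []
    for line in ps_output.strip().splitlines():
        fields = line.split(None, 2)
        # Skip the header and anything that is not "number number command".
        if len(fields) != 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        rows.append((int(fields[0]), int(fields[1]), fields[2].strip()))
    return rows


# Lines of interest copied from the quoted message.
SAMPLE = """
MAJFLT MINFLT COMMAND
106    1717796  mysqld
1    142441891 bacula-fd
"""

fault_rows = parse_faults(SAMPLE)
for majflt, minflt, comm in fault_rows:
    # Guard against division by zero for processes with no major faults.
    ratio = minflt / max(majflt, 1)
    print(f"{comm:<10} major={majflt:<5} minor={minflt:<10} "
          f"minor per major={ratio:,.0f}")
```

A very high ratio, as with bacula-fd, fits the reading above: almost
everything is reclaimed from memory rather than read back from disk.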

>> minor faults show that my mail and imap servers are reclaiming  
>> from  the inactive memory pool.  These process are probably the  
>> most active  since they don't have a high number of minfaults but  
>> not major faults.

Sorry I meant to say they _do_ have a high number of minfaults but  
not major faults.

> That was going to be my next question. :-)
> Ok.. so those processes above are hitting "Inactive" memory. I wish  
> they had used a different name.. doesn't sound like that memory is  
> inactive at all. :-)
>
> The memory for the machine running bacula is
> Mem: 333M Active, 2269M Inact, 301M Wired, 104M Cache, 112M Buf,  
> 4564K Free
>
> swapinfo
> Device          1K-blocks     Used    Avail Capacity
> /dev/da0s1b       6291456      216  6291240     0%
>
> So basically the Inact pool is what is getting used the most.  
> Specially given all the minfaults and few major faults.. plus swap  
> rarely used.

One thing that is kind of interesting is just to watch vmstat 1 and
see what happens with your system when you perform the tasks you will
normally be doing.  When I click the "Get Mail" button on my page, my
vm system jumps:
0 6 0  199560  15360    0   0   0   0   0   0   0 1296  234 354  1  2 98
0 6 0  199560  15360    0   0   0   0   0   0   0 1290  234 344  1  2 98
3 6 0  202744  13744  934   3   6   0 617   0  37 1355 1831 684 42 12 46
1 6 0  203684  12552  314   0   0   0 303   0   5 1449  808 585 86 14  0
0 8 0  202920  11660  212   0   0   0 350   0  72 1388 1994 749 85  8  7
0 8 0  202920  10692    0   0   0   0 242   0  98 1396 2952 823 25  8 67

When you do a query or a backup on your server, what do you see in
vmstat?  Also, what kind of config are you trying to spec?  A DB
server and backup for the DB server, or a general backup server?
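
If you want to compare runs, here is a rough sketch that parses the
vmstat lines above and picks out the seconds where the page-fault
column is non-zero.  The header was not pasted, so the column names
below are my assumption based on the usual FreeBSD vmstat layout with
one disk column.

```python
# Assumed column order (the header is not part of the pasted output).
COLUMNS = ["r", "b", "w", "avm", "fre", "flt", "re", "pi", "po", "fr", "sr",
           "disk", "in", "sy", "cs", "us", "sys", "id"]

SAMPLE = """
0 6 0  199560  15360    0   0   0   0   0   0   0 1296  234 354  1  2 98
0 6 0  199560  15360    0   0   0   0   0   0   0 1290  234 344  1  2 98
3 6 0  202744  13744  934   3   6   0 617   0  37 1355 1831 684 42 12 46
1 6 0  203684  12552  314   0   0   0 303   0   5 1449  808 585 86 14  0
0 8 0  202920  11660  212   0   0   0 350   0  72 1388 1994 749 85  8  7
0 8 0  202920  10692    0   0   0   0 242   0  98 1396 2952 823 25  8 67
"""


def parse_vmstat(text):
    """Turn whitespace-separated vmstat rows into dicts keyed by COLUMNS."""
    rows = []
    for line in text.strip().splitlines():
        values = [int(v) for v in line.split()]
        if len(values) != len(COLUMNS):
            continue  # ignore rows that do not match the assumed layout
        rows.append(dict(zip(COLUMNS, values)))
    return rows


vm_rows = parse_vmstat(SAMPLE)

# Seconds (0-based) in which page faults were reported.
faulting = [i for i, row in enumerate(vm_rows) if row["flt"] > 0]
# How far the free list dropped between the first and last sample.
free_drop = vm_rows[0]["fre"] - vm_rows[-1]["fre"]

print("seconds with page faults:", faulting)
print("lowest idle CPU:", min(row["id"] for row in vm_rows))
print("drop in fre over the sample:", free_drop)
```

Running the same thing against output captured during a query or a
backup would make it easier to compare the two workloads side by side.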

>> The result of this is that I would probably be fine with a gig of   
>> ram.
>
> Only two pieces of info I didn't see.
> What is the amount of physical memory? What is your "swapinfo"?

My current ram size is 256M and my swap:
bjorn at host=>swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/ad0s1b        524288      716   524288     0%

>> Francisco, can  you apply this to what you are contending with?
>
> Absolutely!
> In particular you put in content info I had read from the Absolute  
> BSD.
> It is one thing to see explanations and another to see them in  
> context.

Good, this is forcing me to check my facts as well and apply what I  
read from McKusick's book.  I also have to recommend McKusick's video  
courses:
http://www.mckusick.com/courses/index.html

-Bjorn