TECH: Unix Virtual Memory, Paging & Swapping explained (Doc ID 17094.1)

**********************************************************************************************
*** This article was correct at the time it was written (1999-2000) but is now out-of-date ***
**********************************************************************************************

====================================================================
Understanding and measuring memory usage on UNIX operating systems.
====================================================================

When planning an Oracle installation, it is often necessary to plan for
memory requirements.  To do this, it is necessary to understand how the
UNIX operating system allocates and manages physical and virtual memory
among the processes on the system.

------------------------------
I.  Virtual memory and paging
------------------------------

Modern UNIX operating systems all support virtual memory.  Virtual
memory is a technique developed around 1961 which allows the size of a
process to exceed the amount of physical memory available for it.  (A
process is an instance of a running program.)  Virtual memory also
allows the sum of the sizes of all processes on the system to exceed
the amount of physical memory available on the machine.  (Contrast this
with a system running MS-DOS or the Apple Macintosh, in which the amount of
physical memory limits both the size of a single process and the total
number of simultaneous processes.)

A full discussion of virtual memory is beyond the scope of this
article.  The basic idea behind virtual memory is that only part of a
particular process is in main memory (RAM), and the rest of the process
is stored on disk.  In a virtual memory system, the memory addresses
used by programs do not refer directly to physical memory.  Instead,
programs use virtual addresses, which are translated by the operating
system and the memory management unit (MMU) into the physical memory
(RAM) addresses.  This scheme works because most programs only use a
portion of their address space at any one time.

Modern UNIX systems use a paging-based virtual memory system.  In a
paging-based system, the virtual address space is divided up into
equal-sized chunks called pages.  The actual size of a single page is
dependent on the particular hardware platform and operating system
being used: page sizes of 4k and 8k are common.  The translation of
virtual addresses to physical addresses is done by mapping virtual
pages to physical pages.  When a process references a virtual address,
the MMU figures out which virtual page contains that address, and then
looks up the physical page which corresponds to that virtual page.
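
As a rough illustration of this split, the following C sketch breaks a
virtual address into its virtual page number and page offset, using the
page size reported by sysconf().  The variable names and printed values
are purely illustrative:

    #include <stdio.h>
    #include <unistd.h>     /* sysconf() */

    int main(void)
    {
        long page_size = sysconf(_SC_PAGESIZE);    /* e.g. 4096 or 8192 */
        int some_data = 42;
        unsigned long addr = (unsigned long)&some_data;

        /* A virtual address splits into a virtual page number and an
         * offset within that page; the MMU translates the page number,
         * while the offset is carried through unchanged. */
        unsigned long vpn    = addr / (unsigned long)page_size;
        unsigned long offset = addr % (unsigned long)page_size;

        printf("page size          = %ld bytes\n", page_size);
        printf("virtual address    = 0x%lx\n", addr);
        printf("virtual page no.   = %lu\n", vpn);
        printf("offset within page = %lu\n", offset);
        return 0;
    }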

One of two things is possible at this point: either the physical page
is loaded into RAM, or it is on disk.  If the physical page is in RAM,
the process uses it.  If the physical page is on disk, the MMU
generates a page fault.  At this point the operating system locates the
page on disk, finds a free physical page in RAM, copies the page from
disk into RAM, tells the MMU about the new mapping, and restarts the
instruction that generated the page fault.

Note that the virtual-to-physical page translation is invisible to the
process.  The process "sees" the entire virtual address space as its
own: whenever it refers to an address, it finds memory at that
address.  All translation of virtual to physical addresses and all
handling of page faults is performed on behalf of the process by the
MMU and the operating system.  This does not mean that taking a page
fault has no effect.  Since handling a page fault requires reading the
page in from disk, a process that takes a lot of page faults will run
much slower than one that does not.

In a virtual memory system, only a portion of a process's virtual
address space is mapped into RAM at any particular time.  In a
paging-based system, this notion is formalized as the working set of a
process.  The working set of a process is simply the set of pages that
the process is using at a particular point in time.  The working set of
a process will change over time.  This means that some page faulting
will occur, and is normal.  Also, since the working set changes over
time, the size of the working set changes over time as well.  The
operating system's paging subsystem tries to keep all the pages in the
process's working set in RAM, thus minimizing the number of page faults
and keeping performance high.  By the same token, the operating system
tries to keep the pages not in the working set on disk, so as to leave
the maximum amount of RAM available for other processes.

Recall from above that when a process generates a page fault, the
operating system must read the absent page into RAM from disk.  This
means that the operating system must choose which page of RAM to
use for this purpose.  In the general case, there may not be a free
page of physical RAM, and the operating system will have to read the
data for the new page into a physical page that is already in use.  The
choice of which in-use page to replace with the new data is called the
page replacement policy.

Entire books have been written on various page replacement policies and
algorithms, so a full discussion of them is beyond the scope of this
article.  It is important to note, however, that there are two general
classes of page replacement policy: local and global.  In a local page
replacement policy, a process is assigned a certain number of physical
pages, and when a page fault occurs the operating system finds a free
page within the set of pages assigned to that process.  In a global
page replacement policy, when a page fault occurs the operating system
looks at all processes in the system to find a free page for the
process.

There are a number of key points to understand about paging.

(1) Only a relatively small fraction of a single process's pages
(typically 10% - 50%) are in its working set (and therefore in
physical memory) at any one time.

(2) The location of physical pages in RAM bears no relation whatever to
the location of pages in any process's virtual address space.

(3) Most implementations of paging allow for a single physical page to
be shared among multiple processes.  In other words, if the operating
system can determine that the contents of two (or more) virtual pages
are identical, only a single physical page of RAM is needed for those
virtual pages.

(4) Since working set sizes change over time, the amount of physical
memory that a process needs changes over time as well.  An idle process
requires no RAM; if the same process starts manipulating a large data
structure (possibly in response to some user input) its RAM requirement
will soar.

(5) There exists a formal proof that it is impossible to determine
working set sizes from a static analysis of a program.  You must run a
program to determine its working set.  If the working set of the
program varies according to its input (which is almost always the case)
the working sets of two processes will be different if the processes
have different inputs.

---------------------------
II. Virtual memory on Unix
---------------------------

The discussion above of virtual memory and paging is a very general
one, and all of the statements in it apply to any system that
implements virtual memory and paging.  A full discussion of paging and
virtual memory implementation on UNIX is beyond the scope of this
article.  In addition, different UNIX vendors have implemented
different paging subsystems, so you need to contact your UNIX vendor
for precise information about the paging algorithms on your UNIX
machine.  However, there are certain key features of the UNIX paging
system which are consistent among UNIX ports.

Processes run in a virtual address space, and the UNIX kernel
transparently manages the paging of physical memory for all processes
on the system.  Because UNIX uses virtual memory and paging, typically
only a portion of the process is in RAM, while the remainder of the
process is on disk.

1) The System Memory Map

The physical memory on a UNIX system is divided among three uses.  Some
portion of the memory is dedicated for use by the operating system
kernel.  Of the remaining memory, some is dedicated for use by the I/O
subsystem (this is called the buffer cache) and the remainder goes into
the page pool. 

Some versions of UNIX statically assign the sizes of system memory, the
buffer cache, and the page pool, at system boot time; while other
versions will dynamically move RAM between these three at run time,
depending on system load.  (Consult your UNIX system vendor for details
on your particular version of UNIX.)

The physical memory used by processes comes out of the page pool.  In
addition, the UNIX kernel allocates a certain amount of system memory
for each process for data structures that allow it to keep track of
that process.  This memory is typically not more than a few pages.  If
your system memory size is fixed at boot time you can completely ignore
this usage, as it does not come out of the page pool.   If your system
memory size is adjusted dynamically at run-time, you can also typically
ignore this usage, as it is dwarfed by the page pool requirements of
Oracle software.

2)  Global Paging Strategy

UNIX systems implement a global paging strategy.  This means that the
operating system will look at all processes on the system when it is
searching for a page of physical memory on behalf of a process.  This
strategy has a number of advantages, and one key disadvantage.

The advantages of a global paging strategy are:  (1) An idle process
can be completely paged out so it does not hold memory pages that can
be better used by another process.  (2) A global strategy allows for a
better utilization of system memory; each process's page allocations
will be closer to their actual working set size.  (3) The administrative
overhead of managing process or user page quotas is completely
absent.  (4) The implementation is smaller and faster.

The disadvantage of a global strategy is that it is possible for a
single ill-behaved process to affect the performance of all processes
on the system, simply by allocating and using a large number of pages.

3)  Text and Data Pages

A UNIX process can be conceptually divided into two portions; text and
data.  The text portion contains the machine instructions that the
process executes; the data portion contains everything else.  These two
portions occupy different areas of the process's virtual address
space.  Both text and data pages are managed by the paging subsystem.
This means that at any point in time, only some of the text pages and
only some of the data pages of any given process are in RAM.

UNIX treats text pages and data pages differently.  Since text pages
are typically not modified by a process while it executes, text pages
are marked read-only.  This means that the operating system will
generate an error if a process attempts to write to a text page.  (Some
UNIX systems provide the ability to compile a program which does not
have read-only text: consult the man pages on 'ld' and 'a.out' for
details.) 

The fact that text pages are read-only allows the UNIX kernel to
perform two important optimizations:  text pages are shared between all
processes running the same program, and text pages are paged from the
filesystem instead of from the paging area.  Sharing text pages between
processes reduces the amount of RAM required to run multiple instances
of the same program.  For example, if five processes are running Oracle
Forms, only one set of text pages is required for all five processes.
The same is true if there are fifty or five hundred processes running
Oracle Forms.  Paging from the filesystem means that no paging space
needs to be allocated for any text pages.  When a text page is paged
out it is simply over-written in RAM;  if it is paged in at a later
time the original text page is available in the program image in the
file system.

On the other hand, data pages must be read/write, and therefore cannot
(in general) be shared between processes.  This means that each process
must have its own copy of every data page.  Also, since a process can
modify its data pages, when a data page is paged out it must be written
to disk before it is over-written in RAM.  Data pages are written to
specially reserved sections of the disk.  For historical reasons, this
paging space is called "swap space" on UNIX.  Don't let this name
confuse you: the swap space is used for paging.

4) Swap Space Usage

The UNIX kernel is in charge of managing which data pages are in RAM
and which are in the swap space.  The swap space is divided into swap
pages, which are the same size as the RAM pages.  For example, if a
particular system has a page size of 4K, and 40M devoted to swap space,
this swap space will be divided up into 10240 swap pages.

A page of swap can be in one of three states: it can be free, allocated,
or used.  A "free" page of swap is available to be allocated as a disk
page.  An "allocated" page of swap has been allocated to be the disk
page for a particular virtual page in a particular process, but no data
has been written to the disk page yet -- that is, the corresponding
memory page has not yet been paged out.  A "used" page of swap is one
where the swap page contains the data which has been paged out from RAM.
A swap page is not freed until the process which "owns" it frees the
corresponding virtual page.

On most UNIX systems, swap pages are allocated when virtual memory is
allocated.  If a process requests an additional 1M of (virtual) memory,
the UNIX kernel finds 1M of pages in the swap space, and marks those
pages as allocated to a particular process.  If at some future time a
particular page of RAM must be paged out, swap space is already
allocated for it.  In other words, every virtual data page is "backed
with" a page of swap space.

An important consequence of this strategy is that if all the swap space is
allocated, no more virtual memory can be allocated.  In other words,
the amount of swap space on a system limits the maximum amount of
virtual memory on the system.  If there is no swap space available, and
a process makes a request for more virtual memory, then the request
will fail.  The request will also fail if there is some swap space
available, but the amount available is less than the amount requested.

There are four system calls which allocate virtual memory: these are
fork(), exec(), sbrk(), and shmget().  When one of these system calls
fails, the system error code is set to EAGAIN.  The text message
associated with EAGAIN is often "No more processes".  (This is because
EAGAIN is also used to indicate that the per-user or system-wide
process limit has been reached.)  If you ever run into a situation
where processes are failing because of EAGAIN errors, be sure to check
the amount of available swap as well as the number of processes.
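
For example, here is a minimal C sketch of checking for this condition
around fork().  The error handling shown is illustrative only; the exact
errno values returned vary between UNIX ports:

    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <unistd.h>
    #include <sys/types.h>

    int main(void)
    {
        pid_t pid = fork();    /* allocates virtual memory for the child */

        if (pid == -1) {
            /* EAGAIN may mean "no more processes" OR "not enough swap
             * to back the new process" -- check both when you see it. */
            fprintf(stderr, "fork failed: %s%s\n", strerror(errno),
                    errno == EAGAIN
                        ? " (check swap space and process limits)" : "");
            return 1;
        }
        if (pid == 0)
            _exit(0);          /* child exits immediately */
        return 0;
    }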

If a system has run out of swap space, there are only two ways to fix
the problem: you can either terminate some processes (preferably ones
that are using a lot of virtual memory) or you can add swap space to
your system.  The method for adding swap space to a system varies
between UNIX variants: consult your operating system documentation or
vendor for details.

5) Shared Memory

UNIX systems implement, and the Oracle server uses, shared memory.  In
the UNIX shared memory implementation, processes can create and attach
shared memory segments.  Shared memory segments are attached to a
process at a particular virtual address.  Once a shared memory segment
is attached to a process, memory at that address can be read from and
written to, just like any other memory in the process's address space.
Unlike "normal" virtual memory, changes written to an address in the
shared memory segment are visible to every process that has attached to
that segment.
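
A minimal C sketch of creating and attaching such a segment with
shmget() and shmat() follows; the key, size, and permission values used
here are arbitrary illustrative choices:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
        /* Create (or find) a 1MB shared memory segment.  Swap space for
         * the whole segment is reserved when it is allocated. */
        int shmid = shmget((key_t)0x5150, 1024 * 1024, IPC_CREAT | 0600);
        if (shmid == -1) { perror("shmget"); exit(1); }

        /* Attach it at an address chosen by the kernel.  After this the
         * pages behave like ordinary read/write data pages, except that
         * every attached process sees the same contents. */
        char *base = (char *)shmat(shmid, NULL, 0);
        if (base == (char *)-1) { perror("shmat"); exit(1); }

        strcpy(base, "visible to every process attached to this segment");
        printf("segment %d attached at address %p\n", shmid, (void *)base);

        shmdt(base);                    /* detach from this process      */
        shmctl(shmid, IPC_RMID, NULL);  /* mark the segment for removal  */
        return 0;
    }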

Shared memory is made up of data pages, just like "conventional"
memory.  Other than the fact that multiple processes are using the same
data pages, the paging subsystem does not treat shared memory pages any
differently than conventional memory.  Swap space is reserved for
a shared memory segment at the time it is allocated, and the pages of
memory in RAM are subject to being paged out if they are not in use,
just like regular data pages.  The only difference between the
treatment of regular data pages and shared data pages is that shared
pages are allocated only once, no matter how many processes are using
the shared memory segment.

6) Memory Usage of a Process

When discussing the memory usage of a process, there are really two
types of memory usage to consider: the virtual memory usage and the
physical memory usage. 

The virtual memory usage of a process is the sum of the virtual text
pages allocated to the process, plus the sum of the virtual data pages
allocated to the process.  Each non-shared virtual data page has a
corresponding page allocated for it in the swap space.  There is no
system-wide limit on the number of virtual text pages, and the number
of virtual data pages on the system is limited by the size of the swap
space.  Shared memory segments are allocated on a system-wide basis
rather than on a per-process basis, but are allocated swap pages and
are paged from the swap device in exactly the same way as non-shared
data.

The physical memory usage of a process is the sum of the physical text
pages of that process, plus the sum of the physical data pages of that
process.  Physical text pages are shared among all processes running
the same executable image, and physical data pages used for shared
memory are shared among all processes attached to the same shared
memory segment.  Because UNIX implements virtual memory, the physical
memory usage of a process will be lower than the virtual memory usage.

The actual amount of physical memory used by a process depends on the
behavior of the operating system paging subsystem.  Unlike the virtual
memory usage of a process, which will be the same every time a
particular program runs with a particular input, the physical memory
usage of a process depends on a number of other factors. 

First: since the working set of a process changes over time, the amount
of physical memory needed by the process will change over time.
Second: if the process is waiting for user input, the amount of
physical memory it needs will drop dramatically.  (This is a special
case of the working set size changing.)  Third: the amount of physical
memory actually allocated to a process depends on the overall system
load.  If a process is being run on a heavily loaded system, then the
global page allocation policy will tend to keep the number of physical
memory pages allocated to that process very close to the size of
the working set.  If the same program is run with the same input on a
lightly loaded system, the number of physical memory pages allocated to
that process will tend to be much larger than the size of the working
set:  the operating system has no need to reclaim physical pages from
that process, and will not do so.

The net effect of this is that any measure of physical memory usage
will be inaccurate unless you are simulating both the input and the
system load of the final system you will be testing.  For example, the
physical memory usage of an Oracle Forms process will be very different
if a user is rapidly moving between 3 large windows, infrequently
moving between the same three windows, rapidly typing into a single
window, slowly typing into the same window, or if they are reading data
off of the screen and the process is sitting idle -- even though the
virtual memory usage of the process will remain the same.  By the same
token, the physical memory usage of an Oracle Forms process will be
different if it is the only active process on a system, or if it is one
of fifty active Oracle Forms processes on the same system.

7) Key Points

There are a number of key points to understand about the UNIX virtual
memory implementation.

(1)  Every data page in every process is "backed" by a page in the swap
space.  The size of the swap space limits the amount of virtual data
space on the system;  processes are not able to allocate memory if
there is not enough swap space available to back it up, regardless of
how much physical memory is available on the system.

(2)  UNIX implements a global paging strategy.  This means that the
amount of physical memory allocated to a process varies greatly over
time, depending on the size of the process's working set and the
overall system load.  Idle processes may be paged out completely on a
busy system.  On a lightly loaded system processes may be allocated
much more physical memory than they require for their working sets.

(3)  The amount of virtual memory available on a system is determined
by the amount of swap space configured for that system.  The amount of
swap space needed is equal to the sum of the virtual data allocated by
all processes on the system at the time of maximum load.

(4)  Physical memory is allocated for processes out of the page pool,
which is the memory not allocated to the operating system kernel and
the buffer cache.  The amount of physical memory needed for the page
pool is equal to the sum of the physical pages in the working sets of
all processes on the system at the time of maximum load.

----------------------------------
III. Process Memory Layout on UNIX
----------------------------------

1) The Segments of a Process

The discussion above speaks of a UNIX process as being divided up into
two regions: text and data.  This division is accurate for discussions
of the paging subsystem, since the paging subsystem treats every
non-text page as a data page.  In fact, a UNIX process is divided into
six segments: text, stack, heap, BSS, initialized data, and shared
memory.  Each of these segments contains a different type of information
and is used for a different purpose.

The text segment is used to store the machine instructions that the
process executes.  The pages that make up the text segment are marked
read-only and are shared between processes that are running the same
executable image.  Pages from the text segment are paged from the
executable image in the filesystem.  The size of the text segment is
fixed at the time that the program is invoked: it does not grow or
shrink during program execution.

The stack segment is used to store the run-time execution stack.  The
run-time program stack contains function and procedure activation
records, function and procedure parameters, and the data for local
variables.  The pages that make up the stack segment are marked
read/write and are private to the process.   Pages from the stack
segment are paged into the swap device.  The initial size of the stack
segment is typically one page;  if the process references an address
beyond the end of the stack the operating system will transparently
allocate another page to the stack segment. 

The BSS segment is used to store statically allocated uninitialized
data.  The pages that make up the BSS segment are marked read/write,
are private to the process, and are initialized to all-bits-zero at
the time the program is invoked.  Pages from the BSS segment are paged
into the swap device.   The size of the BSS segment is fixed at the
time the program is invoked: it does not grow or shrink during program
execution.

The initialized data segment is used to store statically allocated
initialized data.  The pages that make up the initialized data segment
are marked read/write, and are private to the process.  Pages from the
initialized data segment are initially read in from the initialized
data in the filesystem; if they have been modified they are paged into
the swap device from then on.   The size of the initialized data
segment is fixed at the time the program is invoked: it does not grow
or shrink during program execution.

The dynamically allocated data segment (or "heap") contains data pages
which have been allocated by the process as it runs, using the brk() or
sbrk() system call.  The pages that make up the heap are marked
read/write, are private to the process, and are initialized to
all-bits-zero at the time the page is allocated to the process.  Pages
from the heap are paged into the swap device.  At program startup the
heap has zero size: it can grow arbitrarily large during program
execution.

Most processes do not have a shared data segment.  In those that do,
the shared data segment contains data pages which have been attached to
this process using the shmat() system call.  Shared memory segments are
created using the shmget() system call.  The pages that make up the
shared data segment are marked read/write, are shared between all
processes attached to the shared memory segment, and are initialized to
all-bits-zero at the time the segment is allocated using shmget().
Pages from the shared data segment are paged into the swap device.
Shared memory segments are dynamically allocated by processes on the
system:  the size of a shared memory segment is fixed at the time it is
allocated, but processes can allocate arbitrarily large shared memory
segments.
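
To make the layout concrete, the following C sketch prints one sample
address from each of the segments described above.  The addresses and
their relative ordering are entirely implementation-dependent:

    #include <stdio.h>
    #include <stdlib.h>

    int initialized_var = 1;     /* initialized data segment       */
    int uninitialized_var;       /* BSS segment, zero-filled       */

    int main(void)
    {
        int   stack_var = 0;                /* stack segment        */
        char *heap_var  = malloc(16);       /* heap, via malloc()   */

        printf("text (main)       : %p\n", (void *)main);
        printf("initialized data  : %p\n", (void *)&initialized_var);
        printf("BSS               : %p\n", (void *)&uninitialized_var);
        printf("heap              : %p\n", (void *)heap_var);
        printf("stack             : %p\n", (void *)&stack_var);

        free(heap_var);
        return 0;
    }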

2)  Per-Process Memory Map

The six segments that comprise a process can be laid out in memory in
any arbitrary way.  The exact details of the memory layout depend on
the architecture of the CPU and the design of the particular UNIX
implementation.  Typically, a UNIX process uses the entire virtual
address space of the processor.  Within this address space, certain
addresses are legal, and are used for particular segments.  Addresses
outside of any segment are illegal, and any attempt to read or write to
them will generate a 'Segmentation Violation' signal. 

The diagram below shows a typical UNIX per-process virtual memory map
for a 32-bit processor.  Note that this memory map covers the entire
virtual address space of the machine.  In this diagram, regions marked
with a 't' are the text segment, 's' indicates the stack segment, 'S'
the shared memory segment, 'h' the heap, 'd' the initialized data, and
'b' the BSS.  Blank spaces indicate illegal addresses.

+--------+-----+--------+----+---------------------+-------+----+----+
|tttttttt|sssss|        |SSSS|                     |hhhhhhh|dddd|bbbb|
|tttttttt|sssss| ->>    |SSSS|                 <<- |hhhhhhh|dddd|bbbb|
+--------+-----+--------+----+---------------------+-------+----+----+
0                                                                   2G

In this particular implementation, the text segment occupies the lowest
virtual addresses, and the BSS occupies the highest.  Note that memory
is laid out in such a way as to allow the stack segment and the heap
to grow.  The stack grows "up", toward higher virtual addresses, while
the heap grows "down", toward lower virtual addresses.  Also note that
the placement of the shared memory segment is critical: if it is
attached at too low an address it will prevent the stack from
growing, and if it is attached at too high an address it will
prevent the heap from growing.

3) Process size limits

All UNIX systems provide some method for limiting the virtual size of a
process.  Note that these limits are only on virtual memory usage:
there is no way to limit the amount of physical memory used by a
process or group of processes. 

On systems that are based on SVR3, there is a system-wide limit on the
virtual size of the data segment.  Changing this limit typically
requires you to change a UNIX kernel configuration parameter and relink
the kernel: check your operating system documentation for details.

On systems that are based on BSD or SVR4, there is a default limit on
the size of the stack segment and the data segment.  It is possible to
change these limits on a per-process basis; consult the man pages on
getrlimit() and setrlimit() for details.  If you are using the C-shell
as your login shell the 'limit' command provides a command-line
interface to these system calls.  Changing the system-wide default
typically requires that you change a UNIX kernel configuration
parameter and relink the kernel: check your operating system
documentation for details.
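
A small C sketch of reading the current limits with getrlimit() is shown
below; RLIMIT_DATA covers the heap, BSS and initialized data, while
RLIMIT_STACK covers the stack segment:

    #include <stdio.h>
    #include <sys/resource.h>

    static void show(const char *name, int resource)
    {
        struct rlimit rl;

        if (getrlimit(resource, &rl) == 0)
            printf("%-13s soft = %llu  hard = %llu\n", name,
                   (unsigned long long)rl.rlim_cur,
                   (unsigned long long)rl.rlim_max);
    }

    int main(void)
    {
        show("RLIMIT_DATA",  RLIMIT_DATA);    /* data segment limit  */
        show("RLIMIT_STACK", RLIMIT_STACK);   /* stack segment limit */
        return 0;
    }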

Most systems also provide a way to control the maximum size and number
of shared memory segments: this typically involves changing the UNIX
kernel parameters SHMMAX, SHMSEG and SHMMNI.  Again, consult your
operating system documentation for details.

4) The High-Water-Mark Effect

Recall from above that the size of the data segment can only be changed
by using the brk() and sbrk() system calls.  These system calls allow
you to either increase or decrease the size of the data segment.
However, most programs, including Oracle programs, do not use brk() or
sbrk() directly.  Instead, they use a pair of library functions
provided by the operating system vendor, called malloc() and free().

These two functions are used together to manage dynamic memory
allocation.  The two functions maintain a pool of free memory (called
the arena) for use by the process.  They do this by maintaining a data
structure that describes which portions of the heap are in use and which
are available.  When the process calls malloc(), a chunk of memory of
the requested size is obtained from the arena and returned to the
calling function.  When the process calls free(), the
previously-allocated chunk is returned to the arena making it available
for use by a later call to malloc().

If a process calls malloc() with a request that is larger than the
largest free chunk currently in the arena, malloc() will call sbrk() to
enlarge the size of the arena by enlarging the heap.  However, most
vendors' implementations of free() will not shrink the size of the arena
by returning memory to the operating system via sbrk().   Instead, they
simply place the free()d memory in the arena for later use.

The result of this implementation is that processes which use the
malloc() library exhibit a high-water-mark effect:  the virtual sizes
of the processes grow, but do not shrink.  Once a process has allocated
virtual memory from the operating system using malloc(), that memory
will remain part of the process until it terminates.  Fortunately, this
effect only applies to virtual memory;  memory returned to the arena is
quickly paged out and is not paged in until it is re-allocated via
malloc(). 
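
A short C sketch that makes the high-water-mark effect visible by
watching the program break with sbrk(0) follows.  Behaviour varies
between malloc() implementations; some allocators satisfy large requests
with mmap() instead of sbrk(), in which case the break will not move:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void report(const char *when)
    {
        /* sbrk(0) returns the current end of the data segment ("break"). */
        printf("%-20s break = %p\n", when, sbrk(0));
    }

    int main(void)
    {
        char *p;

        report("at startup:");

        p = malloc(8 * 1024 * 1024);        /* grow the arena by ~8MB    */
        report("after malloc(8MB):");

        free(p);                            /* memory returns to the     */
        report("after free():");            /* arena, not to the kernel  */
        return 0;
    }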

-------------------------
IV. Monitoring Memory Use
-------------------------

In the final analysis, there are only two things to be concerned with
when sizing memory for a UNIX system: do you have enough RAM, and do
you have enough swap space?  In order to answer these questions, it is
necessary to know how much virtual memory and how much physical memory
each process on the system is using.  Unfortunately, the standard UNIX
process monitoring tools do not provide a way to reliably determine
these figures.  The standard tools for examining memory usage on a UNIX
system are 'size', 'ipcs', 'ps', 'vmstat' and 'pstat'.  Most
SYSV-derived systems will also have the 'crash' utility: most
BSD-derived systems will allow you to run 'dbx' against the UNIX
kernel.

The 'size' utility works by performing a static analysis of the program
image.  It prints out the virtual memory size of the text, BSS and
initialized data segments.  It does not attempt to determine the size
of the stack and the heap, since both of these sizes can vary greatly
depending on the input to the program.  Since the combined size of the
stack and the heap is typically several hundred times larger than
the combined size of the BSS and the initialized data, this method is
the single most unreliable method of determining the runtime virtual
memory requirement of a program.  It is also the method used in the ICG
(Installation and Configuration Guide) to determine memory requirements
for Oracle programs.  The one useful
piece of information you can obtain from 'size' is the virtual size of
the text segment.  Since the text segment is paged from the filesystem,
knowing the virtual size of the text segment will not help you size
either swap space or RAM.

The 'ipcs' utility will print out the virtual memory size of all the
shared memory segments on the system.  Use the '-mb' flags to have it
print the size of the segments under the SEGSZ column.

The 'ps' utility will print out information about any process currently
active on the system.  On SYSV-based systems, using 'ps' with the '-l'
will cause 'ps' to print out the SZ field, which contains the virtual
size of the process's non-text segments, measured in pages.  On
BSD-based systems, using 'ps' with the '-u' flag will also cause the SZ
field to be printed.  While this figure is an accurate measure of the
virtual memory being used by this process, it is not accurate if the
process has attached a shared memory segment.  This means that when
sizing memory, you must subtract the size of the SGA (obtained via
'ipcs', above) from the virtual memory used by all of the Oracle
background and shadow processes.

On SVR4-based and BSD-based systems, using the BSD-style 'ps' command
with the '-u' flag will also cause the RSS field to be printed.  This
field contains the physical memory usage for the process.
Unfortunately, this value is the combined physical memory usage for all
the segments of the process, and does not distinguish between pages
private to the process and pages shared between processes.  Since text
and shared data pages are shared between processes, this means that
adding up the RSS sizes of all processes on the system will
over-estimate the amount of physical memory being used by the system.
This also means that if you add up the RSS fields for all the processes
on the system you may very well come up with a number larger than the
amount of RAM on your system!  While the RSS field is a good indicator
of how much RAM is required when there is only one process running a
program image, it does not tell you how much additional RAM is required
when a second process runs that same image.

The 'pstat' utility is also used to print per-process information.  If
it has a SZ or RSS field, the same limitations that apply to 'ps'
output also apply to 'pstat' output.  On some versions of UNIX, 'pstat'
invoked with a flag (typically '-s' or '-T') will give you information
about swap space usage.  Be careful!  Some UNIX versions will only
print out information about how much swap space is used, and not
about how much has been allocated.  On those machines you can run out
of swap, and 'pstat' will still tell you that you have plenty of swap
available.

The 'vmstat' utility is used to print out system-wide information on
the performance of the paging subsystem.  Its major limitation is that
it does not print out per-process information.  The format of 'vmstat'
output varies between UNIX ports: the key fields to look at are the
ones that measure the number of page-in and page-out events per
second.  Remember that some paging activity is normal, so you will have
to decide for yourself what number of pages-in or pages-out per second
means that your page pool is too small.

On SYSV-based systems, the 'sar' utility is used to print out
system-wide information on the performance of a wide variety of kernel
subsystems.  Like 'vmstat', its major limitation is that it does not
print out per-process information.  The '-r', '-g', and '-p' options
are the most useful for examining the behavior of the paging subsystem.

On SYSV-based systems, the 'crash' utility lets you directly examine
the contents of the operating system kernel data structures.  On
BSD-based systems, it is usually possible to use a kernel debugger to
examine these same data structures.  These data structures are always
hardware- and operating system-specific, so you will not only need a
general knowledge of UNIX internals, but you will also need knowledge of
the internals of that particular system.  However, if you have this
information (and a lot of patience) it is possible to get 'crash' to
give you precise information about virtual and physical memory usage on
a per-process basis.

Finally, there are a variety of public domain and vendor-specific tools
for monitoring memory usage.  Remember: you are looking for a utility
that lets you measure the physical memory usage of a process, and which
gives you separate values for the number of pages used by the text
segment, the shared memory segment, and the remainder of the process. 
Consult your operating system vendor for details.

----------------------------
V. Sizing Swap Space and RAM
----------------------------

The bottom line is that, while it is possible to estimate virtual and
physical memory usage on a UNIX machine, doing so is more of an art
than a science. 

First:  you must measure your actual application.  An Oracle Forms
application running in bitmapped mode, using 256 colors, 16 full-screen
windows, and retrieving thousands of records with a single query may
well use two orders of magnitude more stack and heap than an Oracle
Forms application running in character mode, using one window and only
retrieving a few dozen rows in any single query.  Similarly, a
server-only system with five hundred users logged into the database but
only fifty of them performing queries at any one time will have a far
lower RAM requirement than a server-only system which has only two
hundred users logged into the database all of which are continually
performing queries and updates.

Second: when measuring physical memory usage, make sure that your
system is as heavily loaded as it will be in a production situation.
It does no good to measure physical memory usage with 255 processes
running Oracle Forms if all 255 processes are sitting idle waiting for
input -- all of the processes are paged out waiting for input.

Sizing swap space is relatively easy.  Recall that every page of
virtual data must be backed with a page of swap.  This means that if
you can estimate the maximum virtual memory usage on your machine, you
have determined how much swap space you need.  Use the SZ column from
the 'ps' command to determine the virtual memory usage for the
processes running on the system.  The high-water mark can be your ally
in this measurement: take one process, run it as hard as you can, and
see how high you can drive the value of the SZ column. 
Add together the virtual memory used by the system processes to form
a baseline, then calculate the maximum amount of virtual memory used
by each incremental process (don't forget to count all processes that
get created when a user logs on, such as the shell and any dedicated
shadow processes).  The swap space requirement is simply the sum of the
SZ columns of all processes at the time of maximum load.  The careful
system administrator will add 10% to the swap space size for overhead
and emergencies.

Sizing RAM is somewhat more difficult.  First, start by determining the
amount of RAM dedicated for system space (this is usually printed in a
message during startup).  Note that tuning the operating system kernel
may increase the amount of RAM needed for system space.

Next, determine the amount of RAM needed for the buffer cache. 

Finally, determine the amount of RAM needed for the page pool.  You
will want to have enough RAM on the system so that the working set of
every active process can remain paged in at all times. 

--------------
VI. References
--------------

`Operating Systems Design and Implementation'
  Andrew S. Tanenbaum, Prentice-Hall, ISBN 0-13-637406-9
`The Design and Implementation of the 4.3BSD Unix Operating System',
  Samuel Leffler, Kirk McKusick, Michael Karels, John Quarterman,
  1989, Addison-Wesley, ISBN 0-201-06196-1
`The Design of the Unix Operating System', Maurice Bach, 1986,
  Prentice Hall, ISBN 0-13-201757-1
`The Magic Garden Explained: The Internals of Unix System V Release 4',
  Berny Goodheart, James Cox, 1994, Prentice Hall, ISBN
  0-13-098138-9.
