swap space

Fri Jun 14 05:24:44 AEST 1991

In article <1991Jun13.122803.25362 at ims.alaska.edu> floyd at ims.alaska.edu
(Floyd Davidson) writes:
>[...]
>In article <1991Jun13.065207.10089 at ucunix.san.uc.edu>
>adams at ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes: 

>>I am unsure of one other point:  As I understand it, the total (not the 
>>per-process limit, which is clearly 2.5MB) virtual address space of the 
>>UNIX-PC is 4 megabytes.  If this is in fact the case, increasing the 
>>available swap partition beyond its default maximum of around 4.5 MB 
>>(to allow for alternate blocks, filesystem overhead, etc.) cannot 
>>result in any benefit. 
> 
>Let's say you have 4Mb of swap space.  You run two programs that 
>each take up 1.5Mb of memory.  If you start one more there isn't 
>enough swap space to run it.  If you had 6Mb of swap space you 
>could run 3 processes that needed 1.5Mb of memory (and have some 
>left over...). 
> 
>In practice the only way I've found to bump that limit is with 
>gcc, which will take all the memory it can get.  I'm using 8Mb 
>of swap space.  At this moment 6Mb is free.  I've seen it down 
>to less than 2Mb. (And at the same time the load averages were 
>up around 4.00 and response time was long, and it was time to 
>do something else for a few hours while gcc compiled two  
>different versions of itself at the same time...  Not exactly 
>something you would normally do every day.) 
> 

Based on this experience, the 4MB limit must apply only to physically
addressable RAM and not to virtual memory.  (I assume the physical RAM
limit of 4MB could be increased by hardware modification, but as I
don't have the H/W manual I'll leave that topic be.) Therefore, you
should be able to allocate as much swap space as you can afford by
expanding partition 1 on disk 0, up to a (theoretical :-) maximum of
4GB.  (See below, however...) Unfortunately, the 3B1 kernel has only
one swap device configured, so it won't interleave the swap over two
(or more) drives the way 4.3BSD and later versions of System V can.
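
Floyd's arithmetic is easy to check.  The following is only a
back-of-the-envelope sketch in Python: it assumes every process needs
its whole image backed by swap and ignores the kernel, shared text and
whatever happens to be resident in RAM.  The function name is made up
for illustration.

```python
def swap_left_kb(swap_mb, image_sizes_mb):
    """Swap remaining (in KB) after backing every image in full.

    A negative result means the images do not all fit.
    """
    # Work in kilobytes so that 1.5MB stays an exact integer.
    swap_kb = int(swap_mb * 1024)
    used_kb = sum(int(size * 1024) for size in image_sizes_mb)
    return swap_kb - used_kb


for swap_mb in (4, 6):
    for count in (2, 3):
        left = swap_left_kb(swap_mb, [1.5] * count)
        verdict = "fits" if left >= 0 else "does NOT fit"
        print(f"{swap_mb}MB swap, {count} x 1.5MB: {verdict} ({left}KB)")
```

With 4MB the third 1.5MB process has nowhere to go; with 6MB all three
fit with room to spare, which is the point Floyd was making.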

In summary:

	Virtual Address Space:  Determined by processor word size and
	MMU design (as in MVS/XA).  The 68010 uses 32-bit addresses, so
	2^32 bytes or 4GB in principle, although the chip itself brings
	out only 24 address lines (16MB).  Only 3GB on the VAX due to
	hardware limitations.

	Swap space:  Amount of the above you can actually use at any one
	time, present on one or more reserved disk partitions.  This
	space is mapped into the physical RAM of the machine by the
	kernel.  In "true" System V systems (see below) the available
	space can range from (size of swap) to (size of swap + size of
	physical ram + size of shared text and data) depending on the
	particular state of the system at any given moment.
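
The two figures in the summary can be written out as a small Python
sketch.  The helper names and the sample sizes (8MB swap, 4MB RAM, 1MB
of shared text and data) are assumptions chosen only to make the
arithmetic concrete; they are not measurements from any real system.

```python
MB = 1024 * 1024


def address_space_bytes(address_bits):
    """Size of the virtual address space for a given address width."""
    return 2 ** address_bits


def available_range_mb(swap_mb, ram_mb, shared_mb):
    """(minimum, maximum) usable space per the "true" System V rule above."""
    return swap_mb, swap_mb + ram_mb + shared_mb


print(address_space_bytes(32) // MB, "MB addressable with 32 bits")
low, high = available_range_mb(swap_mb=8, ram_mb=4, shared_mb=1)
print(f"usable space somewhere between {low}MB and {high}MB")
```

Where in that range the system actually sits depends on how much of
each process is in RAM, in swap, or still backed by its program file at
that instant.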

System V rel. 2.1 and later and 4.1BSD and later adopted different
philosophies regarding the implementation of demand-paged virtual
memory.  The BSD systems, being research-oriented and likely to be used
by programmers with a more intimate knowledge of the inner workings of
the O/S, chose simple, less elegant approaches to memory management with
provisions, such as the vfork() call, for the programmer to tune the
system to the application.  In 4.xBSD, swap space is allocated at the
birth of a process, with enough space being allocated to contain the
entire virtual image of the process in its initial state
(text+data+BSS), excepting of course malloc'd (heap) space obtained
via sbrk() or stack space pushdown expansion, and the user structure and
page tables which are RAM-resident for an active (non-swapped) process.
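
A toy model of the 4.xBSD policy just described, in Python.  This is a
sketch of the idea (reserve swap for text+data+BSS when the process is
born), not the kernel's actual code; the class and method names are
invented for illustration, and heap and stack growth are left out.

```python
class SwapExhausted(Exception):
    """Raised when a new image cannot be backed by swap."""


class EagerSwap:
    """Reserve swap for the whole initial image at process creation."""

    def __init__(self, total_kb):
        self.free_kb = total_kb
        self.reserved = {}  # pid -> KB reserved for that process

    def exec_process(self, pid, text_kb, data_kb, bss_kb):
        need = text_kb + data_kb + bss_kb
        if need > self.free_kb:
            # The newcomer is refused before it ever runs.
            raise SwapExhausted(f"pid {pid}: need {need}KB, {self.free_kb}KB free")
        self.free_kb -= need
        self.reserved[pid] = need

    def exit_process(self, pid):
        # The reservation is returned when the process goes away.
        self.free_kb += self.reserved.pop(pid)


swap = EagerSwap(total_kb=4096)
swap.exec_process(1, text_kb=512, data_kb=512, bss_kb=512)
swap.exec_process(2, text_kb=512, data_kb=512, bss_kb=512)
try:
    swap.exec_process(3, text_kb=512, data_kb=512, bss_kb=512)
except SwapExhausted as err:
    print("refused:", err)
```

The cost of this approach is that swap is tied up even for pages that
never leave RAM; the benefit is that a process which starts is not
going to be caught short of swap for its initial image later on.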

System V took the approach of being as elegant as possible (e.g., using
copy-on-write and dynamic allocation of all tables except the actual
physical memory map) and hiding the inner workings of the system from
programmer intervention.  In the System V scheme, swap space is
allocated by the page stealer only when it needs to swap out pages or
processes.  Thus, no swap space is allocated until it is actually
used.  Since process sharable text and data may be
demand-paged in from the filesystem, virtual pages may be mapped to one
of three places: physical RAM, swap space, or the executable program
file.
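
The System V scheme can be modelled in a few lines of Python as well.
Again this is only a sketch under simplifying assumptions: every page
starts out backed by the program file, a clean page that is stolen goes
back to being file-backed, and a swap slot is released as soon as its
page is faulted back in.  None of the names come from real kernel
source.

```python
from enum import Enum


class Backing(Enum):
    RAM = "physical RAM"
    SWAP = "swap space"
    FILE = "executable program file"


class LazySwap:
    """Allocate a swap slot only when a dirty page is actually stolen."""

    def __init__(self, swap_pages):
        self.free_swap = swap_pages
        self.pages = {}  # (pid, virtual page number) -> Backing

    def map_from_file(self, pid, vpn):
        # Mapping a page costs no swap at all.
        self.pages[(pid, vpn)] = Backing.FILE

    def fault_in(self, pid, vpn):
        if self.pages[(pid, vpn)] is Backing.SWAP:
            self.free_swap += 1  # simplification: free the slot on page-in
        self.pages[(pid, vpn)] = Backing.RAM

    def steal(self, pid, vpn, dirty):
        """Page stealer reclaims a RAM page; only dirty pages need swap."""
        if self.pages[(pid, vpn)] is not Backing.RAM:
            return
        if not dirty:
            self.pages[(pid, vpn)] = Backing.FILE  # can be re-read later
            return
        if self.free_swap == 0:
            raise RuntimeError("out of swap space")
        self.free_swap -= 1
        self.pages[(pid, vpn)] = Backing.SWAP


vm = LazySwap(swap_pages=1)
vm.map_from_file(1, 0)
vm.map_from_file(1, 1)
vm.fault_in(1, 0)
vm.fault_in(1, 1)
vm.steal(1, 0, dirty=False)
vm.steal(1, 1, dirty=True)
for key, where in sorted(vm.pages.items()):
    print(key, "->", where.value)
```

Note that swap is touched only in the one case where a modified page
has to leave RAM, which is why the failure, when it comes, comes late.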

Unfortunately, the UNIX-PC kernel was developed at approximately the
same time as, and relatively independently of, the SVr2.1 kernel.  This
means that the UNIX-PC kernel is a somewhat bastard hybrid of SVr2.0 and
4.1BSD.  In this case, based on examination of the data structures in
the kernel .o files, we have inherited the 'philosophy' of System V (no
vfork() call, etc.) with the memory management algorithm of 4.1BSD
(core-map based memory allocation).

[...]
>
>I think the gripe we have about the way it got put together is
>that limit of 2.3-2.5Mb on per process virtual memory that can't
>be changed.  I've got a 68020 system that also has 4Mb of real
>memory, but with 8Mb of swap it can allocate 11Mb of memory to one
>process (the difference is the kernel etc.).  If I had some horrible
>need to do it it could be set up for 200Mb of swap and allocate
>that much to one process.  Don't laugh too loud, I saw a post once
>where some guy running some kind of modeling program was doing
>exactly that on a Sun.

The DECsystem 5000 this is being posted from has 216MB of swap
configured with 64MB physical RAM.  The Sun 3 I am typing on has 38MB
interleaved over 2 drives with 24MB physical RAM.  With the massive
sizes of some executables (try running a SPICE simulation of a small
microprocessor!) it will not be unreasonable to see systems with the
full 4GB 32-bit address space supported by swap in the near future.

Anyway, the problem with the BSD-derived core-map structure is that it
restricts not only the size of the physical RAM, but also the number
and size of mounted filesystems (including swap space), and the number
of processes and shared text images.  In "true" System V, the
per-process virtual space is fairly easily tunable and is usually
configured based on the size of the pfdata table (physical RAM).  In
our kernel, changing the 2.5MB limit requires a complete rebuild of the
kernel from source to enlarge the coremap structure accordingly.

>>In any case, swapped-out processes MUST reside in the virtual address
>>space of the system, since they are not swapped _in_ as whole processes. 
>>The System V swapping algorithm is not the same as the old Version 7 one
>>where the entire process had to fit in the RAM.  In SysVr2.1 the
>>swapping algorithm is essentially the same as the 4.1 BSD algorithm. 
>>The process must therefore fit in the VIRTUAL address space since it may
>>be paged in rather than swapped in as a whole entity.  Swapping in the
>>sense of V7 UNIX does not occur.
>
>That fits my understanding of what is happening.  That is the
>difference between swapping and paging.

Swapping under SysVr2.1 and later is merely an adaptation of the normal
page-stealer (vhand) daemon in which entire processes are marked
'swapped but ready-to-run' making their entire working set (physical
RAM pages) available for allocation until the memory utilization falls
below the 'high water mark' that triggers swapping.  For each process
so marked, the swapper makes another such process ready to run.  These
processes, however, are NOT 'swapped in' in their entirety but are
allowed to page-fault in just like any normal process.  Both page
stealing and swapping are spawned from process 0.

Under 4.xBSD, page reclamation is initiated by a separate kernel
process, the pagedaemon or process 2.  The swapper is process 0.  The
swapper will become active under several conditions where memory is
critical or a process has slept for over 20 seconds.  Swapped processes
page-fault in as above, but are treated specially by the system.  In
general, swapping under 4.xBSD slows the system abruptly because
the swapper attempts to guess the amount of memory a process being
swapped in will need and will attempt to reserve memory for processes
being swapped in.  In this sense, swapping under 4.xBSD more closely
resembles the V7 swapping scheme.

>>  In fact, a new process will not be
>>allowed to run if it cannot fit in the virtual space available.  It will
>>be killed in the memory allocation stage.  If you examine the kernel .o
>>files, you will see that the only swapping program is vmswap.c.  This
>>program manages/shares the same virtual address space as vmpage.c.
>
>I don't know one way or the other on this.  Does it kill the new
>process or kill an old process?  I know if existing processes
>ask for more memory and swap space is full it will start killing off
>processes.  But I don't know what the algorithm is.

It is dependent on the system (BSD or SV) and the state of the system. 
In any case, the process requesting swap space at the time is usually
the one that will die.  Under SysV, this could be a process in core
that's being paged out in response to one being demand-paged from disk.
The timing is rather critical.  In the BSD case, it will generally be
the new process since swap space for the entire virtual image is
allocated up front.  On the other hand, if the BSD swapper wakes up, it
attempts to allocate additional swap space for the user structure and
page tables which are normally RAM-resident for any active process.  If
this fails, the swapper will attempt to swap out another process.  If
the memory shortage is critical the system will not allow any processes
other than those currently resident or being swapped in and out to run.
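
To make the difference concrete, here is a deliberately crude Python
sketch of which process loses out in each scheme.  It assumes the
simplest possible rule (the requester at the moment swap runs out is
the one that fails) and ignores the BSD swapper's attempts to push
other processes out; both function names are hypothetical.

```python
def eager_victim(free_kb, newcomer):
    """BSD-style: swap for the whole image is claimed up front.

    newcomer is (name, image_kb).  Returns the name that fails, or None.
    """
    name, image_kb = newcomer
    return name if image_kb > free_kb else None


def lazy_victim(free_pages, pageouts):
    """System V-style: swap is claimed one page-out at a time.

    pageouts lists, in order, the owner of each dirty page the page
    stealer tries to write to swap.  The owner of the first page that
    cannot get a slot is the one that fails.
    """
    for owner in pageouts:
        if free_pages == 0:
            return owner
        free_pages -= 1
    return None


print("eager:", eager_victim(free_kb=1024, newcomer=("new", 1536)))
print("lazy: ", lazy_victim(free_pages=2, pageouts=["old", "old", "old", "new"]))
```

In the lazy case the loser is "old", a process that was already in
core and is being paged out to make room for "new", which is the
timing-dependent outcome described above.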

Note that the above is based on published information rather than kernel
source code, so some details may be inconsistent with individual
implementations of the algorithms.

Also, thad at public.BTR.COM (Thaddeus P. Floryan) writes:
#
#In article <1991Jun13.065207.10089 at ucunix.san.uc.edu> adams at ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
#>[...] Thus,
#>the notion of swapping to one device and paging to another is
#>impossible.
#
#Not with DEC's VAX/VMS.  There's both a "swap" and a "page" ``file''.  (Gawd,
#I never thought I'd be defending VMS, the penultimate Vomit Making System :-)
#
#>[...]
#>I am unsure of one other point:  As I understand it, the total (not the
#>per-process limit, which is clearly 2.5MB) virtual address space of the
#>UNIX-PC is 4 megabytes.  If this is in fact the case, increasing the
#>available swap partition beyond its default maximum of around 4.5 MB
#>(to allow for alternate blocks, filesystem overhead, etc.) cannot
#>result in any benefit. [...]
#
#Not true.  As I discovered earlier this year while having gcc compile the
#"ephem" program in the background and doing some other "online" emacs and
#gcc work, increasing the swap partition on my HD from the default multi-user
#5MB to 12MB made a BIG difference (those processes simply would NOT run
#before ("Out of swap space"); now they do.)

Another confirmation that swap space on the UNIX-PC _IS_ expandable.

In VMS, paging is not 'system-wide'; instead there are multiple
partitions of memory, each of which has an independent paging daemon.  I
am not very familiar with VMS but would hazard a guess that this is the
reason for the separate files.  This approach allows the sysadmin to
ensure that certain classes of programs are always allocated at least a
certain amount of system memory.

In any event, it is also possible to envision a system in which there is
a virtual space of some size (say, 4GB -- 2^32 -- 32-bit addressing) that
is managed by a swapping algorithm which has a much larger, separate
swap space from which entire processes are 'swapped' into and out of the
virtual space in the manner of V7 UNIX.  Those processes resident in the
virtual space would themselves be paged in and out of main memory.  In
some respects this is a similar concept to background utilization systems
that run processes on idle CPUs in a network.

-- 
       Jim Adams              Department of Physiology and Biophysics
adams at ucunix.san.uc.edu     University of Cincinnati College of Medicine      
      "I like the symbol visually, plus it will confuse people." 
                                       ... Jim Morrison