Raw vs. block device
v.wales%ucla-locus at sri-unix.UUCP
Sat Jan 7 05:11:37 AEST 1984
From: Rich Wales <v.wales at ucla-locus>
Jonathan --
Here is an attempt on my part to describe "block" and "raw" I/O in as
much detail as reasonably possible. If I have inadvertently made some
misstatement, or left out some important feature, I trust one of the
other "veterans" on this list will correct me.
UNIX has two kinds of device interfaces: "block", and "character" (also
called "raw"). I'll discuss here the "raw" interface first, since it is
the "lower-level" of the two, and since virtually all devices with block
interfaces will have a raw interface as well.
RAW (CHARACTER) DEVICE INTERFACE
Generally speaking, the "raw" interface to a device gives you direct
control over that device. If you do a "read" system call on a disk
via the "raw" interface, for example, you will generally invoke a
single input operation on that disk to read your data. (There may
be exceptions here; for example, I once wrote a "raw" device driver
for an RX02 floppy disk, and since this device can read or write
only one sector at a time, I implemented long "read" or "write"
requests via multiple I/O commands to the drive.)
Raw I/O is "synchronous": I/O operations are always done in the
order requested. There can never be more than one raw I/O request
pending per device. In 4.1BSD, this restriction is generally
implemented by having the driver declare a single "buf" structure per
device for all raw I/O on that device. All raw I/O for the device
goes through a routine called "physio" (in dev/bio.c); "physio" in
turn checks and manipulates a "busy" status bit in the "buf"
structure, using the kernel's "sleep"/"wakeup" facility to force
requests on a busy "buf" structure to wait.
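As a rough illustration of the "busy" bit idea, here is a user-space model in C. It is only a sketch, not the 4.1BSD code: the structure, the flag names RB_BUSY and RB_WANTED, and both functions are made up for this example, and where the kernel would call "sleep" or "wakeup" the model merely returns a status.

```c
#include <assert.h>
#include <stddef.h>

#define RB_BUSY   0x1   /* a raw transfer is using the buffer */
#define RB_WANTED 0x2   /* some other request is waiting for it */

struct rawbuf {         /* hypothetical stand-in for the per-device "buf" */
    int flags;
};

/*
 * Try to claim the single raw buffer for a device.
 * Returns 1 if the caller now owns it, or 0 at the point where
 * the kernel would put the caller to sleep.
 */
int rawbuf_claim(struct rawbuf *bp)
{
    assert(bp != NULL);
    if (bp->flags & RB_BUSY) {
        bp->flags |= RB_WANTED;     /* kernel: sleep on bp here */
        return 0;
    }
    bp->flags |= RB_BUSY;
    return 1;
}

/*
 * Give the buffer back. Returns 1 if somebody was waiting,
 * i.e. where the kernel would issue a wakeup on bp.
 */
int rawbuf_release(struct rawbuf *bp)
{
    int wake;

    assert(bp != NULL);
    wake = (bp->flags & RB_WANTED) != 0;
    bp->flags &= ~(RB_BUSY | RB_WANTED);
    return wake;
}
```

Since there is only one such buffer per device, a second claim cannot succeed until the first transfer has released it, which is what keeps raw requests strictly one at a time.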
Raw I/O is generally subject to any requirements imposed by the
hardware itself. For example, if a given disk demands (as most do)
that all I/O operations start on a sector boundary and comprise an
integral number of full sectors, then you must observe this
restriction when doing raw I/O on that disk.
If you do try to read or write arbitrary amounts of data at arbitrary
offsets on a disk via a raw interface, you are likely to get
unpredictable results. (In particular, a misaligned "write" is liable
to trash innocent data.) If the driver is well written and checks for
this situation, you may get an explicit error, but you shouldn't in
general depend on this. This, by the way, is why you can't use "adb"
on a raw device: "adb" reads and writes at arbitrary byte offsets.
In the case of the RX02 driver I mentioned earlier, I chose to
implement multi-sector "read"s and "write"s as a convenience to the
user. I could have forbidden them (because the RX02 hardware doesn't
support them) and still have been perfectly within the philosophy of
raw I/O interfaces. My driver still required all transfers to start
on sector boundaries and comprise an integral number of full sectors,
though -- and I explicitly tested for violations of this constraint
before doing the I/O.
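The kind of test I mean can be sketched in a few lines of C. This is not the code from my driver; the function name is invented for the example, and the sector size is passed in as a parameter rather than taken from the hardware.

```c
#include <assert.h>

/*
 * Returns 1 if a raw transfer of "count" bytes starting at byte
 * "offset" begins on a sector boundary and covers a whole number
 * of sectors; returns 0 otherwise.
 */
int raw_xfer_ok(long offset, long count, long sector_size)
{
    assert(sector_size > 0);
    if (offset < 0 || count <= 0)
        return 0;
    return offset % sector_size == 0 && count % sector_size == 0;
}
```

A driver that makes this check can return an explicit error for a bad request instead of passing it on to the hardware.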
Raw I/O on terminal lines is somewhat complicated by the use of the
"clist" mechanism (see sys/prim.c). Hence, terminal I/O may be to
some extent asynchronous, even though a "raw" interface is in use.
BLOCK DEVICE INTERFACE
The block interface (if one exists) to a device goes through a
complicated buffering/caching scheme. A number of buffers (each one
1024 bytes long in 4.1BSD, or 512 bytes long in Version 7) are
allocated by the kernel for block I/O. Each buffer is labelled with the
device (major/minor) and block numbers, so that repeated references
to the same block do not result in actual "read" operations if the
block is already in main memory.
Each buffer has a "dirty" bit, so that the data is not written back
to disk immediately upon the issuance of a "write" system call.
Data is written back when the buffer is needed for another block
(LRU caching strategy); when a "sync" system call is issued by a
process; or when a block device is closed and (if it was mounted)
unmounted.
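To make the labelling, the "dirty" bit and the LRU write-back concrete, here is a toy model in C. It is a sketch under simplifying assumptions, not the kernel's buffer code: every name in it is invented, the cache holds only two buffers, and device reads and writes are represented by counters.

```c
#include <assert.h>
#include <string.h>

#define NBUF  2       /* deliberately tiny so eviction is easy to see */
#define BSIZE 1024    /* the 4.1BSD buffer size mentioned above */

struct cbuf {
    int  valid;       /* buffer currently holds a block */
    int  dirty;       /* modified since it was read in */
    int  dev;         /* device label (major/minor folded into one int) */
    long blkno;       /* block number label */
    long stamp;       /* time of last use, for LRU */
    char data[BSIZE];
};

static struct cbuf cache[NBUF];
static long now;
int dev_reads, dev_writes;   /* counters standing in for real device I/O */

struct cbuf *cache_get(int dev, long blkno)
{
    struct cbuf *victim = &cache[0];
    int i;

    for (i = 0; i < NBUF; i++)      /* hit: no device read needed */
        if (cache[i].valid && cache[i].dev == dev &&
            cache[i].blkno == blkno) {
            cache[i].stamp = ++now;
            return &cache[i];
        }

    for (i = 1; i < NBUF; i++)      /* miss: take an empty or the LRU buffer */
        if (victim->valid &&
            (!cache[i].valid || cache[i].stamp < victim->stamp))
            victim = &cache[i];

    if (victim->valid && victim->dirty)
        dev_writes++;               /* the delayed write happens only now */
    dev_reads++;
    memset(victim->data, 0, BSIZE); /* stand-in for reading the block */
    victim->valid = 1;
    victim->dirty = 0;
    victim->dev = dev;
    victim->blkno = blkno;
    victim->stamp = ++now;
    return victim;
}

void cache_put_byte(int dev, long blkno, int off, char c)
{
    struct cbuf *bp;

    assert(off >= 0 && off < BSIZE);
    bp = cache_get(dev, blkno);
    bp->data[off] = c;
    bp->dirty = 1;                  /* nothing goes to the device yet */
}

void cache_sync(void)               /* write back every dirty buffer */
{
    int i;

    for (i = 0; i < NBUF; i++)
        if (cache[i].valid && cache[i].dirty) {
            dev_writes++;
            cache[i].dirty = 0;
        }
}
```

In this model a second reference to the same block costs no device read, and a modified block reaches the device only when its buffer is reused or when cache_sync is called.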
A "block" driver interface to a device is free to perform I/O
operations in any order it sees fit -- not necessarily the order in which
"read" or "write" system calls were issued. (Hence, while raw I/O
is "synchronous", block I/O is "asynchronous".) Most disk drivers
use a queue of pending I/O requests for each drive, sorted in order
by cylinder so as to allow the disk arm to sweep back and forth
across the surface in "elevator" fashion. In a "raw" interface, on
the other hand, there is no need for a queue of pending requests,
since by definition only one raw I/O request can ever be pending for
any given device.
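The queue ordering can be sketched as a sorted insert in C. This shows only the "keep the queue ordered by cylinder" part; the sweep logic of a real driver is more involved, and the structure and function names here are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct req {            /* hypothetical pending-request record */
    int cyl;            /* target cylinder */
    struct req *next;
};

/*
 * Insert rp into the list at *headp, keeping the list in ascending
 * cylinder order. Requests for the same cylinder stay in arrival order.
 */
void req_insert(struct req **headp, struct req *rp)
{
    assert(headp != NULL && rp != NULL);
    while (*headp != NULL && (*headp)->cyl <= rp->cyl)
        headp = &(*headp)->next;
    rp->next = *headp;
    *headp = rp;
}
```

Requests issued for cylinders 50, 10 and 30, in that order, end up queued as 10, 30, 50, so the arm can service them in a single sweep.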
The buffering scheme allows you to do I/O with arbitrary byte
offsets and byte counts, even if the device itself does not support
such access. For example, if you want to write a single byte in the
middle of a block using the block interface, the kernel will read in
the entire block and then change the single byte in question. An
I/O operation which spans multiple blocks (perhaps starting in the
middle of one block and ending in the middle of another) is handled
in a similar fashion.
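The arithmetic behind this can be sketched in C as follows. The names are invented and the 1024-byte block size is the 4.1BSD figure quoted earlier; I am assuming, as described above, that any piece covering less than a whole block forces the block to be read in before it is modified.

```c
#include <assert.h>

#define BSIZE 1024L     /* block size assumed for this sketch */

struct piece {
    long blkno;         /* which block is touched */
    long off;           /* starting byte within that block */
    long len;           /* bytes of the request falling in that block */
};

/*
 * Split the byte range [offset, offset + count) into per-block pieces.
 * Stores at most "max" pieces in "out" and returns how many were stored.
 */
int split_range(long offset, long count, struct piece *out, int max)
{
    int n = 0;

    assert(offset >= 0 && count >= 0);
    while (count > 0 && n < max) {
        long off = offset % BSIZE;
        long len = BSIZE - off;

        if (len > count)
            len = count;
        out[n].blkno = offset / BSIZE;
        out[n].off = off;
        out[n].len = len;
        n++;
        offset += len;
        count -= len;
    }
    return n;
}

/* A write covering only part of a block needs the old contents first. */
int piece_needs_read(const struct piece *p)
{
    return p->len < BSIZE;
}
```

For example, an 1100-byte write at offset 1000 touches three blocks: the last 24 bytes of block 0, all of block 1, and the first 52 bytes of block 2. Only the two partial pieces need a read first.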
The block I/O mechanism is, needless to say, used by the routines
which implement regular file I/O.
WHICH DEVICES ARE BLOCK? WHICH DEVICES ARE RAW?
In general, every device will have a raw interface. Additionally,
a device on which it would make sense to put a file system (i.e.,
a disk) will generally have a block interface. Most tape drivers
also have a block interface, although I have never had occasion to
access a tape by anything but the raw interface.
If you are doing a "dd" (byte-for-byte copy) of a large area of disk
(say, for example, that you are moving a file system from one part
of the disk to another), you should probably use the raw interface,
since it is far more efficient than the block interface. In
particular, large block sizes in "dd" can generally be handled by the
raw disk interfaces, whereas the block interface will cut a large
transfer down into 1K-byte chunks.
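A back-of-the-envelope count shows the difference. The helper below is only an illustration (the name is made up); it counts the transfers needed to move a given number of bytes at a given chunk size and ignores caching and any other cleverness in the driver.

```c
#include <assert.h>

/* Number of transfers needed to move total_bytes in pieces of "chunk" bytes. */
long xfer_ops(long total_bytes, long chunk)
{
    assert(total_bytes >= 0 && chunk > 0);
    return (total_bytes + chunk - 1) / chunk;
}
```

Moving 10 megabytes takes xfer_ops(10L * 1024 * 1024, 1024) = 10240 transfers in 1K-byte chunks, but only xfer_ops(10L * 1024 * 1024, 32L * 1024) = 320 transfers if the raw interface accepts a 32K-byte "dd" block size.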
Terminals have only a raw interface. Also, such "funny" files as
/dev/null and /dev/kmem are implemented via raw interfaces. (Of
course, you can still do I/O on /dev/kmem from random offsets and
with random byte counts, since memory does not have the alignment
restrictions that a disk does.)
DEVICE SPECIAL FILES AND RAW VS. BLOCK I/O
There are two kinds of device special files in UNIX: raw and block.
The major device number of a device special file is associated with
a set of device-driver routines via one of two tables in dev/conf.c:
"cdevsw" ("c" = "character" = "raw") for raw devices, and "bdevsw"
("b" = "block") for block devices. In particular, note that there
is no necessary relationship whatsoever between "raw" major device
number N and "block" major device number N.
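A much-simplified picture of the two tables, in C, may make this clearer. This is not the real dev/conf.c: the structures have only one routine each, and the driver names and major numbers are invented for the example.

```c
#include <assert.h>
#include <string.h>

/* One entry per block major number (simplified, hypothetical layout). */
struct bsw_entry {
    const char *name;
    int (*strategy)(long blkno);
};

/* One entry per raw (character) major number. */
struct csw_entry {
    const char *name;
    int (*read)(long off, long count);
};

static int disk_strategy(long blkno) { (void)blkno; return 0; }
static int disk_rawread(long off, long count) { (void)off; (void)count; return 0; }
static int cons_read(long off, long count) { (void)off; (void)count; return 0; }

/* The index into each table is the major device number. */
static struct bsw_entry bsw[] = {
    { "disk", disk_strategy },      /* block major 0 */
};
static struct csw_entry csw[] = {
    { "console", cons_read },       /* raw major 0 */
    { "disk", disk_rawread },       /* raw major 1 */
};

#define NBLK ((int)(sizeof bsw / sizeof bsw[0]))
#define NCHR ((int)(sizeof csw / sizeof csw[0]))

const char *block_driver(int major)
{
    return (major >= 0 && major < NBLK) ? bsw[major].name : NULL;
}

const char *raw_driver(int major)
{
    return (major >= 0 && major < NCHR) ? csw[major].name : NULL;
}

/* Same major number, different drivers, depending on the table used. */
void demo(void)
{
    assert(strcmp(block_driver(0), raw_driver(0)) != 0);
}
```

Here block major 0 is the disk while raw major 0 is the console; the disk's raw interface happens to sit at raw major 1.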
I hope this covers your question adequately. If not, let me know and I
will try to supply additional information.
-- Rich <v.wales at UCLA-LOCUS>