DMA Ethernet useless for NFS

John Coolidge coolidge at cassius.cs.uiuc.edu
Tue Mar 27 11:26:20 AEST 1990


liam at cs.qmw.ac.uk (William Roberts) writes:
>To demonstrate this, consider the following version of the
>nhfsstone benchmark: take nhfsstone, give it a mix of
>operations which is 100% fsstat operations (i.e. it never goes
>to disk at all) and compare your servers (based on a Sun 3/60
>client). Nhfsstone tries very hard indeed to get the actual
>server load (in calls per minute) that you ask for, so use a
>Sun 3/60 with 6 worker processes and ask for 1000 calls per
>second.

This is not a fair test of whether DMA is good or not. The reason is
that you're not transferring large amounts of data with this version of
nhfsstone --- you're moving small packets of fsstat information. Of course
you're getting similar numbers --- DMA doesn't win you much when you're
just pushing a few bytes around. The win comes when you start moving large
(1-4K or more) chunks around --- just like NFS does --- and you've got
jobs running besides the ethernet tasks.
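
To put rough shape on that, here is a back-of-the-envelope sketch in
Python. Every number in it is an assumption picked for illustration
(a fixed 500 us of protocol handling per packet, a CPU copy rate of
2 bytes per us); none of them are measurements from a Mac II, a Sun,
or anything else. It only shows how the share of CPU time spent
copying grows with packet size.

```python
# Toy model: what share of per-packet CPU time is the byte copy itself?
# per_packet_us and bytes_per_us are assumed values, not measurements.

def copy_fraction(payload_bytes, per_packet_us=500.0, bytes_per_us=2.0):
    """Fraction of per-packet CPU time spent copying the payload
    when the processor moves the bytes itself (no DMA)."""
    copy_us = payload_bytes / bytes_per_us
    return copy_us / (per_packet_us + copy_us)

if __name__ == "__main__":
    # A small fsstat-sized reply versus NFS-sized data transfers.
    for size in (64, 1024, 4096, 8192):
        print(f"{size:5d} bytes: {copy_fraction(size):.0%} of CPU time is copying")
```

With these made-up numbers the copy is a small slice of the work for
a tiny packet and most of the work for a 4K one, which is the only
part of the cost that DMA could take away.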

What you're showing does demolish a popular misconception about DMA ---
but not the one you think. DMA _does not_ mean that your I/O is _faster_.
What you've established is that a Mac II can generate network traffic
as fast as many other machines, even though it doesn't have DMA ethernet.
That's true, as far as it goes. What it _doesn't_ show --- and what's
important to analysing DMA --- is how much load on the machine is
created by running network traffic. With DMA ethernet, the machine can
go perform real work for someone while waiting on the NFS traffic to be
written into place. With non-DMA, the processor is busy shoving the bytes
around and not doing useful work.
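
The same point as a small Python sketch. Again, all the parameters
(500 us of protocol work per packet, 2 bytes per us copy rate, 20 us
to set up a DMA transfer) are assumptions for the sake of the
example, not figures for any real machine. The network traffic is
identical in both cases; what changes is how much processor is left
over for everyone else.

```python
# Toy model: CPU left for user jobs at a fixed network load,
# with and without DMA. All default parameters are assumed values.

def cpu_left_for_users(packets_per_sec, payload_bytes, dma,
                       per_packet_us=500.0, bytes_per_us=2.0,
                       dma_setup_us=20.0):
    """Fraction of the CPU not consumed by network handling."""
    if dma:
        # The controller moves the bytes; the CPU only sets up the transfer.
        move_us = dma_setup_us
    else:
        # The CPU copies every byte itself.
        move_us = payload_bytes / bytes_per_us
    busy = packets_per_sec * (per_packet_us + move_us) / 1_000_000
    return max(0.0, 1.0 - busy)

if __name__ == "__main__":
    rate, size = 200, 4096  # same traffic in both cases
    print("non-DMA:", cpu_left_for_users(rate, size, dma=False))
    print("DMA:    ", cpu_left_for_users(rate, size, dma=True))
```

A benchmark that only counts calls per second sees the same number
on both lines. The difference only shows up if you also measure what
else the machine managed to get done.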

>These numbers say what they ought to say, namely that raw
>processor speed is important for the business of handling,
>decoding and replying to network packets.

Of course, this is true. _However_, it doesn't say anything about whether
DMA is useful in taking load off the processor, because you're not moving
around any major amount of data. What it mainly shows is that the network
packet handling code on A/UX is pretty good, that the Sequent has bad
software (and maybe slow processors?), and that your limiting factor
is the load on your ethernet and the speed at which your hardware
can shove bytes onto the wire. Ethernet is fast, folks, but it's not
_that_ fast --- even with non-DMA ethernet, there's time to do a little
work while the network is busy. But with non-DMA ethernet, the processor
has to do lots more work and the waits get eaten in context switching
time. DMA amounts to "fire-and-forget" --- get someone else to do the
work while the processor goes on to service the user. It's the same reason
that DMA disks are "faster" than the same disks without DMA --- they're
not faster at all (in fact, informal benchmarks suggest that they're the
tiniest bit slower), but the overall system gets more work done.

>So there you have it folks - Mac IIcx Ethernet performance
>under A/UX 1.1.1 is pretty close to that of a Sparcstation.

This isn't saying too much in any case --- SparcStations have pretty
poor I/O.

>It will surprise no-one to know that without DMA SCSI hardware and
>using those ghastly Apple HD 80SC disks and a 1k block System V
>filestore, the performance of a IIcx on the default nhfsstone
>mix of operations is only about 1/5th of that of a Sparcstation.

This only applies if you're using the Mac as a server. We're using them
as clients, and even with an unloaded machine with lots of memory free
(i.e. no paging) the Mac comes out far behind workstations with DMA
ethernet. As perhaps the clincher, the experience of people _with_ DMA
ethernet is that it makes a _huge_ difference in performance.

NFS _is_ disk bound --- but it's disk bound at the server side. At
both the server and the client side, NFS _can_ also be ethernet bound ---
and on the client side it's almost always ethernet bound, unless the
system is paging heavily. If you've got slow ethernet performance, you'll
suffer. If the processor is taking up lots of time shoving bytes off the
network around, other jobs suffer.

>PS. For what it's worth, the X11R3 that our Sun salesman found
>as a part of his bid for 90 Sun 3/80s against 90 Mac IIcx was
>slower on a number of the more important elements of the xperf
>tests - it also seemed faster while you were sitting in front
>of it.

This is quite true. Mac IIcx's are faster than Sun 3/80's, as long as
you're running CPU-bound jobs. There are also, of course, the differences
in code tuning between the Sun X server and the Mac server --- the MIT
people tweaked different things on different platforms.

In any case, trash R3 and get R4 --- it's faster on _both_ :-).

--John

--------------------------------------------------------------------------
John L. Coolidge     Internet:coolidge at cs.uiuc.edu   UUCP:uiucdcs!coolidge
Of course I don't speak for the U of I (or anyone else except myself)
Copyright 1990 John L. Coolidge. Copying allowed if (and only if) attributed.
You may redistribute this article if and only if your recipients may as well.


