problems with 2 drives under 386/ix 2.0.1, and a TCP/IP problem

Wed Nov 15 04:59:13 AEST 1989

Hi folks,

I've got a couple of interesting 386/ix problems that I've not been
able to solve, so I thought I'd throw them out here to see if
anyone else has seen them, and if so, how they were solved.

1. I've got ISC 386/ix 2.0.1 running on the following hardware:

    25 MHz 386 clone with 387
    8 MB DRAM
    Adaptec ACB 2312 ST-506 MFM floppy/hard disk controller
    Miniscribe 6085
    Miniscribe 3053
    single 1.2 MB, 5-1/4" floppy disk drive
    BTC VGA graphics controller
    Logitech serial mouse on /dev/tty00
    modem on /dev/tty01

The /dev/dsk/0s? partitions reside on the 6085, while the
/dev/dsk/1s? partitions are on the 3053.  X-Windows and TCP/IP
are also installed, but I believe they're irrelevant to this
discussion.  The disk partitioning and filesystems are set up
approximately as follows:

    /dev/dsk/0s1     /       ~30 MB
    /dev/dsk/0s3     /usr    ~30 MB
    /dev/dsk/1s1     /tmp    ~5 MB
    /dev/dsk/1s3     /usr2   ~35 MB

/dev/swap also lives on the 6085.  The second disk was installed
using sysadm, and its partitions are automatically mounted at
boot time.

Now the problem.  If I execute a normal shutdown with
/etc/shutdown, the next time I bring the system up, the first
execution of /etc/dfspace causes the system to hang completely,
not even echoing characters.  If I don't execute dfspace, the
system will apparently run indefinitely, and seemingly normally;
however, dfspace is not the only command which can cause the
hang.  `Pack' has also caused it on at least one occasion, and I
expect that other programs could also do the same thing, although
I haven't found any.  After a crash, once the system's back up
(after fsck'ing all of its filesystems) dfspace, pack, and
everything else run perfectly.

I have tried a number of things to get around this.  Manually
sync'ing and umounting the file systems before shutting down
doesn't improve the situation, nor does executing fsck on each
power-up.  I expect that what's happening is that some vital
information about the filesystems is not being updated during the
umount prior to shutdown.  This problem has not been observed
with just a single disk attached to the controller.  Anyone have
any ideas?  Right now, the safest way to shut the system down is
for me to sync the disk buffers and then power down; fsck puts
everything back in order after the boot, and the system runs
reliably.  I'm not too happy about doing that, though.
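
For anyone trying to reproduce this, the manual sync-and-umount
sequence mentioned above amounts to roughly the following.  This is
a sketch rather than a transcript: the device names come from the
partition list earlier in this post, but the order of the umounts
and the dry-run wrapper (the RUN variable and the quiesce_disks
function) are assumptions made for illustration.

```sh
# Sketch of the manual pre-shutdown sequence (illustrative only).
# RUN defaults to "echo", so by default this only prints the commands.
# Set RUN to the empty string, as root, to actually execute them.
RUN=${RUN-echo}

quiesce_disks() {
    $RUN sync
    # Assumed order: second-drive filesystems first, then /usr.
    # The root filesystem stays mounted.
    for dev in /dev/dsk/1s3 /dev/dsk/1s1 /dev/dsk/0s3; do
        $RUN umount "$dev"
    done
    $RUN sync
}

quiesce_disks
```

Run as-is, it just lists the five commands; /etc/shutdown would
follow once they have really been executed.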

2. The second problem involves TCP/IP.  I've got a second system
running 386/ix 1.0.6.  This system is a 20 MHz 386 with 387 and 10
MB DRAM.  Both systems run host-based TCP/IP, v1.1.2 on the 2.0.1
system and v1.0.3 on the 1.0.6 system.  Both systems are using
Western Digital WD8003E ethernet cards.  To perform backups, we
have a Wangtek tape controller driving an Archive FT-60E tape
drive attached to the 1.0.6 system.  To back up the 2.0.1 system,
I use the following command (executed from the 2.0.1 system):

   find . [...] | cpio -oc | \
     rsh node_name 'compress | dd ibs=1024k obs=1024k of=/dev/tape'

This frequently causes hangs on one or the other of the systems,
in which all system activity ceases (character echoing included).
I've played around with the number of dblocks, which changes how
early the hang occurs, but not ultimately whether it occurs at
some time.  Sometimes I can back up two or three partitions with
no problems, but if I keep doing it long enough, I eventually get
a hang.

This is the current dblock configuration on the 2.0.1 system:

                  alloc   inuse     total     max    fail
dblock class:
    0 (   4)        128       0    345380       3       0
    1 (  16)        128      30     41906      32       0
    2 (  64)        128      17    214582     115      26
    3 ( 128)        128     108     25676     115    9001
    4 ( 256)        128       0     11969       8       0
    5 ( 512)        256       0      3820       3       0
    6 (1024)         32       0      2315       4       0
    7 (2048)         16       0      1906       5       0
    8 (4096)          8       0        16       1       0

The configuration on the 1.0.6 system is similar.  I've tried
increasing the number of buffers until I see no failures in any
class (none that I've observed, anyway), but to no avail.  I've
also got 600 disk buffers and 300 clists allocated.
Significantly increasing either the number of clists or the
number of dblocks seems to hasten the onset of the hang,
contrary to my intuitive expectations.
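
If anyone wants to compare their own numbers, a small awk filter
along these lines should pick out the classes with a nonzero fail
count from a table shaped like the one above.  The function name
failing_dblocks and the file name dblocks.txt are made up for the
example, and the filter assumes the same column order (class, size,
alloc, inuse, total, max, fail).

```sh
# Hypothetical helper: report dblock classes whose "fail" column is
# nonzero.  Reads a table in the layout shown above on standard input.
failing_dblocks() {
    awk '
        # Data rows start with a class number followed by "(size)".
        $1 ~ /^[0-9]+$/ && $2 ~ /^\(/ {
            # Replace the parentheses with spaces; changing $0 makes
            # awk re-split the fields, so the size becomes $2 and the
            # fail count becomes $7.
            gsub(/[()]/, " ")
            if ($7 > 0)
                printf "class %d (size %d): %d failures\n", $1, $2, $7
        }
    '
}
```

Usage would be something like "failing_dblocks < dblocks.txt", with
the table saved in dblocks.txt.  Header lines and anything else that
doesn't look like a data row are skipped.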

Sorry I've gone on so long, but I wanted to give accurate
information about these problems.  Any pointers to solutions to
either of these problems will be greatly appreciated.

thanks in advance,
mike borza.
-- 
Michael Borza              Antel Optronics Inc.
(416)335-5507              3325B Mainway, Burlington, Ont., Canada  L7M 1A6
work: mike at antel.UUCP  or  uunet!utai!utgpu!maccs!antel!mike
home: mike at boopsy.UUCP  or  uunet!utai!utgpu!maccs!boopsy!mike