VAX-11/750 bugs

utzoo!decvax!ucbvax!ihnss!houxi!houxm!houxg!lime!vax135!jfr
Wed Mar 10 11:13:37 AEST 1982


re: VAX-11/750 bugs

The VAX-11/750 has a long (and continuing) history of bugs
in memory management vs. the CALLS instruction.
I have an operating system for the VAX-11 that features
demand paging and the equivalent of TENEX PMAP.
The system has been running on the VAX-11/780 for
about 2 years.  Due to microcode/hardware bugs,
I have had an extremely difficult time getting the
system to run on the VAX-11/750.  Even with the latest
release (my SID registers say 02005E03), I need a
heuristic software patch to correct a microcode error.

An exchange of messages with Bill Munson in February 1982
leaves me with the distinct impression that DEC is not
interested in doing anything effective to fix the bugs
until July 1983, if ever.

Here is a synopsis:

August 1980:  My operating system won't run on Greg
    Chesson's beta-test comet, microcode version <50.
    Dennis Ritchie determines that "CALLS $0,..." with
    a write-protected stack reports the faulting address
    as the contents of PC rather than the contents of SP.
    A new set of ROMs fixes the problem on Greg's machine,
    but the general fix is not promised until level 62,
    and many level-less-than-62 machines are shipped from
    the factory to customers.

July 1981:  I get two 11/750s, level 62.  The fault address
    is now correct, but the fault type word doesn't always
    say "write or modify intent"; usually it's zero ("read").
    After a week of intensive debugging, I produce a standalone
    program which gives pure garbage for the fault type
    parameter, and (7/30/81) send the program to Peter Jessel
    and Armando Stettner.  Response is "Yeah, there's a problem,
    we'll look into it."

August 28, 1981:  I send a followup message requesting a
    schedule for the fix.  No response.

Fall 1981:  Jessel leaves DEC; problem languishes.  Same bad
    behavior occurs on other level-62 750s.  Meanwhile
    I find a heuristic which detects and patches around
    the error.  The heuristic has not failed yet, but we
    have a very light load on the 750s.

January 1982:  My 11/750s are upgraded to level 94.  The
    standalone program still bombs in the same way, except
    that the system ID register says 02005E03.
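
The microcode levels quoted in the synopsis can be read directly out of
the SID values. The following is a minimal sketch in Python, assuming
the 11/750 SID layout of CPU type in bits 31:24, microcode revision in
bits 15:8 and hardware revision in bits 7:0. The function name is
illustrative and not part of any DEC tool.

```python
def decode_sid_750(sid):
    """Split an 11/750 SID value into fields (layout assumed, see text)."""
    return {
        "cpu_type": (sid >> 24) & 0xFF,   # 2 is taken to mean an 11/750
        "microcode": (sid >> 8) & 0xFF,   # microcode revision level
        "hardware": sid & 0xFF,           # hardware revision level
    }

# The two SID values that appear in this posting.
for sid in (0x02003EFF, 0x02005E03):
    fields = decode_sid_750(sid)
    print("%08X: type %d, microcode level %d"
          % (sid, fields["cpu_type"], fields["microcode"]))
```

Under that assumption 02003EFF decodes to level 62 (hex 3E) and
02005E03 to level 94 (hex 5E), which agrees with the levels named
in the synopsis.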

Here is a copy of the message I sent on July 30, 1981:
*****************************************************************
re: 11/750 CALLS on write-protected stack

I am having trouble with the memory management on a VAX-11/750.
The fault parameter word for an access control violation does
not always have bits set as described in the VAX Hardware Handbook
(1980-81 p.76 Fig. 4-17).  In particular, a CALLS instruction
in user mode with zero parameters and with the stack valid but
write-protected, sometimes results in a parameter word of 0
instead of 4.
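
For readers without the handbook at hand, here is a hedged sketch in
Python of how the fault parameter word is usually decoded. Only the
meaning of the value 4 (write or modify intent) and 0 (read) is taken
from this posting; the assignments of bit 0 (length violation) and
bit 1 (page table reference) are assumptions from the VAX architecture
description, and the names are illustrative.

```python
# Flag bits of the access control violation parameter word.
LENGTH_VIOLATION = 0x1   # bit 0 (assumed)
PTE_REFERENCE = 0x2      # bit 1 (assumed)
WRITE_OR_MODIFY = 0x4    # bit 2, the bit that should be set here

def decode_fault_word(word):
    """Return (names of flags set, bits set outside the three flags)."""
    flags = []
    if word & LENGTH_VIOLATION:
        flags.append("length violation")
    if word & PTE_REFERENCE:
        flags.append("PTE reference")
    if word & WRITE_OR_MODIFY:
        flags.append("write or modify intent")
    undefined = word & 0xFFFFFFF8
    return flags, undefined

# Expected value, value seen under UNIX, value seen in the standalone run.
for word in (0x00000004, 0x00000000, 0x00800010):
    flags, undefined = decode_fault_word(word)
    print("%08X" % word, flags or ["read"],
          "undefined bits: %08X" % undefined)
```

With these assumptions the expected word 00000004 decodes as write or
modify intent, the observed 00000000 as a plain read, and the word
00800010 from the standalone run has none of the three flags set and
two bits set outside them.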

I include a console transcript with appropriate registers and
memory locations examined.  I also include a standalone program
which can be deposited and run, producing a different and
even more horrible fault parameter word.

John F. Reiser
Bell Laboratories 4F-635
Holmdel, NJ 07733
(201) 949-3942
vax135!jfr
===================================================================
	Console transcript which gets fault parameter word of 0
	for CALLS on readonly stack
-------------------------------------------------------------------
>>>B/1
%%
*unix.uerr2
real mem = 1048576
free mem = 896000
# cat /etc/rc
date >>/dev/console
rm -f /etc/mtab
/etc/mount /dev/rp0h /usr
/usr/lib/ex3.6preserve -a
cd /tmp
rm -f *
cd /
rm -f /usr/spool/uucp/STST.* /usr/spool/uucp/LCK.*
rm -f /usr/spool/lpd/lock
/etc/update&
/etc/cron&
/etc/dzkload >>/dev/console
#			;; <ctrl-D> typed to enter multiuser mode
80000CCE  06		;; the fault in question
>>>E P
      00C00004		;; kernel mode, kernel stack
>>>E/G E
      G      0000000E      7FFFFFF0
>>>E/V 7FFFFFF0		;; the fault parameter words
      P      0002FDF0      00000000		;; should be 00000004
>>>E
      P      0002FDF4      7FFFF510		;; faulting address
>>>E
      P      0002FDF8      00003453		;; pc
>>>E
      P      0002FDFC      03C00004		;; psl
>>>E/I 3E		;; SID register
      I      0000003E      02003EFF		;; 11/750, level 62 microcode
>>>E/I 11		;; SCBB
      I      00000011      00000200
>>>E/P 220		;; access control violation vector
      P      00000220      80000CC8
>>>E/V 80000CC8		;; the fault handler code itself
      P      00000CC8      126E00D1		;; CMPL $0,(SP)
>>>E                                    ;; BNEQ 1$
      P      00000CCC      1AE10001		;; HALT
>>>E                                    ;;1$:
      P      00000CD0      00010CAE
>>>E
      P      00000CD4      AED03FBB
>>>E/I 8		;; current mapping registers
      I      00000008      8001FE00		;; P0BR
>>>E
      I      00000009      00000025		;; P0LR
>>>E
      I      0000000A      7F820000		;; P1BR
>>>E
      I      0000000B      001FFFF7		;; P1LR
>>>E/V 8001FFE0		;; page table for end of P1
      P      0002C5E0      20000000		;; 7ffff000
>>>E
      P      0002C5E4      20000000
>>>E
      P      0002C5E8      FD00015F		;; 7ffff400
>>>E
      P      0002C5EC      FD00017D
>>>E
      P      0002C5F0      E4000181		;; 7ffff800
>>>E
      P      0002C5F4      E4000180
>>>E
      P      0002C5F8      E000017F
>>>E
      P      0002C5FC      E400017E
>>>E/V 7FFFF510		;; the faulting address
      P      0002BF10      20000000
>>>E
      P      0002BF14      00000000
>>>E
      P      0002BF18      20000000
>>>E
      P      0002BF1C      7FFFF584
>>>E
      P      0002BF20      7FFFF55C
>>>E/V 3453		;; code which caused the fault
      P      0002F853      48CF00FB		;; CALLS $0,^W...(pc)
>>>E
      P      0002F857      CF00FBF2
>>>E/I 3		;; USP
      I      00000003      7FFFF52C		;; same page as fault address
>>>
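
The addresses in the transcript above can be cross-checked by hand.
The sketch below, in Python, assumes the standard VAX layout (512-byte
pages, virtual page number in address bits 29:9, page frame number in
PTE bits 20:0) and that a P1 page table entry lives at P1BR + 4*VPN.
The function names are illustrative.

```python
PAGE_SHIFT = 9        # 512-byte pages
VPN_MASK = 0x1FFFFF   # virtual page number: address bits 29:9
PFN_MASK = 0x1FFFFF   # page frame number: PTE bits 20:0

def p1_pte_address(p1br, va):
    """System virtual address of the PTE that maps P1 address va."""
    vpn = (va >> PAGE_SHIFT) & VPN_MASK
    return (p1br + 4 * vpn) & 0xFFFFFFFF

def physical_address(pte, va):
    """Physical address of va given the PTE that maps its page."""
    return ((pte & PFN_MASK) << PAGE_SHIFT) | (va & 0x1FF)

P1BR = 0x7F820000         # from E/I A in the transcript
FAULT_VA = 0x7FFFF510     # faulting address
USP = 0x7FFFF52C          # user stack pointer
PTE = 0xFD00015F          # entry shown for page 7ffff400

print("%08X" % p1_pte_address(P1BR, FAULT_VA))   # 8001FFE8
print("%08X" % physical_address(PTE, FAULT_VA))  # 0002BF10
print((FAULT_VA >> PAGE_SHIFT) == (USP >> PAGE_SHIFT))  # True
```

The computed PTE address 8001FFE8 is the third entry of the table
examined at 8001FFE0, the computed physical address 0002BF10 matches
the console's response to E/V 7FFFF510, and the fault address and USP
fall in the same page.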

=====================================================================
	Standalone program for producing bad fault parameter word
---------------------------------------------------------------------
#
#	page	contents
#	0	this program
#	1	SCB
#	2	SCB UNIBUS extension
#	3	HALTs
#
	.set PCBB,0x10
	.set SCBB,0x11
	.set SBR,0x0c
	.set SLR,0x0d
	.set MAPEN,0x38
	.set TBIA,0x39

		# p.3 is HALTs
	movc5 $0,(r0),$0,$0x200,*$0x600
		# SCB on p.1
	movab *$0x200,r0
	mtpr r0,$SCBB
		# vectors 000 through 0fc halt at same offset on p.3
	movl $0x100/4,r2
L100:
	movab 0x80000400(r0),(r0)+
	sobgtr r2,L100
		# vectors 100 through 3fc rei
	movl $(0x400-0x100)/4,r2
L200:
	movl $0x80000000+_rei,(r0)+
	sobgtr r2,L200

	nop
	jmp *$0x80000000+ready
ready:
	movl $0x80000000+istack,sp
	mtpr $pcb,$PCBB
	mtpr $sbr,$SBR
	mtpr $4,$SLR
	mtpr $1,$TBIA
	mtpr $1,$MAPEN
	ldpctx
	rei

foo:
	.word 0
	calls $0,foo
	halt

	.align 2
_rei:
	rei

	.align 2
sbr:
	.long 0x90000000	# V KW page0
	.long 0x90000001	# V KW page1
	.long 0x90000002	# V KW page2
	.long 0x90000003	# V KW page3
p0br:
	.long 0xf8000000	# V UR page0
pcb:
	.long 0x80000000+kstack,-1,-1,ustack
	.long 0,0,0,0,0,0,0,0,0,0,0,0,0,0	# r0 through r13(fp)
	.long foo+2,0x03c00000	# pc, psl
	.long 0x80000000+p0br,  0x04000001		# P0
	.long 0x7f800000+p0br+4,0x001fffff	# P1 on top of P0

	.long 0,0,0,0
istack:

	.long 0,0,0,0
kstack:

	.long 0,0,0,0,0,0,0
ustack:
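
The "V KW" and "V UR" comments on the page table entries above can be
checked mechanically. This is a minimal sketch in Python, assuming the
usual VAX PTE layout (valid bit 31, protection code in bits 30:27,
modify bit 26, page frame number in bits 20:0) and the standard
protection code mnemonics. The table and function name are
illustrative.

```python
# Protection codes 0 through 15 (assumed standard VAX mnemonics).
PROTECTION = ["NA", "reserved", "KW", "KR", "UW", "EW", "ERKW", "ER",
              "SW", "SREW", "SRKW", "SR", "URSW", "UREW", "URKW", "UR"]

def decode_pte(pte):
    """Split a page table entry into the fields used in this posting."""
    return {
        "valid": bool((pte >> 31) & 1),
        "protection": PROTECTION[(pte >> 27) & 0xF],
        "modified": bool((pte >> 26) & 1),
        "pfn": pte & 0x1FFFFF,
    }

# The five entries assembled at sbr and p0br.
for pte in (0x90000000, 0x90000001, 0x90000002, 0x90000003, 0xF8000000):
    f = decode_pte(pte)
    print("%08X  %s %-4s page%d"
          % (pte, "V" if f["valid"] else "-", f["protection"], f["pfn"]))
```

Under these assumptions the four system entries decode as valid,
kernel-write, pages 0 through 3, and the P0 entry as valid, user-read,
page 0, which is what the comments in the program say.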
-----------------------------------------------------------------------
	Execution of above program

>>>I
>>>D/P/L 0 60002C
>>>D + 9F02008F
>>>D + 600
>>>D + 2009F9E
>>>D + DA500000
>>>D + 8FD01150
>>>D + 40
>>>D + E09E52
>>>D + 80800004
>>>D + D0F652F5
>>>D + C08F
>>>D + 8FD05200
>>>D + 8000006C
>>>D + F652F580
>>>D + 3F9F1701
>>>D + D0800000
>>>D + F48F
>>>D + 8FDA5E80
>>>D + 84
>>>D + 708FDA10
>>>D + C000000
>>>D + DA0D04DA
>>>D + 1DA3901
>>>D + 20638
>>>D + EF00FB00
>>>D + FFFFFFF7
>>>D + 0
>>>D + 2
>>>D + 90000000
>>>D + 90000001
>>>D + 90000002
>>>D + 90000003
>>>D + F8000000
>>>D + 80000104
>>>D + FFFFFFFF
>>>D + FFFFFFFF
>>>D + 120
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 61
>>>D + 3C00000
>>>D + 80000080
>>>D + 4000001
>>>D + 7F800084
>>>D + 1FFFFF
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>
>>>E P
      041F0000
>>>S 0

80000621  06
>>>E P
      00C00000
>>>E/G E
      G      0000000E      800000F4
>>>E/V 800000F4
      P      000000F4      00800010		;; pure garbage
>>>E
      P      000000F8      00000108
>>>E
      P      000000FC      00000061
>>>E
      P      00000100      03C00000
>>>E/I 3E
      I      0000003E      02003EFF
>>>E/I 11
      I      00000011      00000200
>>>E/P 200
      P      00000200      80000600
>>>E
      P      00000204      80000604
>>>E
      P      00000208      80000608
>>>E
      P      0000020C      8000060C
>>>E
      P      00000210      80000610
>>>E
      P      00000214      80000614
>>>E
      P      00000218      80000618
>>>E
      P      0000021C      8000061C
>>>E
      P      00000220      80000620
>>>E
      P      00000224      80000624
>>>E/V 80000620
      P      00000620      00000000
>>>E/I 8
      I      00000008      80000080
>>>E
      I      00000009      00000001
>>>E
      I      0000000A      7F800084
>>>E
      I      0000000B      001FFFFF
>>>E/V 80000080
      P      00000080      F8000000
>>>E
      P      00000084      80000104
>>>E/V 108
      P      00000108      00000000
>>>
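
The D (deposit) commands in the transcript above load the program one
longword at a time, so each value is four bytes of the program image
in little-endian order. The sketch below, in Python, shows the
conversion in both directions. The byte string used is simply the
first three deposited longwords unpacked, not an independent assembly
of the source, and the function names are illustrative.

```python
import struct

def longwords_to_bytes(words):
    """Lay out 32-bit values in memory order (little-endian)."""
    return b"".join(struct.pack("<I", w) for w in words)

def bytes_to_longwords(data):
    """Group a byte image into 32-bit values, zero-padding the tail."""
    data = data + b"\x00" * (-len(data) % 4)
    return [w for (w,) in struct.iter_unpack("<I", data)]

def deposit_commands(data):
    """Console commands in the style used above to load data at address 0."""
    words = bytes_to_longwords(data)
    lines = [">>>D/P/L 0 %X" % words[0]]
    lines += [">>>D + %X" % w for w in words[1:]]
    return lines

first_three = [0x0060002C, 0x9F02008F, 0x00000600]
image = longwords_to_bytes(first_three)
print(image.hex())          # 2c0060008f00029f00060000
print("\n".join(deposit_commands(image)))
```

The first byte in memory is 2C, which matches the movc5 that opens the
program, and the regenerated commands reproduce the first three D
lines of the transcript.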
-----------------------------------------------------------------------
If the user-mode stack pointer in the assembled PCB above is changed
to 0x80000000 and the program is run, I get a correct fault parameter
word of 00000004.


