Emacs hanging on DEC3100, possibly in "rmail"
Anne Louise Gockel
alg at venture.cs.cornell.edu
Wed Aug 1 00:34:39 AEST 1990
A user in our department has a problem that causes emacs to hang regularly
(but not on demand). The problem may be associated with using "rmail" in
emacs, or possibly with starting a shell in emacs.
If you think you have seen this problem in similar circumstances, please let
me know. I do not know if this problem is unique to the single user or
widespread. If you can shed any light on the problem, please let me know.
Configuration:
DEC3100, Ultrix UWS 2.2, MIT's X11R4 (server, twm, and clients),
emacs 18.55.2 (happened with 18.54 also).
/usr/spool/mail NFS mounted from Sun 4.0 file system
/usr NFS mounted from a Sun 4.0 file system
emacs run from /usr/local, NFS mounted from Sun 4.0 filesystem.
emacs lock files in /tmp, local to DECstation
DECstation is a YP client
emacs compiled with X11 support; it comes up in its own X window.
Symptoms:
One emacs process starts chewing up CPU (70-80%) and cannot be stopped or
interrupted. A parent emacs process is hung in disk wait. I cannot kill
these processes except with "kill -9". I've tried to get a core dump of them,
but cannot get one that's very meaningful.
The output below shows the emacs-related processes, first from "ps -auxww"
and then from "ps -clxa". "emacs-debug" is a version of emacs built with
"-g" and no "-O".
It appears that the parent process is the one hung in disk wait and the child
is spinning away (maybe in a spin lock?). This setup does not make sense to
me. Is it typical of "rmail" in emacs?
USER       PID %CPU %MEM   SZ  RSS TT STAT  TIME COMMAND
rz       18014 79.5  0.7 2172   28 co R    27:49 emacs-debug
rz       18011  0.0  1.4  388   64 p1 I     0:00 /usr/local/lib/emacs/etc/loadst -n 60
rz       18010  0.0  0.0    0    0 co DW    0:00 ???? (emacs-debug)
-----------------------
       F UID   PID  PPID CP PRI NI ADDR  SZ RSS WCHAN STAT TT  TIME COMMAND
1300c000 442 18010     1  8  -1  0    0   0   0 97e8c DW   co  0:00 emacs-debug
12008201 442 18011 18010  1  15  0  7e3 296  28 fc000 I    p1  0:00 loadst
 2009001 442 18014 18010195  73  0  5921600  20       R    co 27:40 emacs-debug
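
For anyone comparing against their own machines, here is a minimal Python
sketch that picks the disk-wait and CPU-bound processes out of a "ps -auxww"
listing. The sample rows are copied from the first listing above. The function
names and the 50% CPU threshold are illustrative assumptions, not part of any
standard tool. Splitting on whitespace only works for listings whose columns
have not run together, so it would not handle the fused fields in the
"ps -clxa" output.

```python
# Rows copied from the "ps -auxww" listing in this report.
PS_AUX = """\
USER PID %CPU %MEM SZ RSS TT STAT TIME COMMAND
rz 18014 79.5 0.7 2172 28 co R 27:49 emacs-debug
rz 18011 0.0 1.4 388 64 p1 I 0:00 /usr/local/lib/emacs/etc/loadst -n 60
rz 18010 0.0 0.0 0 0 co DW 0:00 ???? (emacs-debug)
"""


def parse_ps_aux(text):
    """Turn whitespace-separated ps output into a list of dicts keyed by header."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # COMMAND may contain spaces, so split at most len(header) - 1 times.
        fields = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows


def classify(row, cpu_threshold=50.0):
    """Label a process by its STAT field; cpu_threshold is an arbitrary choice."""
    if row["STAT"].startswith("D"):
        return "disk wait"
    if row["STAT"].startswith("R") and float(row["%CPU"]) >= cpu_threshold:
        return "spinning"
    return "idle/other"


for row in parse_ps_aux(PS_AUX):
    print(row["PID"], row["STAT"], classify(row), row["COMMAND"])
```

On the rows above this labels 18014 as spinning and 18010 as in disk wait,
which matches the symptoms described.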
Looking through the mail logs, it is doubtful, but possible, that the user
received mail at the same time as he issued the "rmail" command.
We tried to figure out what process 18010 was "disk waiting" on. We
assumed that it was an NFS file and tried to track the ethernet packets.
From the packets we looked at, it appeared that the machine was issuing
NFS RFS_READLINK and RFS_GETATTR calls for /usr, /usr/spool, /usr/spool/mail,
and /usr/spool/mail/rz. These seemed to be repeated at regular intervals of a
few seconds.
We have seen NFS caching problems between Suns that are sometimes solved by
unmounting the bad filesystem (even though the umount fails, the cache is
cleared). This trick did not change anything.
If anyone has any insights or has experienced similar problems, please let me
know.
Thanks,
Anne Louise Gockel
Cornell Computer Science
Internet: alg at cs.cornell.edu UUCP: cornell!alg