an rm question

Guy Harris guy at gorodish.Sun.COM
Sat Apr 16 15:31:18 AEST 1988


> >I managed to create a file whose name contained bizarre characters; in
> >fact, they were so bizarre that rm * wouldn't remove them (oh well,
> >porting arc is a nasty business), I got a nonexistent file message.
> >Anyway, since the directory they were in was junk I went up one dir
> >and did an 'rm -r' on the directory, and that worked. So, being the
> >curious sort, I went to the BSD sources and took a look; however, I
> >can't figure out why the rm -r worked when the other didn't.
> 
> Here is the SVR3 explanation (although it should be the same):
> 
> When you use '*', the shell expands it into filenames and then splits them
> up via delimiters (such as space) to send to an exec() system call.  On
> filenames with bizarre chars in them the split is done incorrectly.
> 
> When you use 'rm -r', I think (I haven't looked at the sources lately) that
> the file names are taken directly out of the directory file so the shell
> expansion problems do not occur.

Half right.

The explanation of why it works with "rm -r" is correct as stated above.

The failure of "rm *", however, has nothing to do with the reason stated
above.  The problem is that most versions of the various UNIX shells use the
8th bit of each character internally to mark that character as quoted.  That
bit is then stripped from every argument before it is passed to a program.
So if you have a file named "\305ngstrom" (that's (A-with-a-ring)ngstrom, in
ISO Latin #1) in the current directory, the shell reads the name as
"\305ngstrom" but hands "Engstrom" to the program, since octal 305 with the
8th bit cleared is octal 105, an "E".  "rm" won't find "Engstrom" (unless
you're *really* unlucky) and will complain.
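
The effect of the stripping can be imitated on a present-day system.  What
follows is only a sketch: "strip_high_bit" is a made-up helper name, and it
uses "tr" to clear the 8th bit of each byte, which mimics the result of what
those shells did, not their actual code.

```shell
# Clear the 8th bit of every byte read from standard input.
# (Hypothetical helper; it imitates the effect, not the shell's internals.)
strip_high_bit() {
    LC_ALL=C tr '\200-\377' '\000-\177'
}

# "\305ngstrom" is the byte with octal value 305 followed by "ngstrom".
stripped=$(printf '\305ngstrom' | strip_high_bit)
printf '%s\n' "$stripped"        # prints "Engstrom"
```

Bytes below octal 200 pass through untouched, which is why plain ASCII file
names never showed the problem.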

When you use "*", the shell does expand it into filenames.  However, since it
does this expansion by reading the directory in which the files occur and
matching each filename it finds against the pattern, there's no need for it to
split the filenames up; they're already split up in the directory!  As such,
the shell does not split the filenames up; try, for instance:

	gorodish$ >"foo bar"
	gorodish$ >"bletch mumble"
	gorodish$ rm -i *b*
	rm: remove bletch mumble? y
	rm: remove foo bar? y

It has no problems with file names including spaces.
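
Another way to see the same thing is to count the arguments the pattern
produces.  This is a rough sketch that assumes "mktemp -d" is available for a
scratch directory; "count_glob_args" is just a name picked for the example.

```shell
# Expand *b* in a scratch directory and print how many arguments result.
count_glob_args() {
    (
        dir=$(mktemp -d) || exit 1   # scratch directory (assumes mktemp -d)
        cd "$dir" || exit 1
        : > "foo bar"
        : > "bletch mumble"
        set -- *b*                   # one argument per matching directory entry
        printf '%s\n' "$#"
        cd / && rm -r "$dir"
    )
}

count_glob_args                      # prints 2, not 4
```

If the shell re-split the expanded names at spaces, the count would be 4.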

The "SVR3 explanation" is "this doesn't happen under SVR3".  The S5R3 Bourne
shell doesn't use the 8th bit for quoting, and doesn't strip it off.  This
comes in handy for removing various test files named "Citro\:en" that I
occasionally create (where '\:e' is the ISO Latin #1 "e with a diaeresis"
character, octal 353) or creating the symlink "/UNIX(R)" I have to "/vmunix"
(where '(R)' is the ISO Latin #1 "registered trademark" character, octal 256).
(Unfortunately, we don't have the latest internationalized Korn shell in-house
yet, so I can't use my *normal* shell for doing this.  Grumble, grumble....)
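
For anyone who wants to check whether a given shell leaves the 8th bit alone,
here is a small test, offered as a sketch only.  It assumes "mktemp -d" exists
and that the filesystem accepts a file name containing octal 353;
"remove_8bit_name" is a name invented for the example.

```shell
# Create a file whose name contains octal 353, remove it through a glob,
# and report whether it is gone.
remove_8bit_name() {
    (
        LC_ALL=C; export LC_ALL      # treat file names as plain bytes
        dir=$(mktemp -d) || exit 1   # scratch directory (assumes mktemp -d)
        cd "$dir" || exit 1
        name=$(printf 'Citro\353n')
        : > "$name" || exit 1        # fails if the filesystem rejects the name
        rm Citro*                    # the shell must pass octal 353 through intact
        if [ -e "$name" ]; then echo "still there"; else echo "removed"; fi
        cd / && rm -r "$dir"
    )
}

remove_8bit_name
```

On a filesystem that accepts the name, a shell that passes 8-bit characters
through reports "removed"; a shell that strips the bit would leave the file
behind and "rm" would complain about a nonexistent "Citrokn".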



More information about the Comp.unix.wizards mailing list