Moving C from DOS to UNIX, ANSI mistakes

Mark Seecof marks at lcc.la.Locus.COM
Thu Sep 7 08:15:33 AEST 1989


<---- bug snack

>In article <1287 at calvin.EE.CORNELL.EDU>, richard at calvin.EE.CORNELL.EDU (Richard Brittain) writes:
>> ...I learned C on a pc using Turbo-C, and now I'm trying to write in C
>> on a unix box (BSD and Ultrix) but nothing works!!!!!.  All of my pc source
>> gives multitudinous errors under unix...
>> ...Function prototypes seem to give cc a fit, and...
>> I also get a lot of miscellaneous errors and warnings like "warning: old
>> fashioned initialization" that I cannot make sense of.  I'd be really 
>> grateful if anyone could give any general rules of thumb for converting
>> between the two environments.
>
Then in article <1116 at virtech.UUCP>, Conor P. Cahill writes:
>My *guess* would be that the turbo-c compiler is much more ANSI compliant
>than the older compilers used on your BSD/Ultrix systems.  If you really
>need to be portable across these environments I would develop the software
>on the BSD/Ultrix system and then port it to turbo-c. This gets you writing
>the software at the "least common denominator" level of compiler.  An ANSI
>compiler should not have too much trouble compiling software generated
>under an older (K&R 1st Ed) compiler since that was part of their mandate.
>

Good advice and correct I think, but incomplete.

The truth is that the X3.159 Committee blew off the compatibility
goal in several places.  The worst of these mistakes, even though
the Rationale (I'm looking at an October '88 draft) explains that
"existing code is important", was changing the unsigned
conversion rules away from K&R without good cause (they said:
"this is considered the most serious semantic change made by the
Committee..." and I agree).

You should develop your code using only K&R features for maximum
portability.  But do not rely on the semantics of arithmetic
involving unsigned vars, especially where you are trying to
assign values from one size of int to another.  For example, the
assignment "u_int = (unsigned) short" will likely NOT do what
you intend (on a VAX).  It is not possible to prevent ANSI C from
sign-extending your variable without using a cast that would truncate
your value if you later lengthened the type of the variable.



I should point out that my objection to the new rule is that
it absolutely prevents writing simple portable code which moves
from one similar environment to another (say, one POSIX system
to another) if any of the variables used are externally specified,
like POSIX structure members are.  Consider the POSIX stat()
call which fills in an externally specified (in stat.h) struct
stat which has members st_dev and st_ino which may be of any
integral type (via dev_t and ino_t) (yes, my friends, there
are UN*X systems with 32-bit inode numbers).  If you want to
convert these values to unsigned for some reason, you'll have
to write either painful or non-portable code.  See this example
(drawn from the real world, with minor mods and elisions):


/*
 * We use double hashing...
 * Some trickery is used in converting dev/inode pairs
 * to hash keys.  We expect that most dev and inode values will
 * be <= 32K; so we can come up with a single long key pretty
 * easily.  However, we don't want to blow any other significant
 * bits off completely, so we rotate the dev value by 15 bits and
 * XOR it with the inode value.  In the assumed usual case, this
 * will preserve all the bits from the dev/inode pair, in less
 * usual cases at least some influence will be felt from each bit.
 */

>>>>> In the original code, I rotated the dev value by 16 bits,  <<<<<
>>>>> but for this example I didn't want the problems with       <<<<<
>>>>> unsigned conversion overshadowed by the fact I happened    <<<<<
>>>>> to be working with shifts of half- or double-word lengths. <<<<<


#ifndef __STDC__ /* assume K&R1-style unsigned conversion */

/* struct stat *s; */

/* note that only unsigned is guaranteed 0-filled right shifting
 * and that sign-extension of short st_dev or st_ino before
 * conversion to u_long would be undesirable.  This code works
 * for st_dev and st_ino of any integral type (even char).
 */
#define HKEY(s)	((((unsigned long)(s)->st_dev << 15) \
		| (((unsigned long)(s)->st_dev>>(LONGBITS-15))&0x7fffL)) \
			^ (unsigned long)(s)->st_ino)

#else /* __STDC__ */
	/* The new unsigned conversion rules are stupid because
	 * they inhibit rather than promote the writing of portable
	 * code.  Since you must know the length of a thing before
	 * you can convert it to unsigned without using non-portable
	 * masks, painful and unnecessary computation, or this sort of
	 * optional gobbledygook in the source, it is damned near
	 * impossible to prepare code which is both portable and
	 * efficient.  The code below is not portable, dammit (but
	 * at least it'll work on our current systems).
	 */

#ifdef SHORT_DEV_INO /* if "typedef short dev_t, ino_t;" */
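  /* perhaps could be:
   * #if (sizeof(dev_t)==sizeof(short)) && (sizeof(ino_t)==sizeof(short))
   * (but ANSI C does not evaluate sizeof in #if expressions, so the
   * choice has to come from a hand-set macro like this one)
   */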

/* struct stat *s; */

#define HKEY(s)	((((long)(unsigned short)(s)->st_dev << 15) \
		| (((unsigned long)(unsigned short)(s)->st_dev \
						>> (LONGBITS-15)) & 0x7fffL)) \
			^ (long)(unsigned short)(s)->st_ino)

#else /* dev_t and ino_t are longs */

	/* if we used the SHORT_DEV_INO macro here
	 * it would discard half of our bits!
	 */

#define HKEY(s) ((((unsigned long) (s)->st_dev << 15) \
		| (((unsigned long) (s)->st_dev >> (LONGBITS-15)) & 0x7fffL)) \
			^ (long) (s)->st_ino)

#endif /* long dev_t, ino_t */

	/* Note that we're blowing off the case of
	 *	typedef short dev_t;
	 * 	typedef long ino_t;
	 * or the reverse and all cases involving char types.
	 */

#endif /* __STDC__ */
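
For comparison, the "painful and unnecessary computation" route
mentioned in the comment above might look something like the sketch
below.  ZEXT, ULONG_BITS and hkey_sketch are hypothetical names of
mine, not part of the original code, and the sketch assumes the
operand is an integer object no wider than unsigned long with no
padding bits.

```c
#include <limits.h>   /* CHAR_BIT */

/* Hypothetical helper: zero-extend x to unsigned long by masking off
 * everything above x's own width, whatever that width happens to be.
 * Assumes sizeof(x) <= sizeof(unsigned long) and no padding bits. */
#define ZEXT(x) \
    ((unsigned long)(x) & \
     (~0UL >> ((sizeof(unsigned long) - sizeof(x)) * CHAR_BIT)))

/* Hypothetical stand-in for LONGBITS. */
#define ULONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Rotate dev left by 15 bits, then XOR the result with ino. */
unsigned long hkey_sketch(unsigned long dev, unsigned long ino)
{
    return ((dev << 15) | (dev >> (ULONG_BITS - 15))) ^ ino;
}
```

With something like this, a call such as
hkey_sketch(ZEXT(s->st_dev), ZEXT(s->st_ino)) would not need to know
whether dev_t and ino_t are char, short or long, at the price of an
extra mask on every conversion.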


Ahem.  Where was I?  Oh, yeah.  ANSI C source considered by itself
may be "portable" but in the real world where it interacts with
external stuff like POSIX or MS-DOS or VAX/VMS... it would be nice
if things with the same name, and similar types, could be used in
portable expressions to get the same results.  The K&R unsigned
rule ("unsigned always wins") provides this, the lousy X3.159
rule (sign extend before converting to unsigned) does NOT.

(I would think any discussion about whether K&R1 provides "unsigned
long" belongs in another thread (I think K&R1 allows u_long but I
admit the possibility of argument).  The X3.159 Committee could just
as well have adopted the correct unsigned conversion rules and the
ANSI C language does have unsigned long.)


Mark Seecof, Locus Computing Corp., Los Angeles, (213) 337-5218.
My opinions only, of course...


