Explanation, please!

Hank Dietz hankd at pur-ee.UUCP
Thu Sep 1 07:52:14 AEST 1988


In article <189 at bales.UUCP>, nat at bales.UUCP (Nathaniel Stitt) writes:
> Here is my own personal version of the "Portable Optimized Copy" routine.
.... then he gives a rather verbose, but structured, encoding....

As long as we're getting into structured, portable, hacks, let me suggest
the following two ways of doing block copy:

1.	If the number of items/bytes is known at compile time, then you can
	define a struct type of the appropriate size and use struct assignment,
	with type casts to make it fly.  For example, suppose p and q are
	pointers to ints and I want to copy 601 ints from p to q.  Then I
	can write the fast and surprisingly portable:

	struct t601 { int t[601]; };
	*((struct t601 *) q) = *((struct t601 *) p);

	Of course, you do have to watch out for alignment problems, but
	if your compiler doesn't generate very fast code for this....

2.	If the number of items/bytes is not known, then build a binary tree of
	such structs and copy half, then half of what remains, etc.  This is
	funny looking, but very fast also.  Suppose the number of ints (n) is
	not known at compile time, but can't be more than 601.  You can write:

	struct t512 { int t[512]; };
	struct t256 { int t[256]; };
	struct t128 { int t[128]; };
	struct t64 { int t[64]; };
	struct t32 { int t[32]; };
	struct t16 { int t[16]; };
	struct t8 { int t[8]; };
	struct t4 { int t[4]; };
	struct t2 { int t[2]; };
	if (n & 512) {
		*((struct t512 *) q) = *((struct t512 *) p); q+=512; p+=512;
	}
	if (n & 256) {
		*((struct t256 *) q) = *((struct t256 *) p); q+=256; p+=256;
	}
	if (n & 128) {
		*((struct t128 *) q) = *((struct t128 *) p); q+=128; p+=128;
	}
	if (n & 64) {
		*((struct t64 *) q) = *((struct t64 *) p); q+=64; p+=64;
	}
	if (n & 32) {
		*((struct t32 *) q) = *((struct t32 *) p); q+=32; p+=32;
	}
	if (n & 16) {
		*((struct t16 *) q) = *((struct t16 *) p); q+=16; p+=16;
	}
	if (n & 8) {
		*((struct t8 *) q) = *((struct t8 *) p); q+=8; p+=8;
	}
	if (n & 4) {
		*((struct t4 *) q) = *((struct t4 *) p); q+=4; p+=4;
	}
	if (n & 2) {
		*((struct t2 *) q) = *((struct t2 *) p); q+=2; p+=2;
	}
	if (n & 1) *q = *p;

	Notice that, in this case, n, p, and q should be declared as being
	register variables and that p and q are altered by this routine.  Of
	course, you can copy larger things by making larger power-of-2 sized
	structs.

	Incidentally, this ran about 8x faster (on a VAX 11/780) than using
	the usual copy loop.  Unfortunately, the above code should have been
	written as:

	if (n & 512) {
		*(((struct t512 *) q)++) = *(((struct t512 *) p)++);
	}
	...

	but, for some unknown reason, the VAX C compiler didn't like that.
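
For anyone who wants to try both variants, here is a minimal,
self-contained sketch. The names copy_ints, copy_601 and COPY_BLOCK are
hypothetical and do not come from the post; the sketch assumes the two
regions do not overlap and that the structs carry no padding (checked
with an assert). It advances the pointers in separate statements because
a cast does not yield an lvalue in standard C, so applying ++ to the cast
expression is rejected. Whether this beats memcpy on a modern compiler is
not something the sketch claims.

```c
#include <assert.h>
#include <stddef.h>

/* One struct per power of two; each wraps an int array of that length. */
struct t512 { int t[512]; };
struct t256 { int t[256]; };
struct t128 { int t[128]; };
struct t64  { int t[64]; };
struct t32  { int t[32]; };
struct t16  { int t[16]; };
struct t8   { int t[8]; };
struct t4   { int t[4]; };
struct t2   { int t[2]; };

/* Fixed-size case from item 1: exactly 601 ints. */
struct t601 { int t[601]; };

/* Copy one block of `len` ints by struct assignment, then advance
 * both pointers.  Expects int pointers named p and q in scope. */
#define COPY_BLOCK(len)                                      \
    do {                                                     \
        *(struct t##len *) q = *(const struct t##len *) p;   \
        q += (len);                                          \
        p += (len);                                          \
    } while (0)

/* Hypothetical helper: copy n ints (n <= 1023) from p to q.
 * Each set bit of n selects one power-of-two block. */
void copy_ints(int *q, const int *p, unsigned n)
{
    assert(n <= 1023u);
    /* Assumption: no padding is added after the array member. */
    assert(sizeof(struct t512) == 512 * sizeof(int));

    if (n & 512) COPY_BLOCK(512);
    if (n & 256) COPY_BLOCK(256);
    if (n & 128) COPY_BLOCK(128);
    if (n & 64)  COPY_BLOCK(64);
    if (n & 32)  COPY_BLOCK(32);
    if (n & 16)  COPY_BLOCK(16);
    if (n & 8)   COPY_BLOCK(8);
    if (n & 4)   COPY_BLOCK(4);
    if (n & 2)   COPY_BLOCK(2);
    if (n & 1)   *q = *p;
}

/* Hypothetical helper: copy exactly 601 ints with one struct assignment. */
void copy_601(int *q, const int *p)
{
    *(struct t601 *) q = *(const struct t601 *) p;
}
```

Calling copy_ints(dst, src, 601) takes the 512, 64, 16, 8 and 1
branches, since 601 = 512 + 64 + 16 + 8 + 1. The caller's own pointers
are left unchanged because the function works on copies of them.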


Enjoy.
					hankd at ee.ecn.purdue.edu


