Portability of some overlapping strcpy or memcpy calls

Michael Meissner meissner at xyzzy.UUCP
Tue Mar 14 01:37:22 AEST 1989


In article <338 at wjh12.harvard.edu> kendall%saber at harvard.harvard.edu (Samuel C. Kendall) writes:
| Consider the following function call:
| 
| 	memcpy(p, p + M, N)
| 
| where p is a char*, M is nonnegative, N is positive, and M < N.  This
| is an overlapping copy, where the bytes are being copied to the left (M
| > 0) or onto themselves (M == 0).  I am interested in finding out if
| this call to memcpy, and similar calls to memccpy, strcpy, and strncpy,
| are portable.

OK, if you want a real-world example, consider systems based on the
Motorola 88000.  The chip has multiple functional units, pipelines,
and hardware interlocks.  When you access memory, there is a minimum
of 3 clock periods after the instruction starts before the register
is loaded (for a load) or memory is written (for a store).  It is
therefore better to issue several loads followed by several stores,
to avoid stalling the processor, so the inner loop of memcpy would be
something like:

loop:	ld	r5,r3,0		; r5 <- *src
	ld	r6,r3,0x4	; r6 <- *(src+4)
	ld	r7,r3,0x8	; r7 <- *(src+8)
	st	r5,r2,0		; store *src into *dest
	st	r6,r2,0x4	; store *(src+4) into *(dest+4)
	st	r7,r2,0x8	; store *(src+8) into *(dest+8)
	addu	r3,r3,0xc	; bump src pointer
	subu	r4,r4,0xc	; decrement length
	bcnd.n	ge0,r4,loop	; loop back if more data to move
	addu	r2,r2,0xc	; bump dest pointer (in delay slot)

Thus if M were 4 or 8, and word-aligned moves were done, you would
lose, since the loads and stores are pipelined three deep.  I haven't
looked at the library routine for memcpy recently, but I know the
authors did go out of their way to exploit the parallelism of the
machine.  The above code is roughly what the GNU 88k compiler
currently produces when it knows that word alignment is valid and
that the count is fixed.
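
For anyone who wants to experiment with the effect of grouping loads
ahead of stores, here is a rough C model of the loop above.  It is only
a sketch: the names copy3, matches_memmove and BUF_WORDS are made up
for illustration, the count is assumed to be a multiple of three words,
and it models the listing as written, not the actual 88k library
routine.  It works on arrays of words and compares the result with
memmove, which is defined for overlapping regions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_WORDS 32	/* hypothetical scratch buffer size, in words */

/* Model of the listed loop: three word loads, then three word stores,
   per iteration.  Assumes nwords is a multiple of 3. */
static void copy3(uint32_t *dst, const uint32_t *src, size_t nwords)
{
	assert(nwords % 3 == 0);
	for (size_t i = 0; i < nwords; i += 3) {
		uint32_t a = src[i];		/* ld r5 */
		uint32_t b = src[i + 1];	/* ld r6 */
		uint32_t c = src[i + 2];	/* ld r7 */
		dst[i] = a;			/* st r5 */
		dst[i + 1] = b;			/* st r6 */
		dst[i + 2] = c;			/* st r7 */
	}
}

/* Returns 1 if copy3 gives the same buffer contents as memmove for a
   copy of nwords words from word offset src_off to word offset dst_off
   inside one buffer, 0 otherwise.  Offsets plus nwords must fit in
   BUF_WORDS. */
static int matches_memmove(size_t dst_off, size_t src_off, size_t nwords)
{
	uint32_t a[BUF_WORDS], b[BUF_WORDS];

	for (size_t i = 0; i < BUF_WORDS; i++)
		a[i] = b[i] = (uint32_t)i;

	copy3(a + dst_off, a + src_off, nwords);
	memmove(b + dst_off, b + src_off, nwords * sizeof b[0]);

	return memcmp(a, b, sizeof a) == 0;
}
```

In this simplified model the outcome depends on the direction of the
overlap.  matches_memmove(1, 0, 12), a copy one word to the right,
diverges from memmove as soon as the second group of loads picks up a
word that the first group of stores already overwrote.
matches_memmove(0, 1, 12), the one-word leftward copy (M == 4 with
4-byte words), reads every word before it is overwritten and so
matches memmove here.  A routine that schedules its loads and stores
differently from the listing may behave differently, so this says
nothing definitive about the real memcpy.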

I would expect even more striking results on machines with vector
units, since you should be able to make memcpy use the vector
instructions of the machine.
-- 

Michael Meissner, Data General.
Uucp:	...!mcnc!rti!xyzzy!meissner
Arpa:	meissner at dg-rtp.DG.COM   (or) meissner%dg-rtp.DG.COM at relay.cs.net


