Divide and C, Divide and Conquer

Fri Mar 29 05:49:13 AEST 1991

	************************************************************
	|
	|   I recently posted a question concerning divides
	|   and C. My complaint was that one cannot get both the 
	|   result (quotient) and the remainder from the C
	|   '/' operation. Here is a summary of responses.
        |   Thank you all very much, you WERE a help.
        |
	************************************************************

>
>    I get a number. I need to use this number to index into a
>    2 dimensional array.
>
>    EX: int array[10][10];
>        val = 24;
>        ++array[val/10][val%10];
>
>    This is the only way I know how to do this.
>    My problem with it is that is requires TWO divides.
>    
>    Divide 1: val/10. Here the result is used. The remainder
>                      is thrown away.
>
>    Divide 2: val%10. Here the remainder is used, the result is thrown away.

     Since both multiply and subtract are faster (IMHO) machine instructions
than divide, you could always do the following:

      int array[10][10];
      val = 24;
      foo = val/10;
      bar = val-(10*foo);

*OR* you could take advantage of some interesting things by making the
dimensions of your array a power of 2.  IE int array[16][16];

Then, to divide by 16, just right shift val by 4 bits.
To get the remainder, XOR val by 0XF0 (B#11110000).  This increase in
speed would justify the wasted memory - as long as the array is relatively
small (ie, 16X16). And if this operation is in a tight loop,
then you might even be able to justify a larger chunk of wasted memory. ;-)
/*--------------------------------------------------------------------------*\
| Sean C. Malloy  = ("x041sc at tamuts.tamu.edu" || "scm3775 at tamsun.tamu.edu")  |
\*--------------------------------------------------------------------------*/
              You'll pay to know what you REALLY think.

**********************************************************************************

	Well, I think that certain kinds of bit-shifts are equivalent to
dividing by 10...

---
John Gordon
Internet: gordon at osiris.cso.uiuc.edu        #include <disclaimer.h>
          gordon at cerl.cecer.army.mil       #include <clever_saying.h>

**********************************************************************************

>    Wanting my program to run almost twice as fast, I wrote an ASSEMBLY
>    lang subroutine, to be called from C.

Good solution.  If you *really* need speed, sometimes assembly
is the only way to go.

>    Since I do not consider ASSEMBLY to be portable, how do I do
>    this in a C only and portable way?

Yes!  But of course, it won't be as fast as the assembly code.

ASMDIVIDE(dividend, divisor, result, remain)
int dividend, divisor;
int *result, *remain;
{
  *result = dividend/divisor;
  *remain = dividend - divisor * *result;
}

This has a multiply, which are usually faster than divides (but we
still have 2 slow operations).  But at least this is C.

ANSI has a library function called div() that works as you like,
and returns a structure containing the remainder and quotient.
But, as my H&S warns, div() is "found in ANSI C but [is] otherwise
not common in C implementations."

Good luck.  Hope this helped.

-- 
     Brian Scearce (bls at robin.svl.cdc.com  -or-  robin!bls at shamash.cdc.com)
    "Don't be surprised when a crack in the ice appears under your feet..."
 Any opinions expressed herein do not necessarily reflect CDC corporate policy.

***************************************************************************************

Try the div() function.

***************************************************************************************

/*
 * & cc -O4 div.c
 * & time a.out
 * res = 20250000
 *        13.1 real        12.8 user         0.1 sys  
 * & cc -O4 -DONE_DIV div.c
 * & time a.out
 * res = 20250000
 *         9.0 real         8.8 user         0.1 sys  
 */

main()
{
  int array[10][10];
  register int loop, val, res = 0;

  for (loop = 0; loop < 10; loop++) {
	for (val = 0; val < 10; val++) {
		array[loop][val] = loop*val;
	}
  }

  for (loop = 10000; loop; loop--) {
	for (val = 99; val; val--) {
#ifdef ONE_DIV
		register int valdiv10 = val/10;
		register int valdiv5 = 2*valdiv10;

		res += array[valdiv10][val - valdiv5 - 4*valdiv5];
#else
		res += array[val/10][val%10];
#endif
	}
  }
  printf("res = %d\n", res);
}

***************************************************************************************

(You should replace all instances of `result' with `quotient'.)

There is no operator which does what you want, but I have long wished
there was one.  A really smart compiler might optimize out the second
divide, but I don't know of a compiler which does so.  The best solution
would be:

#ifdef ASM_HACK
  ASMDIVIDE (val, 10, &quotient, &remainder);
#else
  quotient = val/10;
  remainder = val%10;
#endif
  ++array[quotient][remainder];

or you could put the / and % in the array reference:

#ifdef ASM_HACK
  ASMDIVIDE (val, 10, &quotient, &remainder);
  ++array[quotient][remainder];
#else
  ++array[val/10][val%10];
#endif

Regards,
Dave Conrad
dave%tygra at sharkey.cc.umich.edu

***************************************************************************************

>    Since I do not consider ASSEMBLY to be portable, how do I do
>    this in a C only and portable way?

ANSI C supports div/ldiv functions which will do what you want. If you
are going to a system without a ANSI C compiler/libraries, it would be
best to have your own version of this routines written in C and or assembly.
This way you can get it ported, then you can make it faster.

+------------------------------------+---------------------------------+
| Geoffrey C. Rogers   		     | "Whose brain did you get?"      |
| grogers at convex.com                 | "Abbie Normal!"                 |
| {sun,uunet,uiucdcs}!convex!grogers |                                 |
+------------------------------------+---------------------------------+

***************************************************************************************

>    EX: int array[10][10];
>        val = 24;
>        ++array[val/10][val%10];

  Assuming this is really what you want, why not take advantage of the
array be contigous in memory? You could then simply say

	++array[0][val];

  Sure, the last index "overflows", but not outside the range of the
full array and the effect should be exactly what you're looking for.
Quite simple, actually ...

  Good luck - Richard

***************************************************************************************
>
>    This is the only way I know how to do this.
>    My problem with it is that is requires TWO divides.
>    
>    [ stuff about dividing and calling assembler routines deleted ]
>    Thanks all.
>

	You don't need to do any divides, because when you reference a
    2d array as

	array[i][j]

    the compiler is referencing the memory location by indexing
    (i*10)+j from the start of your array.  Assuming that

	i <--> val/10
	j <--> val%10

    the index becomes

	((val/10)*10) + (val%10)

    which, happens to be equal to val.

	The following code will save you LOTS of extra divide and
    multiply cycles...

	int	array[10][10];
	int	*another_reference_to_the_same_memory = array;

	Now, instead of doing

	++array[val/10][val%10];

	Use the following

	++another_reference_to_the_same_memory[val];

	No divides, and the compiler will probably generate faster
    code for this array index than a 2d array index.  And exactly the
    same memory location will be incremented (guaranteed portable "C")
    in either case.  This will work in all cases where you are doing
    the divides with the same value as the outermost dimension of your
    2d array.  That is, this will work for _all_ values of some
    constant DIMENSION

	int	array[<any value here>][DIMENSION];
	int	*another_reference_to_the_same_memory = array;

	++array[val/DIMENSION][val%DIMENSION]; 

	is equivalent to

	++another_reference_to_the_same_memory[val]

	Hope this helps...

***************************************************************************************

There are compilers out there that are smart enough to recognize that
the two separate divides can be combined, and only perform one divide.
However, there is a much faster way to do this operation -- table
lookup:

int first[100] = {0,0,0,0,0,0,0,0,0,
		  1,1,1,1,1,1,1,1,1,
			.
			.
		  9,9,9,9,9,9,9,9,9};

inst second[100] = {0,1,2,3,4,5,6,7,8,9,
		    0,1,2,3,4,5,6,7,8,9,
			.
			.
		    0,1,2,3,4,5,6,7,8,9};

----------------------------------------
	val = 24;
	++array[first[val]][second[val]];
----------------------------------------

-- 
	-- Tim Olson
	Advanced Micro Devices
	(tim at amd.com)

***************************************************************************************

>    I happen to know that divide instructions are one of the MOST time
>    consuming instructions around. I measured it.

This is quite common.  Multiplication CAN be done in a single cycle by throwing
enough silicon at it; to date, no one has discovered how to do division without
using iterations, just like human long division.

>    Since I do not consider ASSEMBLY to be portable, how do I do
>    this in a C only and portable way?

In your SPECIFIC example, you can eliminate the divides altogether:

	++ ((int *)(&array [0][0]))[val];

That is, treat your TWO dimensional array as a ONE dimensional array.
Think about the order of storage of your array:

	array [0][0]  [0][1]  [0][2] ... [0][9]
	      [1][0]  [1][1]  [1][2] ... [1][9]

Notice that by concatenating the two dimensions together, you get a nice,
continuous increasing sequence of integers.  The 23rd element of the array
is [2][3].  The 56th element is [5][6].

Hope this helps.

-- 
timr at gssc.gss.com	Tim N Roberts, CCP	Graphic Software Systems
						Beaverton, OR

***************************************************************************************

		SUMMARY:

		Whilst in the middle of re-inventing the binary divide function
		for C, in C, with C, someone pointed out I could/should have been
		doing this a a one-dimensional array. I am glad to hear however
		that ANSI C has recongnized and solved this problem.

		Thanks again to ALL who responded.

		Karl Nicholas

		karln!karln at uunet.uu.net