Column-wise data storage for matrices in C, any advantage(s)?

Sat Feb 2 08:43:42 AEST 1991

Greetings,

I recently started thinking whether it is possible to take advantages many
numerical liner algebra algorithms to their fullest extend in C, and if so,
how to come up a clean and hard-limit free implementation of it.

Let's use 2d matrices for discussion for now:

as is well know among numerical computing people that matrix algorithms 
have to be "gaxpy" (defined below) rich to gain high performance:

gaxpy: given x in Rn, y in Rn and A in Rmxn, the following algorithm
       computes z=y+Ax.

       function: z=gaxpy(A,x,y)
             n=cols(A); z=y
             for j=1:n
                 z=z+x(j)A(:,j) 
             end
       end gaxpy

I used Cleve Moler's Matlab language for the above pseudo-code for compactness.

Now, note that this algorithm requires most rapid changes in indixing occuring
in COLUMNS, not rows!

I believe in default C design, data is stored in rows (discussed in K&R II)
instead in columns.  It is almost very common these days people use 
array of pointers to array as a way to simulate 2d matrices so that 
(1) no requirement to declare array dimensions beforehand,
(2) flexibility in passing "matrices" to functions    
(3) allowing index offsetting if that's desired.

However, if a so declared matrix is used in gaxpy, the memory traffic caused
by pointer jumping (figuratively speaking) for a large matrix is significant
and I have a strong suspicision that such traffic would slow down one's code.

Note, however, FORTRAN store data in columns, so gaxpy is perfectly suitable
for codes coded in Fortran and compilied with a good FORTRAN compiler.

Since this group is not for numerical computations, I won't get in depth why
gaxpy type algorithms are much perferred than others.  For details, please
see:

Gene H. Golub & Charles F. Fan Loan, Matrix Computations, 2nd, 1989,
                                     The Johns Hopkins University Press.

The leading author is The no. 1 person in the world now in this field and
is the head of Stanford's CS dept. Numerical Computations group.  BTW, in
his group, Fortran is THE language of choice, though not eveyone likes it.

Please no flame regarding above comments.  I am not interested in C vs. F
I would like to know, assuming the way C stores it's data is a deficiency
from the point of view of numerical computations, can someone come up a 
good, hard-limit free way as flexible as the conventional method yet provides
more efficiency?  Or, if you DON'T think the way data stored in C is a 
deficiency, what is your justifications?  Any hard proofs?

Please email me if possible if you like to respond.  I will, after three
weeks or so, post a summary if there is enough interest.

Let me emphasize again, your rationale, experimental results, hardware
plateforms (CISC or RISC), running conditions (all in core or swapping 
occured), OS used, and a simple example of how you implement a matrix
to make it more efficient than the conventional way mentioned above
all are welcomed, preferrably all included in your response.  I would
imagine it would be quite difficult to judge a claim without all of these
available.

Best Regards and Look forward to hearing from you people

Chin Fang
Mechanical Engineering Department
Stanford University
fangchin at portia.stanford.edu