Compressing C programs

Richard T. Ferris ferris at eniac.seas.upenn.edu
Thu Jan 11 00:34:12 AEST 1990


Here is my original question and the replies I received.
Thanks to all for their assistance.

>I am interested in learning how to reduce the size of my TurboC
>programs.  The .exe files are 13K even for very simple programs.
>Could someone send me some suggested references?  Thanks.
>
>RF
>Richard T. Ferris
>ferris at eniac.seas.upenn.edu
>University of Pennsylvania

*From: josephc at tybalt.caltech.edu (Joseph Chiu)
     
Well, 13K is about the best size you will get for most situations.  You can
start reducing code size, first of all, by not including the floating-point
library (switch off the [O]ptions-[C]ompiler-[C]ode-[F]loat).  Also, by
setting the compile switches to optimize for size, to not include debugging
information, and to not do a stack overrun check, you can eliminate some more.
If you don't need the printf statements, and can settle for just puts or putc,
I suspect you can reduce your .EXE size even more...

What you might want to do is to find out what library functions are linked
into your program and to see how much space they take up.  Maybe you can
find alternatives to some of the functions.
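
As a rough illustration of settling for fputs and putc, here is a minimal
sketch that prints a label and an unsigned number without calling printf.
The names put_unsigned and put_labeled are made up for this example; they
are not Turbo C library routines, and how much space this saves will depend
on the compiler and on what else the program links in.

```c
#include <stdio.h>

/* Print an unsigned value in decimal using only putc(); no printf(). */
int put_unsigned(FILE *out, unsigned long value)
{
    char digits[24];            /* enough for a 64-bit unsigned long */
    int n = 0;
    int written = 0;

    do {                        /* collect digits, least significant first */
        digits[n++] = (char)('0' + (int)(value % 10));
        value /= 10;
    } while (value != 0);

    while (n > 0) {             /* emit them most significant first */
        if (putc(digits[--n], out) == EOF)
            return -1;
        written++;
    }
    return written;             /* number of characters written */
}

/* Print "label: value" and a newline, in place of printf("%s: %lu\n"). */
int put_labeled(FILE *out, const char *label, unsigned long value)
{
    if (fputs(label, out) == EOF || fputs(": ", out) == EOF)
        return -1;
    if (put_unsigned(out, value) < 0)
        return -1;
    return putc('\n', out) == EOF ? -1 : 0;
}
```

A call such as put_labeled(stdout, "lines", 200UL) would then stand in for
the equivalent printf call.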


*From: jb at altair.csustan.edu

Use of printf, scanf, and their cousins sprintf, fprintf, vprintf ...
drags in the floating point lib.

Use of any of the stdio file operations drags in the _IOB structures
for stdin, stdout, etc.

I realized approximately a 6K reduction in code size for a Microsoft C
program by eliminating references to these.


*From: Mark Streich <boulder!streich at boulder.Colorado.EDU>

Have you turned off all of the debugging switches?  Have you told the
compiler to optimize for Size?  If you're doing Floating Pt. math, you may
be including the Fl. Pt. emulator, which is used if you don't have an 80x87
chip.
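
One way to keep floating point out of a program altogether is scaled integer
arithmetic. The sketch below is only an illustration: percent_tenths is a
hypothetical name, and it assumes the values are small enough that
part * 1000 does not overflow a long.

```c
/* Percentage of part in whole, in tenths of a percent, rounded to nearest.
   Integer-only stand-in for (double)part / whole * 100.0.
   Assumes part * 1000 fits in a long. */
long percent_tenths(long part, long whole)
{
    if (whole <= 0 || part < 0)
        return -1;                      /* reject values we cannot handle */
    return (part * 1000L + whole / 2) / whole;
}
```

The result is in tenths of a percent, so the caller prints the value divided
by 10, a decimal point, and the remainder, all with integer output routines.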


*From: Steve Resnick  <stever at Octopus.COM>

It depends on the memory model you are using and what library routines get called.
Turbo C will, in small model, reserve space for the stack in the .EXE (I think).
A lot of library routines are BIG (printf, scanf, etc.).  The larger mem models
(C, M, H) tend to be smaller in .EXE size and runtime size because of the way they
allocate stack and heap space.  (I have a 7K .EXE which is in large model and was
200 lines of code...)


*From: mnetor!lethe!tvcent!andrew at uunet.uu.net

One thing you could try is compiling in the tiny memory model (I don't think
that you can use small - check the manuals), then convert it to a .COM file
with the DOS program EXE2BIN. (I also seem to recall that there is an option to
tlink that generates .COM files when requested in tiny automatically. This may
be another compiler?)


*From: Tom Wilson  <wilson at uhccux.uhcc.Hawaii.Edu>

One way to cut down .EXE size:  if your program can get by without printf
and its relatives, in MSC I can save about 5K.  Thus, if your program is
a utility that can read/write using low-level (read/write/seek) i/o, and
write to the screen with the DOS calls (INT 21H, fns 2/6/7 etc.) then you
can cut out some.  But to save programming time, I wouldn't do a lot of
recoding just to save 4-5K.  After all:  min size in MS Pascal is approx.
26K, MS Fortran 40K, and Clipper is 160K for "hello, world".
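
The low-level i/o route might look something like the sketch below, which
writes a string through write() and never touches stdio. It is written
against POSIX <unistd.h> so that it compiles on a current system; as far as
I know the DOS compilers declared write() in <io.h> instead. The name
put_str_fd is invented for the example, and the INT 21H variant is not
shown.

```c
#include <string.h>
#include <unistd.h>   /* POSIX; assumption: DOS compilers used <io.h> */

/* Write a whole string to a file descriptor without using stdio.
   Returns the number of bytes written, or -1 on error. */
long put_str_fd(int fd, const char *s)
{
    size_t left = strlen(s);
    long total = 0;

    while (left > 0) {
        ssize_t n = write(fd, s, left);
        if (n <= 0)
            return -1;          /* error, or no progress made */
        s += n;                 /* skip past what was written */
        left -= (size_t)n;
        total += n;
    }
    return total;
}
```

Descriptor 1 is standard output, so put_str_fd(1, "hello, world\n") takes
the place of a puts or printf call.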

If you are writing a bunch of similar utilities, you can save space more
easily by combining them and using command line switches.  13K may be the
min, but you can cram a lot of code into 20K.
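
Combining utilities behind command line switches could be sketched roughly
as follows. The two utilities (count_chars, count_words), the /c and /w
switches, and the run_tool dispatcher are all hypothetical, chosen only to
show the shape of the idea: one copy of the startup code and library
serves every tool.

```c
#include <string.h>

/* Hypothetical utilities that would otherwise be separate .EXE files. */
static int count_chars(const char *arg)
{
    return (int)strlen(arg);
}

static int count_words(const char *arg)
{
    int words = 0, in_word = 0;

    for (; *arg != '\0'; arg++) {
        if (*arg == ' ' || *arg == '\t') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;        /* first character of a new word */
            words++;
        }
    }
    return words;
}

/* Pick a utility from a one-letter switch such as /c or /w.
   Returns the utility's result, or -1 for a missing or unknown switch. */
int run_tool(int argc, char *argv[])
{
    if (argc < 3 || argv[1][0] != '/' ||
        argv[1][1] == '\0' || argv[1][2] != '\0')
        return -1;

    switch (argv[1][1]) {
    case 'c': return count_chars(argv[2]);
    case 'w': return count_words(argv[2]);
    default:  return -1;
    }
}
```

A real program would call run_tool(argc, argv) from main and report the
result.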

Also, disable all checking once the program is debugged:  stack overflow, etc.


*From: bobc at attctc.Dallas.TX.US (Bob Calbridge)

Well, something close to what I've discovered lately.  The first thing I've
ever done, when using the Tiny (and possibly the Small) memory models is to
use the exe2bin utility to convert it to a .com file.

However, if you are using the environment to develop your program you might
do three different things if you aren't already.  The first is to turn off
the OBJ debugging information through the Options/Compile/Code_generation 
option.  While there also turn off the Line generation option.
Finally, under the Debug menu selection change Source_debugging to Off.
Using all of these has chopped a considerable amount of K off a rather large
program.  I suspect that the amount of savings is based on the size of the
program in general and may also relate to the arrangement of the source code.
By this I mean that if you have a single statement that has been broken down
into two lines like 

	if (such_n_such == TRUE)
		do_something_stupid();

the executable is generated with information that lets you perform breakpoints
for each line.  

BTW, the program I'm working on is about 67K with debugging information but
is reduced to about 52K with all of this turned off.  The savings can be
substantial.  

Also, don't forget that some of the standard library functions such as
printf() and its relatives (cprintf(), fprintf(), sprintf()) are large by
necessity.  If you can avoid these by using lesser functions like puts()
or by writing your own functions that strip out the unnecessary formatting
you can save some space.  (By unnecessary I mean that printf() also has to
account for formatting floating point and double in addition to variable
length strings.  Even if a program doesn't need to print any floating point
value, these options are still part of the printf() function.)
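
A stripped-down formatter of the kind described might look like the sketch
below. It is an assumption-laden illustration rather than anything from
Borland: mini_format and put_ch are invented names, only %s, %d and %% are
understood, and any other conversion is skipped without consuming an
argument.

```c
#include <stdarg.h>
#include <stddef.h>

/* Append one character if there is room, keeping space for the '\0'. */
static void put_ch(char *buf, size_t size, size_t *pos, char c)
{
    if (*pos + 1 < size)
        buf[(*pos)++] = c;
}

/* Minimal formatter: handles only %s, %d and %%.
   No widths, no precision, no floating point.
   Output is truncated to fit; returns the length actually stored. */
size_t mini_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    size_t pos = 0;

    if (size == 0)
        return 0;
    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            put_ch(buf, size, &pos, *fmt);
            continue;
        }
        fmt++;                              /* look at the conversion letter */
        if (*fmt == 's') {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0')
                put_ch(buf, size, &pos, *s++);
        } else if (*fmt == 'd') {
            int v = va_arg(ap, int);
            unsigned int u = (v < 0) ? 0u - (unsigned int)v : (unsigned int)v;
            char digits[24];
            int n = 0;

            if (v < 0)
                put_ch(buf, size, &pos, '-');
            do {                            /* least significant digit first */
                digits[n++] = (char)('0' + (int)(u % 10u));
                u /= 10u;
            } while (u != 0u);
            while (n > 0)
                put_ch(buf, size, &pos, digits[--n]);
        } else if (*fmt == '%') {
            put_ch(buf, size, &pos, '%');
        } else if (*fmt == '\0') {
            break;                          /* lone '%' at end of format */
        }
        /* any other conversion letter is ignored */
    }
    va_end(ap);
    buf[pos] = '\0';
    return pos;
}
```

The formatted buffer can then be handed to puts() or fputs(), so printf()
and its floating point support never need to be linked in.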
 

*From: mdfreed at contact.uucp

(large Turbo C executables)
    Sounds like you're including the debugging information in the
executable. This is the default setup in the IDE ... not sure about
the command line compiler.
    If you have Turbo C Professional (includes TASM and TDEBUG), it
includes TDSTRIP, a utility to remove the extra data. 
    Otherwise, set the options to eliminate line numbers and
debugging info.
    If you're trying to minimize size, using the individual conversion 
functions and fputs() instead of fprintf() saves a bit of space.
   Borland includes an example which uses DOS and BIOS calls, and
avoids the library completely. This approach produces the smallest
and least portable program (one might say that it isn't really C).

Richard T. Ferris
ferris at eniac.seas.upenn.edu
University of Pennsylvania


