New(?) encryption algorithm

Alan Grant Finlay alanf at bruce.OZ
Tue Mar 6 16:54:33 AEST 1990


I have written an encription system for personal files on an IBM-PC compatible.
The algorithm is designed to be fast enough for small files and totally secure
if a key is chosen properly.  I mean it should be secure against known plain
text attack but not chosen plain text attack where large numbers of attempts are
permitted.  Included in this article are a source program for the system in C,
a user manual, and a formal description of the algorithm.  This is my first 
non trivial C program so I welcome any constructive criticism.  I would also 
like to know if the formal description is implemented correctly in C.  I am
an amateur at cyphers so I am not certain that my claims about security are
correct.  I would appreciate any comments by those more knowledgeable about
the security of this system.  An archive with all relevant files including
executables will be posted soon in comp.binaries.ibm.pc, included is a challenge
file which is an encription of the source file using a key known only to me.
If anyone would like to attempt a chosen plaintext attack I am willing to use
this same key to encrypt a limited number of files.  I am not sure if I need to
supply a secret file encrypted using this key to make the challenge more 
meaningful.  The program is currently called CRYPT 2.0, I apologise for using a
name that has probably already been used but it's a bit late to change now.

----------------------------------------------------------------------------
                         User manual for Crypt
                         ---------------------

The source code "crypt.c" can be used to produce two  executable  files  which
are  assumed  to be called "encrypt" and "decrypt" by the program itself.  The
source is in Turbo C (Trademark) and can be  compiled  with  the  tiny  memory
model.   Since  use  is  made  of system dependent functions for obtaining the
input file size for encrypt, the program is not portable as is.

To use encrypt and decrypt simply type the command name followed by the source
and  destination  file  names.  You will be prompted for a password and status
information is produced.  The password is  echoed  since  for  the  length  of
password  intended it would be too easy to mess it up.  The password should be
a long sequence of characters such as a sentence with some  strange  words  or
characters  in  it.  The maximum length is the same as the maximum block array
width (100 as supplied).  The password is scrolled off the  screen  after  the
line is terminated.

The algorithm used was inspired by the Rubik's cube.  The core algorithm works
as follows:

   First a fixed block size is determined which is the product of two numbers
   "wide" and "high".  The data is processed in a two dimensional array with
   these dimensions.  If the last block cannot be entirely filled it is
   padded with junk characters.  The password is used to determine a key on
   a column per character basis.  The key value for each column (modulo the
   height) is the amount this column is shifted down.  The part which emerges
   from the bottom is inserted into the top (i.e. a cyclic shift).  The rows
   are similarly shifted but using the values 1,2,3,...,etc.  This two stage
   process is repeated a fixed number of times (10+10) times in the original
   system).  The result is a permutation of the array determined by the
   password alone.

There is one major problem with  this  simple  scheme.   The  use  of  a  pure
permutation  is  compromised by the need to pad the last block.  Especially if
the last block is mostly padding.  It is very difficult to choose padding that
cannot  be  distinguished  from the intentional contents without revealing too
much.  The upshot of all this is that the scheme is  modified  slightly.   For
the first one half of the passes, after the usual column and row shifts, every
second row is replaced by the exclusive-or of  itself  and  the  previous  row
character  by  character.   This  means the shuffling process is now more than
just a pure permutation but alters values according  to  the  history  of  the
shuffle.   As  a  precaution  the rest of the passes are pure permutations.  A
crude  analogy  would  be  shuffling  cards  with  the  printing  still   wet.
Fortunately  the  process  is  reversible when done digitally.  In addition to
this precaution the block size is usually chosen so that the last block is  at
most  a  quarter  padding  (the rare exceptions are reported when they occur).
Finally the padding is  chosen  by  randomly  selecting  bytes  from  the  old
contents of the array (usually bytes from this or a previous pass).

If the input file is smaller than 200 bytes it is rejected.  This may seem  to
be  over cautious but remember that if a password is discovered by cracking an
easy case, every file encoded with that same password  is  available.   It  is
intended  that  users  will  not use many distinct passwords since they should
never be written down and need to be long.  The security of the system is  due
more  to  the choice of password and particularly its length.  Passwords which
are  too  short  are  rejected.   Naturally  the  password  in  not   recorded
permanently  anywhere  by the encryption program.  The easiest way to choose a
good password is to use a whole sentence (spaces and punctuation are  allowed)
with  some  strange  characters  in  it.  It is relatively easy to try out all
short sentences with words from a standard dictionary.  Brute  force  cracking
of  an  encrypted  file  with  n blocks of block size b, a password of p units
chosen from a set of q values and with s shuffles per block requires O(s * b *
(q^p)) operations.  Obviously increasing q and/or p pays great dividends.

For a pure permutation it would also be possible to try all b!  rearrangements
of  a  block.   This  is ruled out by the minimum block size being 100.  It is
clear that increasing s is not an effective way of  improving  security.   The
time  taken by the algorithm is O(s * b * n).  Since (b*n) cannot be changed s
should be chosen as small as is considered safe.  The chosen value is as small
as  seems  secure  but  is  not  chosen  for any real theoretically determined
reason.  Note that s is a compile time constant.

When choosing block dimensions adequate width is more important than  adequate
height.   If  the  width  is  shorter  than  the password then the tail of the
password is not used.  The minimum block dimensions are approximately 24  *  8
so  that  the  algorithm  has  enough  height  to minimise the chance of short
cycles.  Every shuffle is identical and could contain a small number of  short
cycles  (i.e. a small set of positions whose contents always remain within the
set).  Experiments seem to indicate this is not likely with  reasonably  sized
blocks  (i.e.  >200  bytes).   No  theoretical analysis of this issue has been
attempted.  Note however that the row shifts are chosen as 1..n for  a  column
or row of n+1 positions hence there is never a zero row shift.

So far the algorithm's strength depends critically upon  the  variety  of  the
input.   As  described  so for if you input a file with just space characters,
that is mostly what you will get back.  This applys on a block by block basis.
To  avoid  this  problem  the  algorithm  is  given  one final twist, a simple
substitution is performed before the first shuffle.  This substitution  should
suffice to eliminate most simple patterns.

Some final operational advice.  Confidential documents should be created using
a  ram  disk or a floppy which will be destroyed.  Note that deleting and even
overwriting magnetic media does not prevent the data's  recovery.   Watch  out
for  editors  which keep temporary copies of files which they are working upon
in places of their own choice.  A  deleted  temporary  file  is  often  easily
recovered and you may not even be aware of its existence.

-------------------------------------------------------------------------
/***********************************************************************/
/* (c) Copyright 1989, 1990, Alan Finlay.         Beta version 2.0 .    */
/* This program is intended for Public Domain, and may not be sold or    */
/* marketed in any form without the permission and written consent of     */
/* the author.  I retain all copyrights to this program, in either the    */
/* original or modified forms, and no violation, deletion, or change of   */
/* the copyright notice is allowed.  Every effort has been made to ensure */
/* this product provides a secure encryption system.  Mistakes can be     */
/* made however both in the theoretical basis and implementation of such  */
/* a system.  See the associated documentation for a discussion of known  */
/* limitations.  The source code has been provided so you can inspect it  */
/* and use it at your own risk.  I will accept no responsibility for any  */
/* loss, damage, or compromised data caused directly or indirectly by     */
/* this program.  It is released on an "as is" basis with no warranty    */
/* as to its being fit or suitable for encrypting any form of data.     */
/***********************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <dos.h>
#include <time.h>
#include <sys\stat.h>

#define ENCRYPT 1        /* Choose encrypt (1) or decrypt (0) */
#define DEBUG 0          /* Show page after each shuffle if non zero */
#define TRUE -1
#define FALSE 0
#define LIMIT 100
#define SECURE 10

typedef unsigned char line[LIMIT];

char copyright[40] = "(c) copyright 1989,1990, Alan Finlay";

unpat(page,wide,high) /* Simple substitution to avoid patterns */
   line page[LIMIT];  /* [width,height] */
   int wide,high;
   {
   int i,j,k;
   k = 0;
   for (i=0;i<wide;i++) for (j=0;j<high;j++) {
      k = (k+7)%128;
      page[i][j] = page[i][j] ^ k;
      }
   }

#if ENCRYPT
shuffle(page1,code,wide,high)
   line page1[LIMIT];  /* [width,height] */
   line code;
   int wide,high;
   {
   int i,j,k,key,shift;
   line *mix1,*mix2;
   line *oldline,page2[LIMIT];  /* [height,width] */
   for (k=0;k<(SECURE*2);k++) { /* See mixup below (2 security phases) */
#if DEBUG
      show(page1,wide,high);
#endif
      /* Shift columns */
      for (i=0;i<wide;i++) {
         oldline = page1[i];
         key = (int) code[i];
         for (j=0;j<high;j++) page2[j][i] =(*oldline)[(j+key)%high];
         }
      /* Mixup */
      if (k<SECURE) {
	 for (j=1;j<high;j+=2) {
            mix1 = page2[j-1];
            mix2 = page2[j];
	    for (i=0;i<wide;i++) (*mix2)[i] = (*mix2)[i]^(*mix1)[i];
            }
         }
      /* Shift rows */
      for (j=0;j<high;j++) {
         oldline = page2[j];
         shift = (j%(wide-1))+1;
         for (i=0;i<wide;i++) page1[i][j] = (*oldline)[(i+shift)%wide];
         }
      }
   show(page1,wide,high);  /* For text, user can check for blunders */
   }

#else
unshuffle(page1,code,wide,high)
   line page1[LIMIT];  /* [width,height] */
   line code;
   int wide,high;
   {
   int i,j,k,key,shift;
   line *mix1,*mix2;
   line *newline,page2[LIMIT];  /* [height,width] */
   for (k=0;k<(SECURE*2);k++) {  /* Check this times 2 as in shuffle */
#if DEBUG
      show(page1,wide,high);
#endif
      /* Shift rows back */
      for (j=0;j<high;j++) {
	 newline = page2[j];
         shift = wide-(j%(wide-1))-1;
         for (i=0;i<wide;i++) (*newline)[i] = page1[(i+shift)%wide][j];
         }
      /* Reverse mixup */
      if (k>=SECURE) {    /* Note matching code in shuffle */
	 for (j=1;j<high;j+=2) {
            mix1 = page2[j-1];
            mix2 = page2[j];
	    for (i=0;i<wide;i++) (*mix2)[i] = (*mix2)[i]^(*mix1)[i];
            }
         }
      /* Shift columns back */
      for (i=0;i<wide;i++) {
         newline = page1[i];
         key = (int) code[i];
         for (j=0;j<high;j++) (*newline)[(j+key)%high] = page2[j][i];
         }
      }
   }
#endif

show(page,wide,high)
   line page[LIMIT];
   int wide,high;
   {
   int i,j;
   puts("\n");
   for (j=0;j<high;j++) {
      putc('\n',stdout);
      for (i=0;i<wide;i++) {
         if (page[i][j]<30) putc('*',stdout);
         else putc(page[i][j],stdout);
         }
      }
   }

main (argc,argv)
   int argc;
   char *argv[];
{
   FILE *infile,*outfile;
   int wide,high,i,j,k;    /* Block width and height, loop counters */
   int blkn = 1;           /* Block counter */
   int clen;               /* Password code length */
   int ch = 0;
   int invers;             /* Version of input file for decrypt */
   int vers = 2;           /* Version of this program */
   line page[LIMIT],code;
#if ENCRYPT
   int chrcnt;       /* Character counter */
   long fsize;       /* Input file size */
   int blocksize,nblocks;
   struct time t;  /* For system time */
   struct stat st; /* For input file stats */
   /* Randomise the rand() function */
   gettime(&t);
   srand(t.ti_min*400+t.ti_sec*100+t.ti_hund); /* random seed <30000 */
   /* Check the input arguments */
   if (argc!=3) {puts("\nUsage is: encrypt src dst\n"); exit(1);}
#else
   int blkcnt;
   /* Check the input arguments */
   if (argc!=3) {puts("\nUsage is: decrypt src dst\n"); exit(1);}
#endif
   if ((infile = fopen(argv[1],"rb")) == NULL) {
      printf("\n%s",argv[1]); perror(" "); exit(1);}
   if ((outfile = fopen(argv[2],"wb")) == NULL) {
      printf("\n%s",argv[2]); perror(" "); exit(1);}
#if ENCRYPT
   /* Get input file size */
   if (stat(argv[1],&st)!=0) {perror(" "); exit(1);}
   fsize = st.st_size;
   printf("The input file size is %ld\n",fsize);
   /* Choose block size accordingly */
   if (fsize<200) {puts("Input file is too small"); exit(1);}
   if (fsize<(LIMIT*LIMIT)) blocksize = (int) fsize;
   else {
      nblocks = (int) (fsize/(LIMIT*LIMIT)+1);
      blocksize = (int) (fsize/nblocks+1);
      }
   wide = ((int) sqrt((double)blocksize))+10; if (wide>LIMIT) wide = LIMIT;
   high = blocksize/wide+1; if (high>LIMIT) high = LIMIT;
   while (1) {
      blocksize = wide*high;
      if (fsize<(long) blocksize) break;
      else {
         /* Multiple blocks, check for last block too small */
         if (((fsize-1)%blocksize)>(blocksize*3/4)) break;
         /* (fsize-1) is used above so perfect fit is accepted! */
         high--; wide--; /* Try a smaller block */
         }
      if (wide<50) {
         puts("It was necessary to use a short last block");
         puts("To correct this change the input file size");
         puts("Otherwise there is a minor security compromise");
         break;
         }
      }
   printf("The width and height are (%d,%d)\n",wide,high);
   printf("The last block is %ld bytes\n",((fsize-1)%blocksize)+1);
   fprintf(outfile,"%d,%d,%d,",vers,wide,high);
#else
   fscanf(infile,"%d,%d,%d,",&invers,&wide,&high);
   if (invers!=vers) {puts("Input file is version incompatible");exit(1);}
#endif
   /* Get password */
   while(1) {
      puts("\nPlease enter your password");
      gets(code);
      clen = strlen(code);
      if (clen>LIMIT) {puts("Password too long - abort"); exit(1);}
      if (clen>9) break;
      puts("Insecure password, try a longer one.");
      puts("For security do not use a name or word in any dictionary.");
      puts("For example use something like \"Dazed and Konfuzed\"");
      }
   for (i=0;i<25;i++) puts(" ");   /* Clear the screen */
   if (clen>wide) puts("Warning: tail of password ignored");
   /* Extend password to possible limit */
   for (i=clen;i<LIMIT;i++) code[i] = code[i%clen] ^ '\152';
   do { /* crypt a page */
#if ENCRYPT
      chrcnt = 0;
#else
      if (fscanf(infile,"%d,",&blkcnt)==EOF) goto NOBLOCK;
#endif
      for (j=0;j<high;j++) {
         for (i=0;i<wide;i++) {
            if ((ch = getc(infile)) != EOF) {
               page[i][j] = ch;
#if ENCRYPT
               chrcnt++;}
            else if (i==0 && j==0) goto NOBLOCK; /* EOF at start of block! */
            /* Pad the last block with existing junk */
            else page[i][j] = page[rand()%wide][rand()%high];
#else
               ;}
            else {puts("Error: unexpected end of file"); goto NOBLOCK;}
#endif
            }
         }
#if ENCRYPT
      fprintf(outfile,"%d,",chrcnt);
      unpat(page,wide,high);
      shuffle(page,code,wide,high);
      for (j=0;j<high;j++) for (i=0;i<wide;i++) putc(page[i][j],outfile);
#else
      unshuffle(page,code,wide,high);
      unpat(page,wide,high);
      for (j=0;j<high;j++) for (i=0;i<wide;i++)
         if ((j*wide+i)<blkcnt) putc(page[i][j],outfile);
#endif
      printf("\nFinished block number %d\n",blkn++);
      }
   while (ch != EOF);
NOBLOCK:                  /* Jump here to avoid writing an empty block */
   fclose(infile);
   fclose(outfile);
}
----------------------------------------------------------------------------

                   Formal description of the algorithm
                   -----------------------------------
A block size is determined according to the size of the input file.
For small files there will be only one block.  Each block has two dimensions -
wide and high.  The last block usually needs some padding which is chosen from
whatever happens to be in the array already.  The junk is selected randomly 
though the random number generator has only about 30,000 possible seeds.  
This is probably much ado about nothing, the padding could probably be just
null characters without any loss in security.  The password or key is a string
supplied by the user, this is extended to the maximum length (100) by 
repetition however the repeated strings are modified by exclusive-or '\152'
to help extract the maximum information from the supplied string (since later
only the remainder mod high will be utilised).

For each block, let us call it page[x,y] where 0<=x<wide and 0<=y<high.
-----------------------------------------------------------------------
	1) Perform a simple substitution:
           page[x,y] := page[x,y] XOR ((x*high+y)*7 mod 128)

	2) Do 10 special shuffles: (each line is for all x and y concurrently)
           page[x,y] := page[x,(y+key[x]) mod high]
	   if y is odd then page[x,y] := page[x,y-1] XOR page[x,y]
           page[x,y] := page[(x+(y mod (wide-1))+1) mod wide,y]

	3) Do 10 normal shuffles: (each line is for all x and y concurrently)
           page[x,y] := page[x,(y+key[x]) mod high]
           page[x,y] := page[(x+(y mod (wide-1))+1) mod wide,y]

Each step in the process is easily reversible.
-------------------------------------------------------------------------

The encrypted file format is:
	1) three 16 bit integers: program version number (= 2), wide, high.
        3) for each block: 
		one 16 bit integer = number of chars in block (non padding).
		the block of bytes:
			for j:=0..high-1
			   for i:=0..wide-1
			      page[i,j]

----------------------------<end>-------------------------------------



More information about the Comp.lang.c mailing list