bugfix and extension to slice
Michael Greim
greim at sbsvax.UUCP
Tue May 17 19:06:42 AEST 1988
Hello netland,
Some months ago a program called 'slice' was posted to comp.sources.unix.
We tried it here, but found a bug almost instantly. Our system
administrator wanted to use slice to extract tarmail pieces from
a mailbox, so I made 2 extensions to slice for him.
I sent the following to Rich Salz, suggesting a reposting, but did
not hear anything from him. So I assume it's OK if I present my changes.
Here are
1.) the BUG
2.) 2 extensions to slice, description
3.) context diff of slice.c (a cure for the ailment)
4.) context diff of slice.1
5.) the tarmail extracting script
1.) the BUG
1.1) Symptoms
I tried "slice -f file -n100 A#n" and expected slice to produce
some files Ann. But to my surprise it said: "can not use -n option
together with pattern" or some such.
1.2) Diagnosis
The first command line option not starting with '-' was considered
a pattern regardless of other options specified.
1.3) Therapy
Apply the context diff in 3.
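To illustrate the rule the fix establishes, here is a minimal shell sketch. It is not slice's own code: the function name classify_args is hypothetical, it only knows the options mentioned in this posting (-f, -e, -m, -s, -n), and it assumes the line count is attached to -n as in the example above.

```shell
#!/bin/sh
# classify_args ARGS...
# Hypothetical illustration, not part of slice: decide whether the first
# non-flag argument is a pattern or already the first output format.
classify_args() {
    pattern='' split_flag=no e_flag=no
    while [ $# -gt 0 ]; do
        case $1 in
            -f) [ $# -ge 2 ] || { echo "missing argument to -f" >&2; return 1; }
                shift ;;                        # skip the input file name
            -e) [ $# -ge 2 ] || { echo "missing argument to -e" >&2; return 1; }
                shift
                pattern=$1 e_flag=yes ;;        # explicit pattern
            -m|-s|-n*) split_flag=yes ;;        # these options imply the split rule
            -*) echo "option $1 not handled by this sketch" >&2; return 1 ;;
            *)  # first non-flag: a pattern only if nothing else chose the rule
                if [ -z "$pattern" ] && [ "$split_flag" = no ]; then
                    pattern=$1
                else
                    break                       # everything left is a format
                fi ;;
        esac
        shift
    done
    if [ "$e_flag" = yes ] && [ "$split_flag" = yes ]; then
        echo "don't use -e together with -m, -s or -n flags" >&2
        return 2
    fi
    printf 'pattern=%s formats=%s\n' "${pattern:-none}" "$*"
}

classify_args -f file -n100 'A#n'    # prints: pattern=none formats=A#n
classify_args '^Subject:' 'msg#n'    # prints: pattern=^Subject: formats=msg#n
```

With the old behaviour the first call would have treated A#n as the pattern, which is what triggered the complaint about -n.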
2.) 2 extensions to slice, description
The extensions are in the substitution ability.
#0nn : with this format you can specify up to 99 parameters instead
of only 9. We needed this!
#-nn : take the nn'th parameter counting back from the last. nn=00 means
the last parameter itself, which is equal to #$ when you have fewer than
99 parameters.
NOTE:
To make this work properly set MAXPARM in opts.h to 99.
(no context diff included because of {what-you-like} :-)
Apply the context diff in 3.
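The two substitution forms described above can be sketched in shell as follows. This is only an illustration of the index arithmetic, not how slice implements it: select_token is a hypothetical helper, and it assumes tokens are separated by whitespace and that a POSIX shell is available.

```shell
#!/bin/sh
# select_token SPEC LINE
# Hypothetical helper, not part of slice: print the token of LINE that a
# #0nn or #-nn parameter would select (tokens assumed whitespace-separated).
select_token() {
    spec=$1
    shift
    set -f                  # keep characters like * in the line from globbing
    set -- $*               # split the line into $1, $2, ...
    set +f
    count=$#
    case $spec in
        '#0'[0-9][0-9]|'#-'[0-9][0-9]) ;;
        *)  echo "select_token: invalid use of #0nn/#-nn in '$spec'" >&2
            return 1 ;;
    esac
    nn=${spec#??}           # the two digits after "#0" or "#-"
    nn=${nn#0}              # drop one leading zero so 08 is not read as octal
    case $spec in
        '#0'*) pos=$nn ;;                   # nn'th token from the left
        *)     pos=$(( count - nn )) ;;     # nn tokens back from the last
    esac
    if [ "$pos" -lt 1 ] || [ "$pos" -gt "$count" ]; then
        echo "select_token: not enough tokens for '$spec'" >&2
        return 1
    fi
    shift $(( pos - 1 ))
    printf '%s\n' "$1"
}

line='Subject: backup - part 3 of 10'
select_token '#-00' "$line"    # prints: 10
select_token '#-02' "$line"    # prints: 3
select_token '#004' "$line"    # prints: part
```

As in the posting, exactly two digits are required, and #-00 picks the same token that #$ would.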
3.) context diff of slice.c (a cure for the ailment)
*** slice.c.old Wed Mar 23 18:41:41 1988
--- slice.c Wed Mar 23 18:40:42 1988
***************
*** 43,48 ****
--- 44,52 ----
bool exclude = FALSE; /* exclude matched line from o/p files */
bool split_after = FALSE; /* split after matched line */
bool m_flag = FALSE; /* was -m option used */
+ bool s_flag = FALSE; /* was -s option used */
+ bool n_flag = FALSE; /* was -n option used */
+ bool e_flag = FALSE; /* was -e option used */
FILE *output = (FILE *) NULL; /* fd of current output file */
FILE *rejectfd = (FILE *) NULL; /* fd of reject file */
***************
*** 105,110 ****
--- 109,115 ----
usage(1);
}
pattern = *argv;
+ e_flag = TRUE;
break;
}
case 'm': { /* mailbox pattern */
***************
*** 113,119 ****
break;
}
case 's': { /* shell pattern */
! pattern = "^#! *\/bin\/sh";
break;
}
case 'n': { /* -n n_lines -- split every n lines */
--- 118,125 ----
break;
}
case 's': { /* shell pattern */
! pattern = "^#! *\\/bin\\/sh";
! s_flag = TRUE;
break;
}
case 'n': { /* -n n_lines -- split every n lines */
***************
*** 123,128 ****
--- 129,135 ----
error("-n: number must be at least 1\n");
exit(EXIT_SYNTAX);
}
+ n_flag = TRUE;
break;
}
case 'f': {
***************
*** 163,179 ****
}
} /* end switch */
} else {
! if (!pattern) pattern = *argv; /* first non-flag is pattern */
else break; /* break while loop */
} /* end if */
} /* end while */
if (!argc) {
! if (m_flag) {
format = mboxformat;
! } else {
format = defaultfmt;
- }
n_format = 1;
} else {
format = argv;
--- 170,195 ----
}
} /* end switch */
} else {
! /*
! * mg, 22.mar.88
! * the first non-flag is pattern, if not one of -s -n or -m
! * was specified or -e pattern
! */
! if (!pattern && !m_flag && !s_flag && !n_flag)
! pattern = *argv; /* first non-flag is pattern */
else break; /* break while loop */
} /* end if */
} /* end while */
+ if (e_flag && (m_flag || s_flag || n_flag)) {
+ error("don't use -e together with -m, -s or -n flags\n");
+ usage(EXIT_SEMANT);
+ }
if (!argc) {
! if (m_flag)
format = mboxformat;
! else
format = defaultfmt;
n_format = 1;
} else {
format = argv;
***************
*** 486,491 ****
--- 506,539 ----
q += strlen(tempbuf);
break;
}
+ /*
+ * mg, 18.mar.88
+ * - use #0nn to specify parameter numbers greater than 9
+ * - use #-nn to select the nn'th parameter from the last
+ * #-00 is equivalent to #$
+ */
+ case '-':
+ case '0':
+ if (!isdigit(*(p+1)) || !isdigit(*(p+2))) {
+ error("Invalid use of #%cnn format in '%s'\n", *p, *format);
+ exit(EXIT_RUNERR);
+ }
+ i = (*(p+1) - '0') * 10 + *(p+2) - '0';
+ if (i > MAXPARM) {
+ error("Number of parameter (%1d) exceeds max (%1d)\n", i, MAXPARM);
+ exit(EXIT_RUNERR);
+ }
+ if (*p == '-') {
+ j = lastparm ();
+ if (j < i) {
+ error ("Not enough parameters to take difference.\n");
+ exit (EXIT_RUNERR);
+ }
+ i = j - i;
+ } else
+ i--;
+ p += 2;
+ goto do_form;
case '1':
case '2':
case '3':
***************
*** 501,506 ****
--- 549,555 ----
} else {
i = (*p) - '1';
}
+ do_form:
if (*(p+1) == '%') {
p++;
fmtcode = getfmt(fmt,p);
4.) context diff of slice.1
*** slice.1.old Wed Mar 23 18:42:28 1988
--- slice.1 Wed Mar 23 18:40:42 1988
***************
*** 38,45 ****
into one or more output files. The output files are named according
to the \fIformat\fR strings provided. The input file is split
whenever a pattern is matched or every \fIn\fR lines, depending on the
! options selected. Because some of the options are mutually exclusive,
there are three forms of the command.
.LP
Whenever a pattern match is used to slice the file, lines occurring
before the first match are sent to the \fIreject\fR file (which is
--- 38,47 ----
into one or more output files. The output files are named according
to the \fIformat\fR strings provided. The input file is split
whenever a pattern is matched or every \fIn\fR lines, depending on the
! options selected.
! Because some of the options are mutually exclusive,
there are three forms of the command.
+ It is an error to specify a pattern together with options -m, -s or -n.
.LP
Whenever a pattern match is used to slice the file, lines occurring
before the first match are sent to the \fIreject\fR file (which is
***************
*** 111,119 ****
output file produced by the current output format. When an output
format produces the same name twice, a new format is selected and
numbering begins again with the initial value.
! .IP "#\&1, #\&2 ..."
! Parameters of the form #\&1, #\&2, ... #\&9 are replaced by corresponding
tokens drawn from the source line which matched the slice pattern.
For example, if each procedure in a C program began with a comment
line of the following form:
.sp
--- 113,129 ----
output file produced by the current output format. When an output
format produces the same name twice, a new format is selected and
numbering begins again with the initial value.
! .IP "#\&1, #\&2 ..., #\&0nn, #\&-nn"
! Parameters of the form #\&1, #\&2, ... #\&9 or #\&0nn, where 'nn' is
! a 2 digit number are replaced by corresponding
tokens drawn from the source line which matched the slice pattern.
+ If you specify #\&-nn, you can select a parameter relative from
+ the last token on the line. #\&-00 is the last token on the line,
+ #\&-01 the last but one, ...
+ .br
+ Note that it is an error to not specify two digits when using #\&0nn
+ or #\&-nn.
+ .br
For example, if each procedure in a C program began with a comment
line of the following form:
.sp
***************
*** 131,136 ****
--- 141,149 ----
\ \ \ \ \From garyp at cognos Tue Sep 15 15:08:23 EDT 1987
.sp
then "#$" would select "1987", the last token on the line.
+ .br
+ Currently there are 99 addressable tokens on an input line. If a line
+ is split in more tokens, #$ will hold the last one.
.SH FORMAT SPEC's
.LP
Substitution parameters can be followed by an optional
***************
*** 240,245 ****
--- 253,264 ----
generate the correct filenames, either slice has to lookahead to find
the next match line or it has to direct lines for the current slice
into a temporary file until it finds the line matching the pattern.
+ .IP c) 4
+ When you use slice on machines with a filesystem which allows you
+ only a (usually small) amount of characters for filenames (i.e. 14),
+ slice might not detect that it is overwriting a file and/or
+ its diagnostic output is false. Especially filenames generated by the -m
+ option are too long. Just specify a format when slicing a mailbox.
.SH DIAGNOSTICS
``Internal Error'' indicates a bug in \fIslice\fR, and should be reported.
Exit status 1 indicates an error parsing options \- for example, if an unknown
***************
*** 249,254 ****
--- 268,279 ----
be opened.
.LP
If a reject file is not provided, a count of rejected lines is reported.
+ .SH "AUTHOR"
+ Originally written by Russell Quinn as "mailsplit".
+ .sp
+ Revised and extended by Gary Puckering <cognos!garyp>.
+ .sp
+ Extended some more by Michael Greim.
.SH "SEE ALSO"
.I cat (1),
.I ed (1),
5.) the tarmail extracting script
The author, Bernard Sieloff (bs at sbsvax.UUCP), says it could be improved,
but it is already 30% faster than the version using csplit.
#! /bin/sh
# @(#)untarpack 2.1 (UniSB[bs]) 88/03/20
PATH=/usr/ucb:/bin:/usr/bin:/usr/local
if [ $# -lt 1 -o $# -gt 2 ]; then
    echo "Usage: untarpack \"subject-string\"[ your-tarmailbox]"
    exit 1
fi
trap 'echo "untarpack: cancelled"; exit 9' 1 2 3 15
TS=$1;
if [ $# -eq 2 ]; then
    MB=$2
else
    MB=/usr/spool/mail/$USER
fi
if [ ! -s $MB ]; then
    echo "untarpack: no such file: $MB"
    exit 1
fi
rm -f utm.boxfile.???-of-???
echo "starting unpacking now---please wait..."
sed -n -e "/^Subject: $TS - part/,/^---end beef/p" $MB |
    slice "^Subject: $TS - part" 'utm.boxfile.#-02%03d-of-#$%03d'
if [ $? -ne 0 ]; then
    echo "untarpack: slice error"
    exit 2
fi
if [ ! -s utm.boxfile.001-of-??? ]; then
    echo "untarpack: can't find subjects \"$TS\" in file \"$MB\""
    exit 3
fi
FOUND=`ls utm.boxfile.???-of-??? | wc -l`
PACKS=`expr substr utm.boxfile.001-of-??? 20 3`
if [ $FOUND -lt $PACKS ]; then
    FOUND=`expr $FOUND + 0`
    PACKS=`expr $PACKS + 0`
    echo "untarpack: lack of tarmail packets ($FOUND instead of $PACKS)"
    exit 4
elif [ $FOUND -gt $PACKS ]; then
    FOUND=`expr $FOUND + 0`
    PACKS=`expr $PACKS + 0`
    echo "untarpack: packet overrun?!? ($FOUND instead of $PACKS)"
    exit 5
fi
echo '---end beef' > utm.boxfile.000-of-$PACKS
echo -n "Done---do you want to UNTARMAIL the tarmail? [y/n]:"
read answer junk
answer=${answer}x
if expr $answer : '[yY].*x'>/dev/null; then
    echo "OK---UNTARMAILing your tarmail..."
    exec untarmail utm.boxfile.???-of-???
else
    echo 'Use "untarmail utm.boxfile.???-of-???" to reconstruct the TARMAIL'
fi
exit 0
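To see what the format 'utm.boxfile.#-02%03d-of-#$%03d' in the script is meant to produce, here is a small shell sketch of the naming step alone. It does not call slice: boxfile_name is a hypothetical helper, and it assumes the tarmail subject lines end in "part N of M" with plain decimal numbers, which is inferred from the script rather than stated anywhere.

```shell
#!/bin/sh
# boxfile_name SUBJECT-LINE
# Hypothetical helper, not part of slice or untarpack: build the file name
# from a subject line assumed to end in "part N of M".
boxfile_name() {
    set -f                  # no globbing while splitting
    set -- $*               # split the subject line on whitespace
    set +f
    count=$#
    if [ "$count" -lt 3 ]; then
        echo "boxfile_name: subject line too short" >&2
        return 1
    fi
    shift $(( count - 3 ))  # now $1 = N (like #-02), $2 = "of", $3 = M (like #$)
    printf 'utm.boxfile.%03d-of-%03d\n' "$1" "$3"
}

boxfile_name 'Subject: backup - part 3 of 10'    # prints: utm.boxfile.003-of-010
```

In a name built this way the total starts at character 20, which is what the script's "expr substr ... 20 3" relies on when it sets PACKS.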
Absorb, apply and enjoy,
Michael
--
snail-mail : Michael Greim,
Universitaet des Saarlandes, FB 10 - Informatik (Dept. of CS),
Bau 36, Im Stadtwald 15, D-6600 Saarbruecken 11, West Germany
E-mail : greim at sbsvax.UUCP