UNIX SHELL PROG. & ELM QUESTIONS

Chris Lewis clewis at ferret.ocunix.on.ca
Tue May 14 03:07:45 AEST 1991


In article <1991May12.022641.18961 at mp.cs.niu.edu> rickert at mp.cs.niu.edu (Neil Rickert) writes:
>In article <1991May10.064610.25802 at starnet.uucp> moe at starnet.uucp (Moe S.) writes:
>>1. If I have a large number (500+) of messages in a mailbox-format file,
>>   what is the best way to mail a file to every one of the 500+?
>>   (using elm or any other way).

> 500 is probably stretching the capabilities of much software.  Most mailers
>pass the message to the transport (MTA) as arguments, and 500 may exceed
>the max allowed.

> If you are running sendmail as an MTA, the easiest way may be to
>extract the 'From:' lines and make each into a 'Bcc:' line for the
>new message which you then feed into 'sendmail' with the '-t'
>option (which implies that the recipient addresses come from 'To:'
>'Cc:' and 'Bcc:' headers.)  Your message can also include a 'To:'
>with a group name - 'To: multiple_recipients:;' to make sure
>an 'Apparently-To:' is not generated.

You don't have to resort to all this weirdness.  If you're using sendmail,
smail 2.5 or smail 3.1, plus probably many other MTAs, what you really
want to do is prepare a file containing all of the recipients, and use
the "include" mechanism in the MTA's "aliases" file (in sendmail and
smail 2.5 it's /usr/lib/aliases; I don't know about smail 3.1).

For example, for the ferret mailing list I have:

    ferret-list-out	:include:/u/clewis/ferrets/mail-list
			:include:/u/clewis/ferrets/anon-list

The files are in the following format:
	e-mail-address (full name)
The full name is optional.
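
For the original question, the recipients file has to be pulled out of the
mailbox first.  A minimal sketch, assuming an ordinary mbox file in which each
message starts with a "From " separator line and the headers end at the first
blank line.  The function name extract_recipients is made up for this example,
and folded (continued) From: headers are not handled:

```shell
# extract_recipients MBOX
# Print the unique contents of each message's From: header, one per line.
extract_recipients() {
    awk '
        /^From / { in_header = 1; next }    # mbox separator: a new message starts
        /^$/     { in_header = 0; next }    # a blank line ends the header block
        in_header && /^From:/ {
            sub(/^From:[ \t]*/, "")         # drop the header name
            print
        }
    ' "$1" | sort -u
}
```

Something like "extract_recipients mbox > /u/clewis/ferrets/mail-list" would
then seed the list file.  The lines come out in whatever form each sender's
mailer wrote them, so they may still need tidying into the address-first format.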

Then, if you send mail to "ferret-list-out", the MUA (mush/elm/mailx/Mail etc.)
doesn't even know about the alias - none of the mail headers have any of the
names in the subscription list.  The MTA, *not* the MUA, does the expansion in memory,
and parcels out groups of the addresses in the command lines to multiple
invocations of uux or tcpip etc.  Smail 2.5 doesn't appear to have any limit
on the number of recipients other than available memory (it mallocs the entries
into a linked list).
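
The batching is done inside the MTA, but the idea can be illustrated in a few
lines.  This is a rough sketch only: it splits a recipient list into fixed-size
groups, one command line per group, whereas a real MTA groups by destination.
The function name batch_recipients and the "deliver" command are placeholders
invented for the example, and echo just prints what would be run:

```shell
# batch_recipients N
# Read addresses (one per line, no quotes or blanks) on stdin and print one
# hypothetical "deliver" command line per group of at most N addresses.
batch_recipients() {
    xargs -n "$1" echo deliver
}
```

With a 500-line list, "batch_recipients 20 < mail-list" would print 25 command
lines, each well under any argument-count limit.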

In fact, it's often better to invoke the MTA directly rather than using the MUA
to send it, because you have better control over what the headers will
look like.  For example, you want the "From:" line to refer to the list's
logical address, the one people use to send in individual items.  This is a
copy of the shell script I use to send out mailing list items:

    #	Takes one argument - the item to be sent.
    if [ ! -r "$1" ]
    then
	echo "No such article" >&2
	exit 1
    fi
    #	Check that I've not buggered up the numbering scheme
    if [ -r "articles/$1" ] || [ -r "articles/$1.Z" ]
    then
	echo "Article Clash $1" >&2
	exit 1
    fi
    #	Construct headers
    echo "Subject: Issue $1" > /tmp/$$
    echo "From: ferret-list@ferret.ocunix.on.ca (Ferret Mailing List)" >> /tmp/$$
    echo "To: ferret-list@ferret.ocunix.on.ca" >> /tmp/$$
    echo "" >> /tmp/$$
    #	Send it
    cat /tmp/$$ "$1" | smail -R ferret-list-out
    rm -f /tmp/$$
    #	Archive what I just sent
    mv "$1" articles
    compress "articles/$1"

Notice that I construct the Subject:, From: and To: lines myself, tack on
a blank line, and then concatenate the article itself and shove it through
smail directly.  The destination is the command line argument to smail, not
the To: line.
(With sendmail you may have to use an option to inhibit To: line expansion.)
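
The same header-plus-article stream can be built without a temporary file.  A
minimal sketch, assuming the same list addresses as my script; the function name
build_message is invented for the example, and the output is meant to be piped
into whatever MTA you use:

```shell
# build_message ARTICLE
# Write the headers, a blank line, and then the article to stdout.
build_message() {
    cat <<EOF
Subject: Issue $1
From: ferret-list@ferret.ocunix.on.ca (Ferret Mailing List)
To: ferret-list@ferret.ocunix.on.ca

EOF
    cat "$1"
}
```

Usage would be along the lines of "build_message 42 | smail -R ferret-list-out".
Nothing is left behind in /tmp if the script is interrupted.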

(The -R option to smail 2.5 tells it to reroute all of the addresses instead
of trying to send directly, then discovering most of the addresses are not
full paths, and then rerouting.  This is partly an efficiency concern, and
partly that without the -R, smail 2.5 won't multicast unless the addresses
are full bang paths.  Multicast means more than one recipient per uux
invocation.  REAL important with a list of 500 entries!  The ferret list is
about 75, and it multicasts down to 17 individual uux invocations.)
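
To get a feel for what multicasting buys you, here is a rough sketch that groups
already-routed bang-path addresses by their first hop, one output line per
would-be uux invocation.  It only illustrates the grouping; it is not how smail
does it internally, and the function name group_by_first_hop is invented for the
example:

```shell
# group_by_first_hop
# Read bang-path addresses on stdin; print "host: rest rest ..." per first hop.
group_by_first_hop() {
    awk -F'!' '
        NF < 2 { next }                          # not a bang path; skip it
        {
            host = $1
            rest = substr($0, length(host) + 2)  # everything after the first !
            if (!(host in group)) order[++n] = host
            group[host] = group[host] " " rest
        }
        END { for (i = 1; i <= n; i++) print order[i] ":" group[order[i]] }
    '
}
```

Counting the output lines (pipe it through "wc -l") gives the number of
invocations you would end up with for a given list.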

>>2. If I have a file containing some names and email addresses such as
>>   this: 
>>	      xyz@jkjk.jkyu.reyui (John J. Doe)
>>	      Mark L. Lost <apple!mark@eee.dfsjk.jkj>
>>	      Joe!!! jjj@jhdf.434r.er
>>  How can I re-organize the file (using awk, sed, etc...) so that
>>  the email addresses are the first field in every line in the file? 

> Very difficult.  Probably beyond the abilities of awk, sed, etc.  If
>Larry Wall happens to be reading this he may suggest perl.  The trouble is
>that the syntax of RFC822 addresses is quite complex, and as X.400 gateways
>become more common the extreme cases of RFC822 addresses are increasingly
>likely to show up.

Very difficult only if the input is entirely arbitrary and you actually
have to parse the addresses.   However, the first and second addresses are
already in a "standard" form, the first being what you can use directly (at
least in a smail 2.5 alias file).  The second is simple to convert.  Then,
it's a matter of converting all of the other formats into the first one.

The third example is a difficult one to handle, simply because
a simple sed script can't tell which of the two tokens is the actual address:
they both have mailing metacharacters.  So, you make a simplifying
assumption: a token with a "@" is a real email address, and,
alternatively, a token of the form something!something is the real email
address as long as it doesn't contain more than one adjacent !.
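
Spelled out as a test on a single token, the assumption looks roughly like this.
It is a sketch that follows the rule as worded above rather than any particular
regular expression, and the function name is_address is invented for the
example:

```shell
# is_address TOKEN
# Succeed if TOKEN looks like an address under the simplifying assumption.
is_address() {
    case "$1" in
        *@*)        return 0 ;;   # anything with an @ counts as an address
        *"!!"*)     return 1 ;;   # adjacent !s, as in Joe!!!
        "!"*|*"!")  return 1 ;;   # needs something on both sides of the !
        *"!"*)      return 0 ;;   # something!something
        *)          return 1 ;;   # no mail metacharacters at all
    esac
}
```

So "is_address 'Joe!!!'" fails while "is_address 'apple!mark'" succeeds, which
is the distinction the third sample line needs.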

This sed script works for the above sample plus some other forms:

	sed -e '/^\(.*\)<\(.*\)>\(.*\)$/s//\2 \1 \3/' \
	    -e '/^\(.*\)(\(.*\))\(.*\)$/s//\1 \2 \3/' \
	    -e '/^\(.*\) \([^ ][^ ]*@[^ ][^ ]*\)$/s//\2 \1/' \
	    -e '/^\(.*\) \([^! ][^ ]*![^! ][^ ]*\)$/s//\2 \1/' \
	    -e 's/^  *//' \
	    -e 's/  *$//' \
	    -e 's/  */ /g' \
	    -e 's/ / (/' \
	    -e '/ (/s/$/)/'

	1) Convert <> forms to address-first.
	2) Remove the () from addr (name) forms.
	3) Move a trailing something@something token to the beginning.
	4) Move a trailing something!something token to the beginning
	   (don't do this for tokens like Joe!!!).
	5, 6, 7) Strip extraneous blanks.
	8, 9) Put the () back in.

Yes, it would be a bit easier to program in perl, and easier to get fancier.
But not particularly necessary.  You'll probably end up with a few it
didn't parse correctly, but you can fix them manually.
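
To find the ones that need fixing by hand, a quick check over the converted file
helps.  A minimal sketch, assuming the output is in the address-first format
above; the function name find_suspect_lines is invented for the example, and the
test is only the same rough "has an @ or a single !" heuristic:

```shell
# find_suspect_lines [FILE...]
# Print "lineno: line" for lines whose first field does not look like an address.
find_suspect_lines() {
    awk '$1 !~ /[@!]/ || $1 ~ /!!/ { print NR ": " $0 }' "$@"
}
```

With no file arguments it reads standard input, so it can sit at the end of the
sed pipeline.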
-- 
Chris Lewis, Phone: (613) 832-0541, Domain: clewis at ferret.ocunix.on.ca
UUCP: ...!cunews!latour!ecicrl!clewis; Ferret Mailing List:
ferret-request at eci386; Psroff (not Adobe Transcript) enquiries:
psroff-request at eci386 or Canada 416-832-0541.  Psroff 3.0 in c.s.u soon!


