This chapter describes how the various UUCP protocols work, and discusses some other internal UUCP issues.
This chapter is quite technical. You do not need to understand it, or even read it, in order to use Taylor UUCP. It is intended for people who are interested in how UUCP code works.
This chapter is also, unfortunately, somewhat out of date, although I believe that it is incomplete rather than inaccurate. I post this information to the newsgroups `comp.mail.uucp' and `news.answers' each month; if you want to write code based on this information, please get the most recent copy.
Most of the discussion covers the protocols used by all UUCP packages, not just Taylor UUCP. Any information specific to Taylor UUCP is indicated as such. There are some pointers to the actual functions in the Taylor UUCP source code, for those who are extremely interested in actual UUCP implementation.
Modern UUCP packages support grades for each command. The grades generally range from `A' (the highest) to `Z' followed by `a' to `z'. Taylor UUCP also supports `0' to `9' before `A'. Some UUCP packages may permit any ASCII character as a grade.
On Unix, these grades are encoded in the name of the command file. A command file name generally has the form
C.nnnngssss
where nnnn is the remote system name for which the command is queued, g is a single character grade, and ssss is a four character sequence number. For example, a command file created for the system `airs' at grade `Z' might be named
C.airsZ2551
The remote system name will be truncated to seven characters, to ensure that the command file name will fit in the 14 character file name limit of the traditional Unix file system. UUCP packages which have no other means of distinguishing which command files are intended for which systems thus require that every system they connect to have a name that is unique in its first seven characters. Some UUCP packages use a variant of this format which truncates the system name to six characters. HDB uses a different spool directory format, which allows up to fourteen characters to be used for each system name. The Taylor UUCP spool directory format is configurable. The new Taylor spool directory format permits system names to be as long as file names; the maximum length of a file name depends on the particular Unix file system being used.
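As a rough illustration of the traditional naming scheme (this is not the Taylor UUCP code, and the function name is made up), a command file name could be assembled like this, assuming the seven character truncation and a four character sequence string:

```c
#include <stdio.h>

/* Hypothetical helper: build "C.nnnngssss" into out.
   Assumes the system name is truncated to seven characters and the
   sequence number is supplied as a four character string.
   Returns 0 on success, -1 if out is too small. */
int make_command_name(char *out, size_t outlen,
                      const char *sysname, char grade, const char *seq)
{
    /* %.7s copies at most seven characters of the system name. */
    int n = snprintf(out, outlen, "C.%.7s%c%.4s", sysname, grade, seq);
    return (n > 0 && (size_t)n < outlen) ? 0 : -1;
}
```

With a 15 byte buffer, the system `airs', grade `Z' and sequence 2551 give C.airsZ2551, and a longer system name is cut to its first seven characters so that the result still fits in 14 characters.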
The sequence number in the command file name may be a decimal integer, or it may be a hexadecimal integer, or it may contain any alphanumeric character. UUCP packages differ on this point.
Taylor UUCP creates command files in the function zsysdep_spool_commands. The file name is constructed by the function zsfile_name, which knows about all the different types of spool directories supported by Taylor UUCP. The Taylor UUCP sequence number can contain any alphanumeric character; the next sequence number is determined by the function fscmd_seq.
I do not know how command grades are handled in non-Unix UUCP packages.
Modern UUCP packages allow you to restrict file transfer by grade depending on the time of day. Typically this is done with a line in the `Systems' (or `L.sys') file like this:
airs Any/Z,Any2305-0855 ...
This allows only grades `Z' and above to be transferred at any time. Lower grades may only be transferred at night. I believe that this grade restriction applies to local commands as well as to remote commands, but I am not sure. It may only apply if the UUCP package places the call, not if it is called by the remote system. Taylor UUCP can use the timegrade and call-timegrade commands (see section When to Call) to achieve the same effect (and supports the above format when reading `Systems' or `L.sys').
This sort of grade restriction is most useful if you know what grades are being used at the remote site. The default grades used depend on the UUCP package. Generally uucp and uux have different defaults. A particular grade can be specified with the `-g' option to uucp or uux. For example, to request execution of rnews on airs with grade `d', you might use something like
uux -gd - airs!rnews <article
`uunet' queues up mail at grade `Z' and news at grade `d'. The example above would allow mail to be received at any time, but would only permit news to be transferred at night.
This discussion applies only to Unix. I have no idea how UUCP locks ports on other systems.
UUCP creates files to lock serial ports and systems. On most (if not all) systems, these same lock files are also used by cu to coordinate access to serial ports. On some systems getty also uses these lock files.
The lock file normally contains the process ID of the locking process. This makes it easy to determine whether a lock is still valid. The algorithm is to create a temporary file and then link it to the name that must be locked. If the link fails because a file with that name already exists, the existing file is read to get the process ID. If the process still exists, the lock attempt fails. Otherwise the lock file is deleted and the locking algorithm is retried.
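The algorithm above can be sketched in C as follows. This is only an illustration under several assumptions: the function name is made up, the lock contents use the ASCII decimal form, the retry happens only once, and the window between deleting a stale lock and retrying is not protected against other processes.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to create lockpath, using tmppath as a scratch
   file in the same directory.  Returns 0 on success, -1 if a live process
   holds the lock or an error occurs. */
int try_lock(const char *lockpath, const char *tmppath)
{
    int attempt;
    FILE *f = fopen(tmppath, "w");
    if (f == NULL)
        return -1;
    fprintf(f, "%10ld\n", (long)getpid());    /* assumed ASCII pid format */
    fclose(f);

    for (attempt = 0; attempt < 2; attempt++) {
        long pid = 0;
        if (link(tmppath, lockpath) == 0) {   /* fails if lockpath exists */
            unlink(tmppath);
            return 0;
        }
        if (errno != EEXIST)
            break;
        f = fopen(lockpath, "r");
        if (f != NULL) {
            if (fscanf(f, "%ld", &pid) != 1)
                pid = 0;
            fclose(f);
        }
        /* Signal 0 only tests whether the process exists. */
        if (pid > 0 && (kill((pid_t)pid, 0) == 0 || errno == EPERM))
            break;                            /* holder is alive: give up */
        unlink(lockpath);                     /* stale lock: delete, retry */
    }
    unlink(tmppath);
    return -1;
}
```

The link call is what makes the operation atomic: two processes may both write temporary files, but only one link to the lock name can succeed.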
Older UUCP packages put the lock files in the main UUCP spool directory, /usr/spool/uucp. HDB UUCP generally puts the lock files in a directory of their own, usually /usr/spool/locks or /etc/locks.
The original UUCP lock file format encoded the process ID as a four byte binary number. The order of the bytes was host-dependent. HDB UUCP stores the process ID as a ten byte ASCII decimal number, with a trailing newline. For example, if process 1570 holds a lock file, it would contain the eleven characters space, space, space, space, space, space, one, five, seven, zero, newline. Some versions of UUCP add a second line indicating which program created the lock (uucp, cu, or getty). I have also seen a third type of UUCP lock file which did not contain the process ID at all.
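For the HDB style contents, a minimal sketch (the function names are mine, not from any UUCP package) is a right justified ten character decimal field followed by a newline:

```c
#include <stdio.h>

/* Hypothetical helper: write the eleven byte HDB style lock contents for
   pid into buf, which must hold at least 12 bytes (including the
   terminating null).  Returns the number of characters produced. */
int format_hdb_pid(char *buf, size_t buflen, long pid)
{
    return snprintf(buf, buflen, "%10ld\n", pid);
}

/* Parse the contents back; returns -1 if no decimal number is found.
   sscanf skips the leading spaces. */
long parse_hdb_pid(const char *contents)
{
    long pid;
    return sscanf(contents, "%ld", &pid) == 1 ? pid : -1;
}
```

For process 1570 this produces six spaces, the digits 1570, and a newline, matching the example above.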
The name of the lock file is generally "LCK.." followed by the base name of the device. For example, to lock /dev/ttyd0 the file LCK..ttyd0 would be created. There are various exceptions. On SCO Unix, the lock file name is always forced to lower case even if the device name has upper case letters. System V Release 4 UUCP forms the lock file name using the major and minor device numbers rather than the device name (this is pretty sensible if you think about it).
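The general rule, ignoring the SCO and System V Release 4 exceptions, might be sketched like this (the function name is hypothetical):

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: form "LCK.." plus the base name of the device.
   Returns 0 on success, -1 if out is too small. */
int make_lock_name(char *out, size_t outlen, const char *device)
{
    const char *base = strrchr(device, '/');
    int n;

    base = (base != NULL) ? base + 1 : device;
    n = snprintf(out, outlen, "LCK..%s", base);
    return (n > 0 && (size_t)n < outlen) ? 0 : -1;
}
```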
Taylor UUCP can be configured to use various different types of locking. The actual locking code is in the function fsdo_lock.
The UUCP protocol is a conversation between two UUCP packages. A UUCP conversation consists of three parts: an initial handshake, a series of file transfer requests, and a final handshake.
Before the initial handshake, the caller will usually have logged in to the called machine and somehow started the UUCP package there. On Unix this is normally done by setting the shell of the login name used to `uucico'.
All messages in the initial handshake begin with a `^P' (a byte with the octal value \020) and end with a null byte (\000).
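The framing rule can be illustrated with a small C sketch. The function name is made up, and the payload passed in is whatever handshake string is being sent; nothing here is taken from the Taylor UUCP source.

```c
#include <string.h>

/* Hypothetical helper: wrap payload in a leading ^P (octal 020) and a
   trailing null byte.  Returns the total number of bytes written, or 0
   if out is too small. */
size_t frame_handshake(unsigned char *out, size_t outlen, const char *payload)
{
    size_t n = strlen(payload);

    if (outlen < n + 2)
        return 0;
    out[0] = 020;                  /* ^P */
    memcpy(out + 1, payload, n);
    out[n + 1] = 0;                /* terminating null byte */
    return n + 2;
}
```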
Taylor UUCP implements the initial handshake for the calling machine in fdo_call, and for the called machine in faccept_call.
The initial handshake goes as follows. It is begun by the called machine.
max-remote-debug (see section Miscellaneous sys File Commands).
ulimit value of the calling UUCP. The limit is specified as a base 16 number in C notation (e.g., `-U0x1000000'). This number is the number of 512 byte blocks in the largest file which the calling UUCP can create. The called UUCP may not transfer a file larger than this. Supported by System V Release 4 UUCP. Taylor UUCP understands this option, but never generates it.
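As an illustration of how such an option might be decoded (a sketch with a made-up function name, not the Taylor UUCP code), the hexadecimal block count can be converted to a byte limit like this:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: given an option such as "-U0x1000000", return the
   limit in bytes, or 0 if arg is not a usable -U option. */
unsigned long long ulimit_bytes(const char *arg)
{
    char *end;
    unsigned long long blocks;

    if (strncmp(arg, "-U", 2) != 0)
        return 0;
    /* With base 16, strtoull accepts an optional leading "0x". */
    blocks = strtoull(arg + 2, &end, 16);
    if (end == arg + 2)
        return 0;
    return blocks * 512ULL;        /* the value counts 512 byte blocks */
}
```

For `-U0x1000000' the block count is 16777216, which corresponds to a limit of 8589934592 bytes.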
Most UUCP packages will consider each locally supported protocol in turn and select the first one supported by the called UUCP. With some versions of HDB UUCP, this can be modified by giving a list of protocols after the device name in the Devices file or the `Systems' file. Taylor UUCP provides the protocol command which may be used either for a system (see section Protocol Selection) or a port (see section The Port Configuration File).
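The usual selection rule can be sketched as follows, assuming each side's protocols are given as a string of single letter protocol names in order of preference (the function name is hypothetical):

```c
#include <string.h>

/* Hypothetical helper: return the first locally supported protocol that
   also appears in the list offered by the called UUCP, or 0 if the two
   lists have nothing in common. */
char choose_protocol(const char *local, const char *remote)
{
    for (; *local != '\0'; local++)
        if (strchr(remote, *local) != NULL)
            return *local;
    return 0;
}
```

Note that the order of the local list decides the outcome: with a local list of "tgf" and a remote list of "gf", the result is `g'.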
After the protocol has been selected and the initial handshake has been completed, both sides turn on the selected protocol. For some protocols (notably `g') a further handshake is done at this point.
Each protocol supports a method for sending a command to the remote system. This method is used to transmit a series of commands between the two UUCP packages. At all times, one package is the master and the other is the slave. Initially, the calling UUCP is the master.
If a protocol error occurs during the exchange of commands, both sides move immediately to the final handshake.
The master will send one of four commands: `S', `R', `X' or `H'.
Any file name referred to below is either an absolute pathname beginning with `/', a public directory pathname beginning with `~/', a pathname relative to a user's home directory beginning with `~user/', or a spool directory file name. File names in the spool directory are not pathnames, but instead are converted to pathnames within the spool directory by UUCP. They always begin with `C.' (for a command file created by uucp or uux), `D.' (for a data file created by uucp, uux or by an execution, or received from another system for an execution), or `X.' (for an execution file created by uux or received from another system).
Taylor UUCP chooses which request to send next in the function fuucp. This is also where Taylor UUCP processes incoming commands from the remote system.
master: `S from to user -options temp mode notify size'
The `S' and the `-' are literal characters. This is a request by the master to send a file to the slave. Taylor UUCP handles the `S' request in the file `send.c'.
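A receiver has to split such a command into its fields before acting on it. The following sketch simply breaks the line on spaces; it is not how `send.c' does it, the function name is made up, and it assumes that no field contains a space.

```c
#include <string.h>

/* Hypothetical helper: split line in place on spaces, storing pointers
   to at most max fields.  Returns the number of fields stored.  For an
   `S' command the fields would be, in order: "S", from, to, user,
   -options, temp, mode, notify, size. */
int split_command(char *line, char *fields[], int max)
{
    int n = 0;
    char *tok = strtok(line, " ");

    while (tok != NULL && n < max) {
        fields[n++] = tok;
        tok = strtok(NULL, " ");
    }
    return n;
}
```

The same splitting would serve for the other commands, which use fewer fields. A caller should check the returned count, since not every package sends every trailing field.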
The slave then responds with an S command response.
If the slave responds with `SY', a file transfer begins. When the file transfer is complete, the slave sends a `C' command response. Taylor UUCP generates this confirmation in fprecfile_confirm and checks it in fpsendfile_confirm.
After the `C' command response has been received (in the `SY' case) or immediately (in an `SN' case) the master will send another command.
master: `R from to user -options size'
The `R' and the `-' are literal characters. This is a request by the master to receive a file from the slave. I do not know how SVR4 UUCP implements file transfer restart in this case. Taylor UUCP implements the `R' request in the file `rec.c'.
The slave then responds with an `R' command response.
If the slave responds with `RY', a file transfer begins. When the file transfer is complete, the master sends a `C' command. The slave pretty much ignores this, although it may log it. Taylor UUCP sends this confirmation in fprecfile_confirm and checks it in fpsendfile_confirm.
After the `C' command response has been sent (in the `RY' case) or immediately (in an `RN' case) the master will send another command.
master: `X from to user -options'
The `X' and the `-' are literal characters. This is a request by the master to, in essence, execute uucp on the slave. The slave should execute `uucp from to'. Taylor UUCP handles the `X' request in the file `xcmd.c'.
The slave then responds with an X command response.
In either case, the master will then send another command.
master: `H'
This is used by the master to hang up the connection. The slave will respond with an `H' command response.
After the protocol has been shut down, the final handshake is performed. This handshake has no real purpose, and some UUCP packages simply drop the connection rather than do it (in fact, some will drop the connection immediately after both sides agree to hangup, without even closing down the protocol).
That is, the calling UUCP sends six letter O's and the called UUCP replies with seven letter O's. Some UUCP packages always send six O's.
The `g' protocol is a packet based flow controlled error correcting protocol that requires an eight bit clear connection. It is the original UUCP protocol, and is supported by all UUCP implementations. Many implementations of it are only able to support small window and packet sizes, specifically a window size of 3 and a packet size of 64 bytes, but the protocol itself can support up to a window size of 7 and a packet size of 4096 bytes. Complaints about the inefficiency of the `g' protocol generally refer to specific implementations, rather than the correctly implemented protocol.
The `g' protocol was originally designed for general packet drivers, and thus contains some features that are not used by UUCP, including an alternate data channel and the ability to renegotiate packet and window sizes during the communication session.
The `g' protocol is spoofed by many Telebit modems. When spoofing is in effect, each Telebit modem uses the `g' protocol to communicate with the attached computer, but the data between the modems is sent using a Telebit proprietary error correcting protocol. This allows for very high throughput over the Telebit connection, which, because it is half-duplex, would not normally be able to handle the `g' protocol very well at all.
This discussion of the `g' protocol explains how it works, but does not discuss useful error handling techniques. Some discussion of this can be found in Jamie E. Hanrahan's paper (see section Documentation References). A detailed examination of the source code would also be profitable.
The Taylor UUCP code to handle the `g' protocol is in the file `protg.c'. There are a number of functions; the most important ones are fgstart, fgsend_control, fgsenddata, and fgprocess_data.
All `g' protocol communication is done with packets. Each packet begins with a six byte header. Control packets consist only of the header. Data packets contain additional data.
The header is as follows:
The control byte in the header is composed of three bit fields, referred to here as tt (two bits), xxx (three bits) and yyy (three bits). The complete byte is ttxxxyyy, or (tt << 6) + (xxx << 3) + yyy.
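The bit layout translates directly into C. The helper names below are mine and are only meant to illustrate the packing formula:

```c
/* Hypothetical helpers for the ttxxxyyy control byte. */
unsigned char pack_control(unsigned tt, unsigned xxx, unsigned yyy)
{
    return (unsigned char)(((tt & 3u) << 6) + ((xxx & 7u) << 3) + (yyy & 7u));
}

unsigned control_tt(unsigned char c)  { return (c >> 6) & 3u; }
unsigned control_xxx(unsigned char c) { return (c >> 3) & 7u; }
unsigned control_yyy(unsigned char c) { return c & 7u; }
```

For example, tt = 2, xxx = 3 and yyy = 5 pack to 128 + 24 + 5, which is 0x9d.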
The tt field takes on the following values:
l. Let the first byte in the data field be b1. If b1 is less than 128 (if the most significant bit of b1 is 0), then there are l - b1 valid bytes of data in the data field, beginning with the second byte. If b1 >= 128, let b2 be the second byte in the data field. Then there are l - ((b1 & 0x7f) + (b2 << 7)) valid bytes of data in the data field, beginning with the third byte. In all cases l bytes of data are sent (and all data bytes participate in the checksum calculation) but some of the trailing bytes may be dropped by the receiver. The xxx and yyy fields are described below.
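The short packet rule can be sketched as follows, assuming l is the length of the data field in bytes. The function name and the consistency checks are my own additions, not part of the protocol description.

```c
#include <stddef.h>

/* Hypothetical helper: data points at the l byte data field of a short
   data packet.  On success, stores in *offset the index of the first
   valid byte and returns the number of valid bytes.  Returns -1 if the
   length bytes are inconsistent with l. */
long short_packet_valid(const unsigned char *data, size_t l, size_t *offset)
{
    size_t diff;

    if (l < 1)
        return -1;
    if (data[0] < 128) {
        diff = data[0];                    /* one length byte */
        *offset = 1;
    } else {
        if (l < 2)
            return -1;
        diff = (size_t)(data[0] & 0x7f) + ((size_t)data[1] << 7);
        *offset = 2;                       /* two length bytes */
    }
    /* The difference covers at least the length bytes themselves. */
    if (diff < *offset || diff > l)
        return -1;
    return (long)(l - diff);
}
```

With l = 64 and b1 = 10, there are 54 valid bytes starting at the second byte. With l = 4096, b1 = 0x85 and b2 = 2, the difference is 5 + 256 = 261, leaving 3835 valid bytes starting at the third byte.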
In a data packet (short or not) the xxx field gives the sequence number of the packet. Thus sequence numbers can range from 0 to 7, inclusive. The yyy field gives the sequence number of the last correctly received packet.
Each communication direction uses a window which indicates how many unacknowledged packets may be transmitted before waiting for an acknowledgement. The window may range from 1 to 7 packets, and may be different in each direction. For example, if the window is 3 and the last packet acknowledged was packet number 6, packet numbers 7, 0 and 1 may be sent but the sender must wait for an acknowledgement before sending packet number 2. This acknowledgement could come as the yyy field of a data packet or as the yyy field of a RJ or RR control packet (described below).
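Because sequence numbers wrap from 7 back to 0, the window test is easiest to express modulo 8. A minimal sketch, with a made-up function name:

```c
/* Hypothetical helper: nonzero if packet number seq may be sent when
   last_acked is the most recently acknowledged packet number and window
   is the window size (1 to 7). */
int may_send(unsigned seq, unsigned last_acked, unsigned window)
{
    unsigned ahead = (seq - last_acked) & 7u;   /* distance modulo 8 */

    return ahead >= 1 && ahead <= window;
}
```

With a window of 3 and a last acknowledged packet of 6, this accepts packets 7, 0 and 1 and rejects packet 2, as in the example above.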
Each packet must be transmitted in order (the sender may not skip sequence numbers). Each packet must be acknowledged, and each packet must be acknowledged in order.
In a control packet, the xxx field takes on the following values:
CLOSE
When a CLOSE packet is received, a CLOSE packet should be sent in reply and the `g' protocol should halt, causing UUCP to enter the final handshake.
RJ or NAK
SRJ
RR or ACK
INITC
INITB
INITA
To compute the checksum, call the control byte (the fifth byte in the header) c.

The checksum of a control packet is simply 0xaaaa - c.

The checksum of a data packet is 0xaaaa - (check ^ c) (^ denotes exclusive or, as in C), and check is the result of the following routine run on the contents of the data field (every byte in the data field participates in the checksum, even for a short data packet). Below is the routine used by Taylor UUCP; it is a slightly modified version of a routine which John Gilmore patched from G.L. Chesson's original paper. The z argument points to the data and the c argument indicates how much data there is.
    int
    igchecksum (z, c)
         register const char *z;
         register int c;
    {
      register unsigned int ichk1, ichk2;

      ichk1 = 0xffff;
      ichk2 = 0;

      do
        {
          register unsigned int b;

          /* Rotate ichk1 left.  */
          if ((ichk1 & 0x8000) == 0)
            ichk1 <<= 1;
          else
            {
              ichk1 <<= 1;
              ++ichk1;
            }

          /* Add the next character to ichk1.  */
          b = *z++ & 0xff;
          ichk1 += b;

          /* Add ichk1 xor the character position in the buffer counting from
             the back to ichk2.  */
          ichk2 += ichk1 ^ c;

          /* If the character was zero, or adding it to ichk1 caused an
             overflow, xor ichk2 to ichk1.  */
          if (b == 0 || (ichk1 & 0xffff) < b)
            ichk1 ^= ichk2;
        }
      while (--c > 0);

      return ichk1 & 0xffff;
    }
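The two packet checksum formulas can themselves be written as small C functions. This is only a sketch: the function names are mine, check is assumed to be the value returned by the routine above, and the results are masked to 16 bits on the assumption that the checksum is a 16 bit quantity.

```c
/* Hypothetical helpers for the packet checksum formulas.
   c is the control byte of the packet. */
unsigned control_checksum(unsigned c)
{
    return (0xaaaau - (c & 0xffu)) & 0xffffu;
}

/* check is the 16 bit checksum computed over the data field. */
unsigned data_checksum(unsigned check, unsigned c)
{
    return (0xaaaau - ((check & 0xffffu) ^ (c & 0xffu))) & 0xffffu;
}
```

For instance, a control byte of 0x09 gives a control packet checksum of 0xaaa1, and a data checksum of 0x1234 with a control byte of 0x80 gives 0xaaaa - 0x12b4, which is 0x97f6.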
When the `g' protocol is started, the calling UUCP sends an INITA control packet with the window size it wishes the called UUCP to use. The called UUCP responds with an INITA packet with the window size it wishes the calling UUCP to use. Pairs of INITB and INITC packets are then similarly exchanged. When these exchanges are completed, the protocol is considered to have been started. The window size is sent twice, with both the INITA and the INITC packets.
When a UUCP package transmits a command, it sends one or more data packets. All the data packets will normally be complete, although some UUCP packages may send the last one as a short packet. The command string is sent with a trailing null byte, to let the receiving package know when the command is finished. Some UUCP packages require the last byte of the last packet sent to be null, even if the command ends earlier in the packet. Some packages may require all the trailing bytes in the last packet to be null, but I have not confirmed this.
When a UUCP package sends a file, it will send a sequence of data packets. The end of the file is signalled by a short data packet containing zero valid bytes (it will normally be preceded by a short data packet containing the last few bytes in the file).
Note that the sequence numbers cover the entire communication session, including both command and file data.
When the protocol is shut down, each UUCP package sends a CLOSE control packet.
The `f' protocol is a seven bit protocol which checksums an entire file at a time. It only uses the characters between \040 and \176 (ASCII space and `~') inclusive as well as the carriage return character. It can be very efficient for transferring text only data, but it is very inefficient at transferring eight bit data (such as compressed news). It is not flow controlled, and the checksum is fairly insecure over large files, so using it over a serial connection requires handshaking (XON/XOFF can be used) and error correcting modems. Some people think it should not be used even under those circumstances.
I believe the `f' protocol originated in BSD versions of UUCP. It was originally intended for transmission over X.25 PAD links.
The Taylor UUCP code for the `f' protocol is in `protf.c'.
The `f' protocol has no startup or finish protocol. However, both sides typically sleep for a couple of seconds before starting up, because they switch the terminal into XON/XOFF mode and want to allow the changes to settle before beginning transmission.
When a UUCP package transmits a command, it simply sends a string terminated by a carriage return.
When a UUCP package transmits a file, each byte b of the file is translated according to the following table:
       0 <= b <=  037: 0172, b + 0100 (0100 to 0137)
     040 <= b <= 0171:       b        ( 040 to 0171)
    0172 <= b <= 0177: 0173, b - 0100 ( 072 to  077)
    0200 <= b <= 0237: 0174, b - 0100 (0100 to 0137)
    0240 <= b <= 0371: 0175, b - 0200 ( 040 to 0171)
    0372 <= b <= 0377: 0176, b - 0300 ( 072 to  077)
That is, a byte between \040 and \171 inclusive is transmitted as is, and all other bytes are prefixed and modified as shown.
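The table maps onto a chain of range tests. The following is an illustrative sketch with a made-up function name, not the code in `protf.c':

```c
#include <stddef.h>

/* Hypothetical helper: encode one file byte according to the table.
   Writes one or two bytes to out and returns how many were written. */
size_t f_encode_byte(unsigned char b, unsigned char out[2])
{
    if (b <= 037)  { out[0] = 0172; out[1] = (unsigned char)(b + 0100); return 2; }
    if (b <= 0171) { out[0] = b; return 1; }
    if (b <= 0177) { out[0] = 0173; out[1] = (unsigned char)(b - 0100); return 2; }
    if (b <= 0237) { out[0] = 0174; out[1] = (unsigned char)(b - 0100); return 2; }
    if (b <= 0371) { out[0] = 0175; out[1] = (unsigned char)(b - 0200); return 2; }
    out[0] = 0176;
    out[1] = (unsigned char)(b - 0300);
    return 2;
}
```

Every output byte lies between \040 and \176, which is why the protocol can run over a seven bit link, and also why eight bit data can grow to nearly twice its size.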
When all the file data is sent, a seven byte sequence is sent: two bytes of \176 followed by four ASCII bytes of the checksum as printed in base 16 followed by a carriage return. For example, if the checksum was 0x1234, this would be sent: "\176\1761234\r".
The checksum is initialized to 0xffff. For each byte that is sent it is modified as follows (where b is the byte before it has been transformed as described above):
    /* Rotate the checksum left.  */
    if ((ichk & 0x8000) == 0)
      ichk <<= 1;
    else
      {
        ichk <<= 1;
        ++ichk;
      }

    /* Add the next byte into the checksum.  */
    ichk += b;
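Putting the update step into a loop gives a whole file checksum, and the result can then be formatted into the seven byte trailer. This is a sketch with made-up function names; it masks to 16 bits at every step, which yields the same low 16 bits as the fragment above, and it assumes lower case hexadecimal digits, which the description does not specify.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: checksum of len untransformed file bytes. */
unsigned f_checksum(const unsigned char *data, size_t len)
{
    unsigned ichk = 0xffff;
    size_t i;

    for (i = 0; i < len; i++) {
        /* Rotate left within 16 bits, then add the next byte. */
        ichk = ((ichk << 1) | (ichk >> 15)) & 0xffffu;
        ichk = (ichk + data[i]) & 0xffffu;
    }
    return ichk;
}

/* Hypothetical helper: write the seven byte trailer, plus a terminating
   null, into out.  Lower case hex digits are an assumption. */
void f_trailer(unsigned ichk, char out[8])
{
    snprintf(out, 8, "\176\176%04x\r", ichk & 0xffffu);
}
```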
When the receiving UUCP sees the checksum, it compares it against its own calculated checksum and replies with a single character followed by a carriage return.
The sending UUCP checks the returned character and acts accordingly.
The `t' protocol is intended for TCP links. It does no error checking or flow control, and requires an eight bit clear channel.
I believe the `t' protocol originated in BSD versions of UUCP.
The Taylor UUCP code for the `t' protocol is in `prott.c'.
When a UUCP package transmits a command, it first gets the length of the command string, c. It then sends ((c / 512) + 1) * 512 bytes (the smallest multiple of 512 which can hold c bytes plus a null byte) consisting of the command string itself followed by trailing null bytes.
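The padding rule is easy to get wrong at the boundaries, so here is the arithmetic as a small C sketch (the function names are hypothetical):

```c
#include <string.h>

/* Hypothetical helper: number of bytes sent for a command of length c. */
size_t t_command_length(size_t c)
{
    return ((c / 512) + 1) * 512;
}

/* Hypothetical helper: copy cmd into out and pad with null bytes.
   Returns the number of bytes to send, or 0 if out is too small. */
size_t t_pad_command(const char *cmd, unsigned char *out, size_t outlen)
{
    size_t c = strlen(cmd);
    size_t total = t_command_length(c);

    if (outlen < total)
        return 0;
    memset(out, 0, total);
    memcpy(out, cmd, c);
    return total;
}
```

A command of 511 characters fits in 512 bytes, but a command of exactly 512 characters needs 1024 bytes, because there must be room for at least one null byte.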
When a UUCP package sends a file, it sends it in blocks. Each block contains at most 1024 bytes of data. Each block consists of four bytes containing the amount of data in binary (most significant byte first, the same format as used by the Unix function htonl) followed by that amount of data. The end of the file is signalled by a block containing zero bytes of data.
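The four byte length can be produced without relying on htonl by shifting explicitly. A minimal sketch, with made-up function names:

```c
/* Hypothetical helper: store n as four bytes, most significant first. */
void t_block_header(unsigned long n, unsigned char hdr[4])
{
    hdr[0] = (unsigned char)((n >> 24) & 0xff);
    hdr[1] = (unsigned char)((n >> 16) & 0xff);
    hdr[2] = (unsigned char)((n >> 8) & 0xff);
    hdr[3] = (unsigned char)(n & 0xff);
}

/* Hypothetical helper: read the length back from a received header. */
unsigned long t_block_length(const unsigned char hdr[4])
{
    return ((unsigned long)hdr[0] << 24) | ((unsigned long)hdr[1] << 16)
         | ((unsigned long)hdr[2] << 8) | (unsigned long)hdr[3];
}
```

A full block of 1024 bytes therefore starts with the bytes 0, 0, 4, 0, and the end of file block consists of four zero bytes and nothing else.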
The `e' protocol is similar to the `t' protocol. It does no flow control or error checking and is intended for use over TCP.
The `e' protocol originated in versions of HDB UUCP.
The Taylor UUCP code for the `e' protocol is in `prote.c'.
When a UUCP package transmits a command, it simply sends the command as an ASCII string terminated by a null byte.
When a UUCP package transmits a file, it sends the complete size of the file as an ASCII decimal number. The ASCII string is padded out to 20 bytes with null bytes (i.e., if the file is 1000 bytes long, it sends `1000\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0'). It then sends the entire file.
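The 20 byte size field might be built as follows. This is a sketch with a made-up function name; it conservatively refuses sizes that would leave no room for a null byte, which is my own choice rather than something the protocol description states.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fill hdr with the ASCII decimal size padded out
   to 20 bytes with null bytes.  Returns 0 on success, -1 if the number
   needs 20 or more digits. */
int e_size_header(unsigned long size, char hdr[20])
{
    char digits[32];
    int n = snprintf(digits, sizeof digits, "%lu", size);

    if (n < 0 || n >= 20)
        return -1;
    memset(hdr, 0, 20);
    memcpy(hdr, digits, (size_t)n);
    return 0;
}
```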
I believe that the `x' protocol was intended for use over X.25 virtual circuits. It relies on a write of zero bytes being read as zero bytes without stopping communication. I have heard that it does not work correctly. If someone would care to fill this in more, I would be grateful. Taylor UUCP does not implement the `x' protocol.
This is apparently used for DataKit connections, and relies on a write of zero bytes being read as zero bytes, much as the `x' protocol does. I don't really know anything else about it. Taylor UUCP does not implement the `d' protocol.
The `G' protocol is apparently simply the `g' protocol, except that it is known to support all possible window and packet sizes. It was introduced by SVR4 UUCP; the SVR4 implementation of the `g' protocol is apparently fixed at a packet size of 64 and a window size of 7. Taylor UUCP does not recognize the `G' protocol. It does support all window and packet sizes for the `g' protocol.
I took a lot of the information from Jamie E. Hanrahan's paper in the Fall 1990 DECUS Symposium, and from Managing UUCP and Usenet by Tim O'Reilly and Grace Todino (with contributions by several other people). The latter includes most of the former, and is published by O'Reilly & Associates, Inc.
Some information is originally due to a Usenet article by Chuck Wegrzyn. The information on the `g' protocol comes partially from a paper by G.L. Chesson of Bell Laboratories, partially from Jamie E. Hanrahan's paper, and partially from source code by John Gilmore. The information on the `f' protocol comes from the source code by Piet Beertema. The information on the `t' protocol comes from the source code by Rick Adams. The information on the `e' protocol comes from a Usenet article by Matthias Urlichs.