It is my hope that other people will figure out smart stuff that Gnus can do, and that other people will write those smart things as well. To facilitate that I thought it would be a good idea to describe the inner workings of Gnus. And some of the not-so-inner workings, while I'm at it.
You can never expect the internals of a program not to change, but I will be defining (in some detail) the interface between Gnus and its backends (this is written in stone), the format of the score files (ditto), data structures (some are less likely to change than others) and the general method of operation.
Gnus doesn't know anything about nntp, spools, mail or virtual groups. It only knows how to talk to virtual servers. A virtual server is a backend and some backend variables. As examples of the first, we have nntp, nnspool and nnmbox. As examples of the latter we have nntp-port-number and nnmbox-directory.
When Gnus asks for information from a backend -- say nntp -- on something, it will normally include a virtual server name in the function parameters. (If not, the backend should use the "current" virtual server.) For instance, nntp-request-list takes a virtual server as its only (optional) parameter. If this virtual server hasn't been opened, the function should fail.
Note that a virtual server name has no relation to some physical server name. Take this example:
(nntp "odd-one" (nntp-address "ifi.uio.no") (nntp-port-number 4324))
Here the virtual server name is `"odd-one"' while the name of the physical server is `"ifi.uio.no"'.
The backends should be able to switch between several virtual servers. The standard backends implement this by keeping an alist of virtual server environments that they pull down/push up when needed.
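The bookkeeping described above can be pictured roughly as follows. This is only a sketch in Python (Gnus itself is Emacs Lisp); the class `ToyBackend`, its method names, the default port value and the address `news.example.org` are hypothetical and not part of Gnus.

```python
class ToyBackend:
    """Hypothetical backend keeping one variable environment per virtual server."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)   # backend variables with default values
        self.environments = {}           # virtual server name -> variable values
        self.current = None              # name of the "current" virtual server

    def open_server(self, name, definitions=()):
        # DEFINITIONS is a sequence of (variable, value) pairs.
        env = dict(self.defaults)
        env.update(definitions)
        self.environments[name] = env
        self.current = name
        return True

    def variable(self, var, server=None):
        # Fall back to the current virtual server when none is given.
        name = server if server is not None else self.current
        if name not in self.environments:
            return None                  # server not opened: fail quietly
        self.current = name              # switch environments
        return self.environments[name][var]


# The default port below is an assumption made for the example.
backend = ToyBackend({"nntp-address": None, "nntp-port-number": 119})
backend.open_server("odd-one", [("nntp-address", "ifi.uio.no"),
                                ("nntp-port-number", 4324)])
backend.open_server("plain", [("nntp-address", "news.example.org")])
```

Asking for a variable under the name "odd-one" yields the values from that virtual server's definition, regardless of which server was used last.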
There are two groups of interface functions: required functions, which must be present, and optional functions, which Gnus will always check for before attempting to call them.
All these functions are expected to return data in the buffer nntp-server-buffer (`" *nntpd*"'), which is somewhat unfortunately named, but we'll have to live with it. When I talk about "resulting data", I always refer to the data in that buffer. When I talk about "return value", I talk about the function value returned by the function call.
Some backends could be said to be server-forming backends, and some might be said to not be. The latter are backends that generally only operate on one group at a time, and have no concept of "server" -- they have a group, and they deliver info on that group and nothing more.
In the examples and definitions I will refer to the imaginary backend nnchoke.
(nnchoke-retrieve-headers ARTICLES &optional GROUP SERVER)
Message-IDs. Current backends do not fully support either - only sequences (lists) of article numbers, and most backends do not support retrieval of Message-IDs. But they should try for both.
The result data should either be HEADs or NOV lines, and the result value should either be headers or nov to reflect this. This might later be expanded to various, which will be a mixture of HEADs and NOV lines, but this is currently not supported by Gnus.
Here's an example HEAD:
221 1056 Article retrieved.
Path: ifi.uio.no!sturles
From: sturles@ifi.uio.no (Sturle Sunde)
Newsgroups: ifi.discussion
Subject: Re: Something very droll
Date: 27 Oct 1994 14:02:57 +0100
Organization: Dept. of Informatics, University of Oslo, Norway
Lines: 26
Message-ID: <38o8e1$a0o@holmenkollen.ifi.uio.no>
References: <38jdmq$4qu@visbur.ifi.uio.no>
NNTP-Posting-Host: holmenkollen.ifi.uio.no
.
So a headers return value would imply that there are a number of these in the data buffer.
Here's a BNF definition of such a buffer:
headers       = *head
head          = error-message / valid-head
error-message = [ "4" / "5" ] 2number " " <error message> eol
valid-head    = valid-message *header "." eol
valid-message = "221 " <number> " Article retrieved." eol
header        = <text> eol
If the return value is nov, the data buffer should contain network overview database lines. These are basically fields separated by tabs.
nov-buffer = *nov-line
nov-line   = 8*9 [ field <TAB> ] eol
field      = <text except TAB>
For a closer explanation of what should be in those fields, see section Headers.
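As an illustration of the NOV layout, here is a minimal Python sketch that splits one line into named fields. It assumes that the fields appear in the same order as the header slots listed in the Headers section (number, subject, from, date, id, references, chars, lines, xref); the function name `parse_nov_line` is hypothetical, and the sample line is built from the example HEAD with the character count left empty.

```python
FIELDS = ("number", "subject", "from", "date", "id",
          "references", "chars", "lines", "xref")


def parse_nov_line(line):
    """Split one NOV line into a dict keyed by the (assumed) slot names."""
    parts = line.rstrip("\r\n").split("\t")
    if len(parts) < 8:
        raise ValueError("a NOV line needs at least 8 tab-separated fields")
    # The ninth field may be absent; pad so every slot has a value.
    parts += [""] * (len(FIELDS) - len(parts))
    record = dict(zip(FIELDS, parts))
    for key in ("number", "chars", "lines"):
        # Empty numeric fields are treated as 0 in this sketch.
        record[key] = int(record[key]) if record[key] else 0
    return record


sample = "\t".join([
    "1056", "Re: Something very droll",
    "sturles@ifi.uio.no (Sturle Sunde)",
    "27 Oct 1994 14:02:57 +0100",
    "<38o8e1$a0o@holmenkollen.ifi.uio.no>",
    "<38jdmq$4qu@visbur.ifi.uio.no>",
    "", "26",
]) + "\n"
record = parse_nov_line(sample)
```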
(nnchoke-open-server SERVER &optional DEFINITIONS)
(VARIABLE VALUE) pairs that define this virtual server.
If the server can't be opened, no error should be signaled. The backend
may then choose to refuse further attempts at connecting to this
server. In fact, it should do so.
If the server is opened already, this function should return a non-nil value. There should be no data returned.
(nnchoke-close-server &optional SERVER)
(nnchoke-request-close)
nntp-server-buffer, though.)
There should be no data returned.
(nnchoke-server-opened &optional SERVER)
(nnchoke-status-message &optional SERVER)
(nnchoke-request-article ARTICLE &optional GROUP SERVER TO-BUFFER)
Message-ID or a number.
It is optional whether to implement retrieval by Message-ID, but it would be nice if that were possible.
If to-buffer is non-nil, the result data should be returned in this buffer instead of the normal data buffer. This is to make it possible to avoid copying large amounts of data from one buffer to another, and Gnus mainly requests articles to be inserted directly into its article buffer.
(nnchoke-open-group GROUP &optional SERVER)
(nnchoke-request-group GROUP &optional SERVER)
211 56 1000 1059 ifi.discussion
The first number is the status, which should be `211'. Next is the total number of articles in the group, the lowest article number, the highest article number, and finally the group name. Note that the total number of articles may be less than one might think while just considering the highest and lowest article numbers, because some articles may have been cancelled. Gnus just discards the total-number, so whether one should take the bother to generate it properly (if that is a problem) is left as an exercise to the reader.
group-status = [ error / info ] eol
error        = [ "4" / "5" ] 2<number> " " <Error message>
info         = "211 " 3* [ <number> " " ] <string>
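A group status line of this shape is easy to pick apart. The following Python sketch is illustrative only; the function name `parse_group_status` and the error text in the usage example are hypothetical.

```python
def parse_group_status(line):
    """Return (total, lowest, highest, group) for a 211 line, None for an error line."""
    line = line.rstrip("\r\n")
    if line[:1] in ("4", "5"):
        return None                      # error status: nothing to report
    parts = line.split(" ", 4)
    if len(parts) != 5 or parts[0] != "211":
        raise ValueError("not a group status line: %r" % line)
    total, lowest, highest = (int(n) for n in parts[1:4])
    return total, lowest, highest, parts[4]


status = parse_group_status("211 56 1000 1059 ifi.discussion\n")
failed = parse_group_status("411 no such group\n")
```

Here `status` holds the three numbers and the group name, while `failed` is `None`. Since Gnus discards the total, a caller would typically use only the lowest and highest article numbers.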
(nnchoke-close-group GROUP &optional SERVER)
(nnchoke-request-list &optional SERVER)
ifi.test 0000002200 0000002000 y
ifi.discussion 3324 3300 n
On each line we have a group name, then the highest article number in that group, the lowest article number, and finally a flag.
active-file = *active-line
active-line = name " " <number> " " <number> " " flags eol
name        = <string>
flags       = "n" / "y" / "m" / "x" / "j" / "=" name
The flag says whether the group is read-only (`n'), is moderated (`m'), is dead (`x'), is aliased to some other group (`=other-group') or none of the above (`y').
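To make the active file layout concrete, here is a small Python sketch that reads the two example lines. The function name `parse_active_line` and the dictionary keys are hypothetical choices for the illustration.

```python
def parse_active_line(line):
    """Split one active line into name, highest, lowest and flags."""
    name, highest, lowest, flags = line.split()
    return {
        "name": name,
        "highest": int(highest),         # leading zeros are harmless here
        "lowest": int(lowest),
        "flags": flags,
        # An "=other-group" flag names the group this one is aliased to.
        "alias": flags[1:] if flags.startswith("=") else None,
    }


active = """ifi.test 0000002200 0000002000 y
ifi.discussion 3324 3300 n
"""
groups = [parse_active_line(l) for l in active.splitlines() if l.strip()]
```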
(nnchoke-request-post &optional SERVER)
(nnchoke-request-post-buffer POST GROUP SUBJECT HEADER ARTICLE-BUFFER INFO FOLLOW-TO RESPECT-POSTER)
nnchoke-request-post. If post is non-nil, this is not a followup, but a totally new article. group is the name of the group to be posted to. subject is the subject of the message. article-buffer is the buffer being followed up, if that is the case. info is the group info. follow-to is the group that one is supposed to re-direct the article to. If respect-poster is non-nil, the special `"poster"' value of a Followup-To header is to be respected.
There should be no result data returned.
(nnchoke-retrieve-groups GROUPS &optional SERVER)
active or group, which says what the format of the result data is. The former is in the same format as the data from nnchoke-request-list, while the latter is a buffer full of lines in the same format as nnchoke-request-group gives.
group-buffer = *active-line / *group-status
(nnchoke-request-update-info GROUP INFO &optional SERVER)
(nnchoke-request-scan &optional GROUP SERVER)
(nnchoke-request-asynchronous GROUP &optional SERVER ARTICLES)
(nnchoke-request-group-description GROUP &optional SERVER)
description-line = name <TAB> description eol
name             = <string>
description      = <text>
(nnchoke-request-list-newsgroups &optional SERVER)
description-buffer = *description-line
(nnchoke-request-newgroups DATE &optional SERVER)
(nnchoke-request-create-groups GROUP &optional SERVER)
(nnchoke-request-expire-articles ARTICLES &optional GROUP SERVER FORCE)
nil, all articles should be deleted, no matter how new they are.
This function should return a list of the articles that it did not, or could not, delete.
There should be no result data returned.
(nnchoke-request-move-article ARTICLE GROUP SERVER ACCEPT-FORM
eval accept-form in the buffer where the "tidy" article is. This will do the actual copying. If this eval returns a non-nil value, the article should be removed.
If last is nil, that means that there is a high likelihood that there will be more requests issued shortly, so that allows some optimizations.
There should be no data returned.
(nnchoke-request-accept-article GROUP &optional LAST)
nil, that means that there will be more calls to this function in short order.
There should be no data returned.
(nnchoke-request-replace-article ARTICLE GROUP BUFFER)
Score files are meant to be easily parsable, yet extremely malleable. It was decided that something that had the same read syntax as an Emacs Lisp list would fit that spec.
Here's a typical score file:
(("summary"
  ("win95" -10000 nil s)
  ("Gnus"))
 ("from"
  ("Lars" -1000))
 (mark -100))
BNF definition of a score file:
score-file       = "" / "(" *element ")"
element          = rule / atom
rule             = string-rule / number-rule / date-rule
string-rule      = "(" quote string-header quote space *string-match ")"
number-rule      = "(" quote number-header quote space *number-match ")"
date-rule        = "(" quote date-header quote space *date-match ")"
quote            = <ascii 34>
string-header    = "subject" / "from" / "references" / "message-id" /
                   "xref" / "body" / "head" / "all" / "followup"
number-header    = "lines" / "chars"
date-header      = "date"
string-match     = "(" quote <string> quote [ "" / [ space score [ "" /
                   space date [ "" / [ space string-match-t ] ] ] ] ] ")"
score            = "nil" / <integer>
date             = "nil" / <natural number>
string-match-t   = "nil" / "s" / "substring" / "S" / "Substring" /
                   "r" / "regex" / "R" / "Regex" /
                   "e" / "exact" / "E" / "Exact" /
                   "f" / "fuzzy" / "F" / "Fuzzy"
number-match     = "(" <integer> [ "" / [ space score [ "" /
                   space date [ "" / [ space number-match-t ] ] ] ] ] ")"
number-match-t   = "nil" / "=" / "<" / ">" / ">=" / "<="
date-match       = "(" quote <string> quote [ "" / [ space score [ "" /
                   space date [ "" / [ space date-match-t ] ] ] ] ] ")"
date-match-t     = "nil" / "at" / "before" / "after"
atom             = "(" [ required-atom / optional-atom ] ")"
required-atom    = mark / expunge / mark-and-expunge / files /
                   exclude-files / read-only / touched
optional-atom    = adapt / local / eval
mark             = "mark" space nil-or-number
nil-or-number    = "nil" / <integer>
expunge          = "expunge" space nil-or-number
mark-and-expunge = "mark-and-expunge" space nil-or-number
files            = "files" *[ space <string> ]
exclude-files    = "exclude-files" *[ space <string> ]
read-only        = "read-only" [ space "nil" / space "t" ]
adapt            = "adapt" [ space "nil" / space "t" / space adapt-rule ]
adapt-rule       = "(" *[ <string> *[ "(" <string> <integer> ")" ] ] ")"
local            = "local" *[ space "(" <string> space <form> ")" ]
eval             = "eval" space <form>
space            = *[ " " / <TAB> / <NEWLINE> ]
Any unrecognized elements in a score file should be ignored, but not discarded.
As you can see, white space is needed, but the type and amount of white space is irrelevant. This means that formatting of the score file is left up to the programmer -- if it's simpler to just spew it all out on one looong line, then that's ok.
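Because the layout is free, a program that writes score files can be very small. The following Python sketch prints a nested list as one long line in Lisp read syntax; the names `Sym` and `to_sexp` are hypothetical, and the sketch only handles lists, strings, bare symbols and integers.

```python
class Sym(str):
    """A bare Lisp symbol such as nil, s or mark (as opposed to a quoted string)."""


def to_sexp(obj):
    """Serialize nested lists, strings, symbols and integers on one line."""
    if isinstance(obj, (list, tuple)):
        return "(" + " ".join(to_sexp(x) for x in obj) + ")"
    if isinstance(obj, Sym):
        return str(obj)                  # symbols are written without quotes
    if isinstance(obj, str):
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return str(obj)                      # integers


score = [
    ["summary", ["win95", -10000, Sym("nil"), Sym("s")], ["Gnus"]],
    ["from", ["Lars", -1000]],
    [Sym("mark"), -100],
]
text = to_sexp(score)
```

The resulting `text` is the typical score file shown earlier, written on a single line.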
The meaning of the various atoms is explained elsewhere in this manual.
Internally, Gnus uses a format for storing article headers that corresponds to the NOV format in a mysterious fashion. One could almost suspect that the author looked at the NOV specification and just shamelessly stole the entire thing, and one would be right.
Header is a severely overloaded term. "Header" is used in RFC1036 to talk about lines in the head of an article (e.g., From). It is used by many people as a synonym for "head" -- "the header and the body". (That should be avoided, in my opinion.) And Gnus uses a format internally that it calls "header", which is what I'm talking about here. This is a 9-element vector, basically, with each header (ouch) having one slot.
These slots are, in order: number, subject, from, date, id, references, chars, lines, xref. There are macros for accessing and setting these slots -- they all have predictable names beginning with mail-header- and mail-header-set-, respectively.
The xref slot is really a misc slot. Any extra info will be put in there.
GNUS introduced a concept that I found so useful that I've started using it a lot and have elaborated on it greatly.
The question is simple: If you have a large number of objects that are identified by numbers (say, articles, to take a wild example) that you want to classify as being "included", a normal sequence isn't very useful. (A 200,000 length sequence is a bit long-winded.)
The solution is as simple as the question: You just collapse the sequence.
(1 2 3 4 5 6 10 11 12)
is transformed into
((1 . 6) (10 . 12))
To avoid having those nasty `(13 . 13)' elements to denote a lonesome object, a `13' is a valid element:
((1 . 6) 7 (10 . 12))
This means that comparing two ranges to find out whether they are equal is slightly tricky:
((1 . 5) 7 8 (10 . 12))
and
((1 . 5) (7 . 8) (10 . 12))
are equal. In fact, any non-descending list is a range:
(1 2 3 4 5)
is a perfectly valid range, although a pretty longwinded one. This is also legal:
(1 . 5)
and is equal to the previous range.
Here's a BNF definition of ranges. Of course, one must remember the semantic requirement that the numbers are non-descending. (Any number of repetitions of the same number is allowed, but apt to disappear in range handling.)
range        = simple-range / normal-range
simple-range = "(" number " . " number ")"
normal-range = "(" contents ")"
contents     = "" / simple-range *[ " " contents ] / number *[ " " contents ]
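The collapsing and the equality test can be sketched in a few lines of Python. This is an illustration, not Gnus code: a Lisp pair such as (1 . 6) is modelled as the tuple `(1, 6)`, a range as a Python list, and the function names `compress`, `expand` and `ranges_equal` are hypothetical.

```python
def compress(numbers):
    """Collapse a non-descending sequence of integers into range form."""
    runs = []
    for n in numbers:
        if runs and n <= runs[-1][1] + 1:
            runs[-1][1] = max(runs[-1][1], n)   # extend the current run
        else:
            runs.append([n, n])                 # start a new run
    # A run of length one is written as a plain number.
    return [lo if lo == hi else (lo, hi) for lo, hi in runs]


def expand(rng):
    """Turn a range back into a sorted list without duplicates."""
    if isinstance(rng, tuple):                  # a lone pair is a range too
        rng = [rng]
    members = set()
    for element in rng:
        lo, hi = element if isinstance(element, tuple) else (element, element)
        members.update(range(lo, hi + 1))
    return sorted(members)


def ranges_equal(a, b):
    """Two ranges are equal when they denote the same set of numbers."""
    return expand(a) == expand(b)


collapsed = compress([1, 2, 3, 4, 5, 6, 10, 11, 12])
```

Expanding both operands is the simplest way to compare them, but it gives up the space savings that ranges exist for; a serious implementation would compare the collapsed forms directly.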
Gnus currently uses ranges to keep track of read articles and article marks. I plan on implementing a number of range operators in C if The Powers That Be are willing to let me. (I haven't asked yet, because I need to do some more thinking on what operators I need to make life totally range-based without ever having to convert back to normal sequences.)
Gnus stores all permanent info on groups in a group info list. This list is from three to six elements (or more) long and exhaustively describes the group.
Here are two example group infos; one is a very simple group while the second is a more complex one:
("no.group" 5 (1 . 54324))

("nnml:my.mail" 3 ((1 . 5) 9 (20 . 55))
                ((tick (15 . 19)) (replied 3 6 (19 . 23)))
                (nnml "")
                (auto-expire (to-address "ding@ifi.uio.no")))
The first element is the group name as Gnus knows the group; the second is the group level; the third is the read articles in range format; the fourth is a list of article marks lists; the fifth is the select method; and the sixth contains the group parameters.
Here's a BNF definition of the group info format:
info        = "(" group space level space read
              [ "" / [ space marks-lists [ "" / [ space method [ "" /
              space parameters ] ] ] ] ] ")"
group       = quote <string> quote
level       = <integer in the range of 1 to inf>
read        = range
marks-lists = nil / "(" *marks ")"
marks       = "(" <string> range ")"
method      = "(" <string> *elisp-forms ")"
parameters  = "(" *elisp-forms ")"
Actually that `marks' rule is a fib. A `marks' is a `<string>' consed on to a `range', but that's a bitch to say in pseudo-BNF.