In Lisp, atoms are recorded in a straightforward fashion; if the
implementation is not straightforward in practice, it is, nonetheless,
straightforward in theory. The atom `rose', for example, is
recorded as the four contiguous letters `r', `o', `s',
`e'. A list, on the other hand, is kept differently. The mechanism
is equally simple, but it takes a moment to get used to the idea. A
list is kept using a series of pairs of pointers. In the series, the
first pointer in each pair points to an atom or to another list, and the
second pointer in each pair points to the next pair, or to the symbol
nil
, which marks the end of the list.
A pointer itself is quite simply the electronic address of what is pointed to. Hence, a list is kept as a series of electronic addresses.
For example, the list (rose violet buttercup)
has three elements,
`rose', `violet', and `buttercup'. In the computer, the
electronic address of `rose' is recorded in a segment of computer
memory along with the address that gives the electronic address of where
the atom `violet' is located; and that address (the one that tells
where `violet' is located) is kept along with an address that tells
where the address for the atom `buttercup' is located.
This sounds more complicated than it is and is easier seen in a diagram:
___ ___ ___ ___ ___ ___ |___|___|--> |___|___|--> |___|___|--> nil | | | | | | --> rose --> violet --> buttercup
In the diagram, each box represents a word of computer memory that
holds a Lisp object, usually in the form of a memory address. The boxes,
i.e. the addresses, are in pairs. Each arrow points to what the address
is the address of, either an atom or another pair of addresses. The
first box is the electronic address of `rose' and the arrow points
to `rose'; the second box is the address of the next pair of boxes,
the first part of which is the address of `violet' and the second
part of which is the address of the next pair. The very last box
points to the symbol nil
, which marks the end of the list.
When a variable is set to a list with a function such as setq
,
it stores the address of the first box in the variable. Thus,
evaluation of the expression
(setq bouquet '(rose violet buttercup))
creates a situation like this:
bouquet | | ___ ___ ___ ___ ___ ___ --> |___|___|--> |___|___|--> |___|___|--> nil | | | | | | --> rose --> violet --> buttercup
In this case, the symbol bouquet
holds the address of the first
pair of boxes. Indeed, the symbol bouquet
consists of a group
of address-boxes, one of which is the address of the printed word
`bouquet', a second of which is the address of a function
definition attached to the symbol, if any, a third of which is the
address of the first pair of address-boxes for the list (rose
violet buttercup)
, and so on.
This same list can be illustrated in a different sort of box notation like this:
bouquet | | -------------- --------------- ---------------- | | car | cdr | | car | cdr | | car | cdr | -->| rose | o------->| violet | o------->| butter- | nil | | | | | | | | cup | | -------------- --------------- ----------------
In an earlier section, I suggested that you might imagine a symbol as being a chest of drawers. The function definition is put in one drawer, the value in another, and so on. What is put in the drawer holding the value can be changed without affecting the contents of the drawer holding the function definition, and vice-versa. Actually, what is put in each drawer is the address of the value or function definition. It is as if you found an old chest in the attic, and in one of its drawers you found a map giving you directions to where the buried treasure lies.
(In addition to its name, symbol definition, and variable value, a symbol has a `drawer' for a property list which can be used to record other information. Property lists are not discussed here; see section `Property Lists' in The GNU Emacs Lisp Reference Manual.)
Here is a fanciful representation:
Chest of Drawers Contents of Drawers --------------------- | | | symbol name | bouquet | | --------------------- | | | symbol definition | [none] | | --------------------- | | | variable value | (rose violet buttercup) | | --------------------- | | | property list | [not described here] | | --------------------- |/ \|
If symbol is set to the CDR of a list, the list itself is not changed; the symbol simply has an address further down the list. (In the jargon, CAR and CDR are `non-destructive'.) Thus, evaluation of the following expression
(setq flowers (cdr bouquet))
produces this:
bouquet flowers | | | ___ ___ | ___ ___ ___ ___ --> | | | --> | | | | | | |___|___|----> |___|___|--> |___|___|--> nil | | | | | | --> rose --> violet --> buttercup
The value of flowers
is (violet buttercup)
, which is
to say, the symbol flowers
holds the address of the pair of
address-boxes, the first of which holds the address of violet
,
and the second of which holds the address of buttercup
.
A pair of address-boxes is called a cons cell or dotted pair. See section `List Type' in The GNU Emacs Lisp Reference Manual, and section `Dotted Pair Notation' in The GNU Emacs Lisp Reference Manual, for more information about cons cells and dotted pairs.
The function cons
adds a new pair of addresses to the front of
a series of addresses like that shown above. For example, evaluating
the expression
(setq bouquet (cons 'lilly bouquet))
produces:
bouquet flowers | | | ___ ___ ___ ___ | ___ ___ ___ ___ --> | | | | | | --> | | | | | | |___|___|----> |___|___|----> |___|___|---->|___|___|--> nil | | | | | | | | --> lilly --> rose --> violet --> buttercup
However, this does not change the value of the symbol
flowers
, as you can see by evaluating the following,
(eq (cdr (cdr bouquet)) flowers)
which returns t
for true.
Until it is reset, flowers
still has the value
(violet buttercup)
; that is, it has the address of the cons
cell whose first address is of violet
. Also, this does not
alter any of the pre-existing cons cells; they are all still there.
Thus, in Lisp, to get the CDR of a list, you just get the address
of the next cons cell in the series; to get the CAR of a list,
you get the address of the first element of the list; to cons
a
new element on a list, you add a new cons cell to the front of the list.
That is all there is to it! The underlying structure of Lisp is
brilliantly simple!
And what does the last address in a series of cons cells refer to? It
is the address of the empty list, of nil
.
In summary, when a Lisp variable is set to a value, it is provided with the address of the list to which the variable refers.
Set flowers
to violet
and buttercup
. Cons two
more flowers on to this list and set this new list to
more-flowers
. Set the CAR of flowers
to a fish.
What does the more-flowers
list now contain?