This chapter describes various builtin macros for controlling the input to m4.
The builtin dnl reads and discards all characters, up to and including the first newline:
    dnl
It is often used in connection with define, to remove the newline that follows the call to define. Thus
    define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
    foo
    =>Macro foo.
The input up to and including the next newline is discarded, as opposed to the way comments are treated (see section Comments).
Usually, dnl is immediately followed by an end of line or some other whitespace. GNU m4 will produce a warning diagnostic if dnl is followed by an open parenthesis. In this case, dnl will collect and process all arguments, looking for a matching close parenthesis. All predictable side effects resulting from this collection will take place. dnl will return no output. The input following the matching close parenthesis, up to and including the next newline, on whichever line contains it, will still be discarded.
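The basic rule described above can be modeled outside m4. The following is a minimal Python sketch, not m4's implementation: the name strip_dnl is hypothetical, and the model ignores quoting, comments, and the open-parenthesis case.

```python
import re

# Hypothetical model of the basic dnl rule: the word dnl, and everything
# after it up to and including the next newline, is discarded.
# Quoting, comments and the dnl( case are deliberately not modeled.
DNL = re.compile(r"\bdnl\b[^\n]*\n?")

def strip_dnl(text):
    """Return text with each dnl ... newline span removed."""
    return DNL.sub("", text)
```

Because the newline itself is removed, the text on the following line is joined directly to whatever preceded dnl.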
The default quote delimiters can be changed with the builtin changequote:
    changequote(opt start, opt end)
where start is the new start-quote delimiter and end is the new end-quote delimiter. If any of the arguments are missing, the default quotes (` and ') are used instead of the void arguments.
The expansion of changequote is void.
    changequote([, ])
    =>
    define([foo], [Macro [foo].])
    =>
    foo
    =>Macro foo.
If no single character is appropriate, start and end can be of any length.
    changequote([[, ]])
    =>
    define([[foo]], [[Macro [[[foo]]].]])
    =>
    foo
    =>Macro [foo].
Changing the quotes to the empty strings will effectively disable the quoting mechanism, leaving no way to quote text.
    define(`foo', `Macro `FOO'.')
    =>
    changequote(, )
    =>
    foo
    =>Macro `FOO'.
    `foo'
    =>`Macro `FOO'.'
There is no way in m4 to quote a string containing an unmatched left quote, except using changequote to change the current quotes.
Neither quote string should start with a letter or `_' (underscore), as they will be confused with names in the input. Doing so disables the quoting mechanism.
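To see why the multi-character example yields `Macro [foo].', it can help to model how one level of quotes is removed. The following Python sketch is an illustration under simplifying assumptions, not m4's actual scanner: the function name strip_quotes is hypothetical, no macro expansion is performed, and empty delimiters are taken to mean that quoting is disabled.

```python
def strip_quotes(text, start="`", end="'"):
    """Remove one level of quotes from text, keeping nested quotes intact.

    Hypothetical model only: no macro expansion is performed.
    """
    if not start or not end:
        return text  # empty delimiters: quoting is disabled in this model
    out = []
    depth = 0
    i = 0
    while i < len(text):
        if depth > 0 and text.startswith(end, i):
            depth -= 1
            if depth > 0:
                out.append(end)    # inner end-quote is kept
            i += len(end)
        elif text.startswith(start, i):
            if depth > 0:
                out.append(start)  # inner start-quote is kept
            depth += 1
            i += len(start)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

With the delimiters [[ and ]], the text `Macro [[[foo]]].' loses only the outer [[ and ]], leaving a single bracket on each side of foo.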
The default comment delimiters can be changed with the builtin macro changecom:
    changecom(opt start, opt end)
where start is the new start-comment delimiter and end is the new end-comment delimiter. If any of the arguments are void, the default comment delimiters (# and newline) are used instead of the void arguments. The comment delimiters can be of any length.
The expansion of changecom is void.
    define(`comment', `COMMENT')
    =>
    # A normal comment
    =># A normal comment
    changecom(`/*', `*/')
    =>
    # Not a comment anymore
    =># Not a COMMENT anymore
    But: /* this is a comment now */ while this is not a comment
    =>But: /* this is a comment now */ while this is not a COMMENT
Note how comments are copied to the output, much as if they were quoted strings. If you want the text inside a comment expanded, quote the start comment delimiter.
Calling changecom without any arguments disables the commenting mechanism completely.
    define(`comment', `COMMENT')
    =>
    changecom
    =>
    # Not a comment anymore
    =># Not a COMMENT anymore
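The way comment delimiters partition the input can be sketched in Python. This is a rough model rather than m4's scanner: split_comments is a hypothetical name, quoting is ignored, and an empty start delimiter stands in for a disabled commenting mechanism.

```python
def split_comments(text, start="#", end="\n"):
    """Split text into ("text", s) and ("comment", s) pieces.

    Hypothetical model: comment pieces would be copied to the output
    unchanged, while text pieces would be scanned for macros.
    """
    if not start:
        return [("text", text)]  # commenting disabled in this model
    parts = []
    i = 0
    while i < len(text):
        j = text.find(start, i)
        if j < 0:
            parts.append(("text", text[i:]))
            break
        if j > i:
            parts.append(("text", text[i:j]))
        k = text.find(end, j + len(start))
        stop = len(text) if k < 0 else k + len(end)
        parts.append(("comment", text[j:stop]))
        i = stop
    return parts
```

Only the pieces labeled "text" would be candidates for macro expansion, which is why COMMENT appears only outside the delimiters in the examples above.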
The macro changeword and all associated functionality is experimental. It is only available if the --enable-changeword option was given to configure at GNU m4 installation time. The functionality might change or even go away in the future. Do not rely on it. Please direct your comments about it the same way you would do for bugs.
A file being processed by m4 is split into quoted strings, words (potential macro names) and simple tokens (any other single character). Initially a word is defined by the following regular expression:
    [_a-zA-Z][_a-zA-Z0-9]*
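As an illustration of this split, the following Python sketch divides a string into words and single-character tokens using the default regular expression. It is a simplified model under stated assumptions: split_words is a hypothetical name, and quoted strings and comments are not recognized.

```python
import re

# The default m4 word pattern, as given in the text.
WORD = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]*")

def split_words(text):
    """Split text into ("word", s) and ("char", c) tokens.

    Simplification: quoted strings and comments are not handled.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        m = WORD.match(text, pos)
        if m:
            tokens.append(("word", m.group()))
            pos = m.end()
        else:
            tokens.append(("char", text[pos]))
            pos += 1
    return tokens
```

Note that a digit cannot begin a word under the default pattern, so in this model a leading digit is emitted as a simple token of its own.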
Using changeword, you can change this regular expression. Relaxing m4's lexical rules might be useful (for example) if you wanted to apply translations to a file of numbers:
    changeword(`[_a-zA-Z0-9]+')
    define(1, 0)
    =>1
Tightening the lexical rules is less useful, because it will generally make some of the builtins unavailable. You could use it to prevent accidental calls to builtins, for example:
    define(`_indir', defn(`indir'))
    changeword(`_[_a-zA-Z0-9]*')
    esyscmd(foo)
    _indir(`esyscmd', `ls')
Because m4 constructs its words a character at a time, there is a restriction on the regular expressions that may be passed to changeword: if your regular expression accepts `foo', it must also accept `f' and `fo'.
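One way to check a candidate expression against this restriction is to test every prefix of a sample word. The sketch below does so in Python; accepts_all_prefixes is a hypothetical helper, and it uses Python's regular expression syntax, which differs from the syntax m4 accepts (for instance, in how grouping parentheses are written).

```python
import re

def accepts_all_prefixes(pattern, sample):
    """Return True if pattern fully matches every non-empty prefix of sample.

    Hypothetical helper; pattern uses Python regex syntax, not m4's.
    """
    rx = re.compile(pattern)
    return all(rx.fullmatch(sample[:i]) is not None
               for i in range(1, len(sample) + 1))
```

A pattern such as [_a-zA-Z0-9]+ passes this check for `foo', whereas the literal pattern foo does not, because it rejects `f' and `fo'.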
changeword has another function. If the regular expression supplied contains any bracketed subexpressions, then text outside the first of these is discarded before symbol lookup. So:
    changecom(`/*', `*/')
    changeword(`#\([_a-zA-Z0-9]*\)')
    #esyscmd(ls)
m4 now requires a `#' mark at the beginning of every macro invocation, so one can use m4 to preprocess shell scripts without getting shift commands swallowed, and plain text without losing various common words.
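The effect of a bracketed subexpression on symbol lookup can be modeled as follows. This Python sketch is an assumption-laden illustration: lookup_name is a hypothetical name, and the pattern is written in Python syntax, where groups are (...) rather than the \(...\) form shown in the m4 example.

```python
import re

def lookup_name(pattern, word):
    """Return the name that would be looked up for word, or None.

    Hypothetical model: if pattern has a group, only the text matched
    by the first group is used; otherwise the whole word is used.
    """
    m = re.fullmatch(pattern, word)
    if m is None:
        return None  # not a word under this pattern
    return m.group(1) if m.re.groups else m.group(0)
```

Under this model `#esyscmd' is looked up as esyscmd, while a bare `esyscmd' is not recognized as a word at all.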
m4's macro substitution is based on text, while TeX's is based on tokens. changeword can throw this difference into relief. For example, here is the same idea represented in TeX and m4.
First, the TeX version:
    \def\a{\message{Hello}}
    \catcode`\@=0
    \catcode`\\=12
    =>@a
    =>@bye
Then, the m4 version:
    define(a, `errprint(`Hello')')
    changeword(`@\([_a-zA-Z0-9]*\)')
    =>@a
In the TeX example, the first line defines a macro a to print the message `Hello'. The second line defines @ to be usable instead of \ as an escape character. The third line defines \ to be a normal printing character, not an escape. The fourth line invokes the macro a. So, when TeX is run on this file, it displays the message `Hello'.
When the m4 example is passed through m4, it outputs `errprint(Hello)'. The reason for this is that TeX does lexical analysis of a macro definition when the macro is defined. m4 just stores the text, postponing the lexical analysis until the macro is used.
You should note that using changeword will slow m4 down by a factor of about seven.
It is possible to `save' some text until the end of the normal input has been seen; the saved text is then read again by m4 once the normal input has been exhausted. This feature is normally used to initiate cleanup actions before normal exit, e.g., deleting temporary files.
To save input text, use the builtin m4wrap:
    m4wrap(string, ...)
which stores string and the rest of the arguments in a safe place, to be reread when end of input is reached.
    define(`cleanup', `This is the `cleanup' actions.
    ')
    =>
    m4wrap(`cleanup')
    =>
    This is the first and last normal input line.
    =>This is the first and last normal input line.
    ^D
    =>This is the cleanup actions.
The saved input is only reread when the end of normal input is seen, and not if m4exit is used to exit m4.
It is safe to call m4wrap from saved text, but then the order in which the saved text is reread is undefined. If m4wrap is not used recursively, the saved pieces of text are reread in the reverse of the order in which they were saved (LIFO: last in, first out).
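The non-recursive ordering described here behaves like a stack. The following Python sketch models only that case, as documented above: the class WrapModel and its method names are hypothetical, and neither recursive use of m4wrap nor the actual rescanning of the saved text is represented.

```python
class WrapModel:
    """Hypothetical model of m4wrap's non-recursive LIFO ordering."""

    def __init__(self):
        self._saved = []

    def m4wrap(self, text):
        # Save text to be reread once normal input is exhausted.
        self._saved.append(text)

    def m4exit(self):
        # Exiting via m4exit: saved text is never reread.
        self._saved.clear()

    def end_of_input(self):
        # Return the saved pieces in the order they would be reread.
        order = []
        while self._saved:
            order.append(self._saved.pop())
        return order
```

Saving `first' and then `second' yields the order second, first at end of input, while a call to the model's m4exit beforehand leaves nothing to reread.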