Charspace lets you add side bearings (the blank spaces on either side of a character) to a bitmap font. This is necessary because scanned images typically do not include side bearing information, and therefore Imageto (see section Imageto) cannot determine it.
The input is a bitmap (GF or PK) font, together with one or more CMI files (see section CMI files), which specify character metric information. If a corresponding TFM file exists, it is read to get default values for the character dimensions (Charspace promptly overwrites the widths). The output is a TFM file and (typically) a revised GF file with the new width information.
The basic idea for Charspace came from Harry Smith, via Walter Tracy's book Letters of Credit. See `charspace/README' for the full citation.
Charspace makes no attempt to be intelligent about the side bearings it computes; it just follows the instructions in the CMI files.
The CMI files must be created by human hands, since the information they contain usually cannot be determined automatically. See the next section for the details on what CMI files contain.
We supply one CMI file, `common.cmi' (distributed in the `data' directory), which provides more-or-less typeface-independent definitions for most common characters. Charspace reads `common.cmi' before any of the CMI files you supply, so your definitions override its definitions.
`common.cmi' can be used for all typefaces because its definitions
are entirely symbolic; therefore, your CMI file must define actual
values for the identifiers it uses. For example, `common.cmi'
defines the right side bearing of `K' to be uc-min-sb; you
yourself must define uc-min-sb.
You must also define side bearings for characters not in `common.cmi'. And you can redefine side bearings that are in `common.cmi', if you find its definitions unsuitable.
Once you have prepared a CMI file, you can run Charspace, e.g.:
charspace -verbose -encoding=enc-file fontname.dpi \
  -output-file=out-fontname
where enc-file specifies the encoding, fontname the input font, dpi the resolution, and out-fontname the name of the output font.
With these options, Charspace will write files `out-fontname.tfm' and `out-fontname.dpigf'. You can then run TeX on `testfont.tex', telling TeX to use the font out-fontname. This produces a DVI file which you can print or preview as you usually do with TeX documents.
This will probably reveal problems in your CMI file, e.g., the spacing for some characters or character combinations will be poor. So you need to iterate.
However, if you are planning to eventually run your bitmap font through Limn (see section Limn) and BZRto (see section BZRto) to make an outline font, there's little point in excessively fine-tuning the spacing of the original bitmap font. The reason is that the generated outline font will inevitably rasterize differently than the original bitmaps, and the change in character shapes will almost certainly affect the spacing.
Character metric information (CMI) files are free-format text files which (primarily) describe the side bearings for characters in a font. Side bearings are the blank spaces to the left and right of a character which make printed type easier to read, as well as more pleasing visually.
In addition to side bearing definitions, CMI files can also contain kerns, which insert or remove space between particular letter pairs; and font dimensions, global information about the font which is stored in the TFM file (see section TFM fontdimens).
If your font is named `foo.300gf' (or `... pk'), it is customary to name the corresponding CMI file `foo.300cmi'. That is what Charspace looks for by default. If you name it something else, you must use the `-cmi-files' option to tell Charspace its name. It is reasonable to use the resolution as part of the CMI filename, since the values written in it are (for the most part) in pixels.
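The naming convention above can be sketched in Python. This only illustrates the convention; the function name default_cmi_name is hypothetical and is not part of Charspace.

```python
def default_cmi_name(bitmap_font):
    """Map a bitmap font filename such as 'foo.300gf' to 'foo.300cmi'."""
    for suffix in ("gf", "pk"):
        if bitmap_font.endswith(suffix):
            # Keep the root and the resolution, swap the format suffix.
            return bitmap_font[: -len(suffix)] + "cmi"
    raise ValueError("not a GF or PK font name: " + repr(bitmap_font))


print(default_cmi_name("foo.300gf"))
```

Keeping the resolution in the name means that CMI files for the same typeface at different resolutions do not collide.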
See section Common file syntax, for a precise description of syntax elements common to all data files processed by these programs, including comments.
In the following sections, we describe the individual commands, the tokens that comprise them, and the way Charspace processes them.
Tokens in a CMI file are one of the following.
isspace) not listed above, and terminated by a whitespace character.
In some contexts, an identifier is taken as a character name---a name
from the encoding file Charspace is using, either the default or one you
specified with `-encoding' (see section Invoking Charspace).
See section Encoding files, for the definition of encoding files.
In all other cases, identifiers are internal to Charspace. The particular
commands describe the semantics which apply to them.
Some identifiers are reserved, i.e., they cannot be used in any
context except as described in the following sections. Reserved words
are always shown in typewriter type.
An expression in a CMI file is one of: a number, an identifier, or a number followed by an identifier. This last, as in `.75 foo', denotes multiplication.
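As a rough illustration, the three expression forms can be modelled in a few lines of Python. The names parse_expr and evaluate are hypothetical, and the sketch relies on Python's float(), which accepts some spellings (such as `1e5' or `inf') that the CMI syntax may treat differently.

```python
def _is_number(token):
    try:
        float(token)
        return True
    except ValueError:
        return False


def parse_expr(text):
    """Return (factor, identifier-or-None) for a CMI-style expression."""
    parts = text.split()
    if len(parts) == 1:
        if _is_number(parts[0]):
            return (float(parts[0]), None)      # a number
        return (1.0, parts[0])                  # an identifier
    if len(parts) == 2 and _is_number(parts[0]) and not _is_number(parts[1]):
        return (float(parts[0]), parts[1])      # number times identifier
    raise ValueError("not an expression: " + repr(text))


def evaluate(text, values):
    """Evaluate an expression given already-known identifier values."""
    factor, ident = parse_expr(text)
    return factor if ident is None else factor * values[ident]


print(evaluate(".75 foo", {"foo": 8}))
```

Here `.75 foo' with foo equal to 8 evaluates to 0.75 times 8.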
char command
The char command specifies both side bearings for a single
character. It has the form:
char charname expr1 , expr2
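where charname is a character name, expr1 gives the left side bearing,
and expr2 gives the right side bearing. The set width of the character
is thus expr1 + width(charname) + expr2, where width(charname) is the
width of the character's bitmap. If these
expressions contain identifiers, the values of those identifiers are not
resolved until after Charspace has read all the CMI files.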
Giving the side bearings symbolically is useful when the character definition is intended to be used for more than one typeface. For example, `common.cmi' (see section Charspace usage) contains:
char K H-sb , uc-min-sb
char L H-sb , uc-min-sb
Then the CMI file you write for a particular font can define H-sb
and uc-min-sb, and not have to redefine the side bearings for
K and L.
char-width command
The char-width command specifies the set width of a single character,
and its left side bearing as a percentage of the total remaining
space. It has the form:
char-width charname width-expr , lsb-%-expr
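where charname is a character name, width-expr gives the set width, and lsb-%-expr gives the portion of the remaining space (the set width minus the width of the character's bitmap) to use as the left side bearing.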
The char-width command is useful when you want a character to
have a particular set width, since it's much simpler to specify that
width and the left side bearing (and let the program compute the right
side bearing) than to somehow estimate the bitmap width and then choose
the side bearings to add up to the desired set width.
For example, in most fonts, the numerals all have the same width, to
ease typesetting of columns of them in tables. Thus, `common.cmi'
defines eight (the name for the numeral `8') as follows:
char-width eight numeral-width , eight-lsb-percent
Since the numeral width is traditionally one-half the em width of
the font, `common.cmi' defines numeral-width as enspace, which in
turn is defined to be half the quad fontdimen.
eight-lsb-percent is defined to be `.5', thus centering the `8'.
The other numerals are also defined to have width numeral-width,
but the lsb-percents vary according to the character shapes.
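The arithmetic behind char-width can be sketched as follows. This is a model of the description above, not Charspace's code: it assumes the remaining space is the set width minus the bitmap width, treats lsb-%-expr as a fraction between 0 and 1 (as the `.5' example suggests), and does no rounding. The function name and the pixel values are made up for illustration.

```python
def char_width_bearings(set_width, bitmap_width, lsb_fraction):
    """Split the space left over around a bitmap into (lsb, rsb)."""
    remaining = set_width - bitmap_width   # space not covered by the bitmap
    lsb = remaining * lsb_fraction         # left side bearing's share
    rsb = remaining - lsb                  # the right side bearing gets the rest
    return lsb, rsb


# A 100-pixel-wide bitmap in a 150-pixel set width, centered:
print(char_width_bearings(150, 100, 0.5))
# The same character pushed toward the left:
print(char_width_bearings(150, 100, 0.25))
```

Whatever the fraction, the two side bearings and the bitmap width always add up to the requested set width, which is the point of the command.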
define command
The define command defines an identifier as a number. This is
useful to give a symbolic name to a constant used in more than one
character or fontdimen definition, for ease of change.
It has the form:
define id expr
The identifier id is defined to be the expression expr. Any
previous definition of id is replaced. The id can be used
prior to the define command; Charspace doesn't try to resolve any
definitions in the CMI files until after all files have been read.
kern command
The kern command defines a space to insert or remove between two
particular characters. The kerning information is written only to the
TFM file. It has the form:
kern name1 name2 expr
where name1 and name2 are character names, as in
the char command (see section char command), and expr is the
amount of the kern in pixels.
For example:
kern F dot -7.5
would put an entry in the TFM file's kerning table such that when TeX typesets an `F' followed by a `.', it inserts an additional space equivalent to -7.5 pixels in the resolution of Charspace's input font, i.e., it moves the two characters closer together.
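To get a feel for how large such a kern is on the page, the pixel value can be converted to printer's points (72.27 to the inch). The sketch below is only a unit conversion; the 300 dpi resolution is an assumed example, and the function name is hypothetical.

```python
POINTS_PER_INCH = 72.27  # printer's points per inch, as used by TeX


def pixels_to_points(pixels, dpi):
    """Convert a length in pixels at resolution DPI to printer's points."""
    return pixels * POINTS_PER_INCH / dpi


# The kern from the example above, for a 300 dpi input font (assumed):
print(pixels_to_points(-7.5, 300))
```

The same pixel value corresponds to a different physical distance at another resolution, which is why the CMI values are tied to the resolution of the input font.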
codingscheme command
The codingscheme command defines the encoding scheme to be used
for the output files. (See section Encoding files, for a full description of
font encodings.) It has the form:
codingscheme string-constant
where string-constant is a coding scheme string; for example, `"GNU Latin text"'. This string is looked up in the data file `encoding.map' to find the name of the corresponding encoding file (see section Coding scheme map file).
fontdimen command
The fontdimen command defines a font parameter to be put in the
TFM file. It has the form:
fontdimen fontdimen-name expr
where fontdimen-name is any of the fontdimen names listed in the section below, and expr gives the new value of the fontdimen, in pixels.
For example, `common.cmi' (see section Charspace usage) makes the following definitions:
fontdimen quad designsize
fontdimen space .333 quad
This defines the fontdimen quad, which determines the
width of the em dimension in TeX, to be the same as the design
size of the font. (This is traditionally the case, although it is not a
hard-and-fast rule.) Then it defines the fontdimen space, which
is the normal interword space in TeX, to be one-third of the quad.
Because of the way that Charspace processes the CMI files
(see section CMI processing), if you redefine the quad fontdimen in
another CMI file, the value of space will change correspondingly.
The section below lists all the TFM fontdimen names Charspace recognizes, and their meaning to TeX.
This section lists all the TFM fontdimens recognized by these programs: all those recognized by TeX, plus a few others we thought would prove useful when writing TeX macros.
A fontdimen is an arbitrary number, in all cases but one
(slant, see below) measured in printer's points, which is
associated with a particular font. Their values are stored in the TFM
file for the font. We also refer, context permitting, to fontdimens as
"font parameters", or simply "parameters".
Fontdimens affect many aspects of TeX's behavior: the interword spacing, accent placement, and math formula construction. The math fontdimens in particular are fairly obscure; if you don't have a firm grasp on how TeX constructs math formulas, the explanations below will probably be meaningless to you, and--unless you're making a font for math typesetting--can be ignored.
The `common.cmi' file which Charspace reads sets reasonable defaults for the fontdimens relevant to normal text typesetting.
When TeX (or other programs) scale a font, its fontdimen values are scaled proportionally to the design size. For example, suppose the designsize of some font f is 10pt, and some fontdimen in f has the value 7.5pt. Then if the font is used scaled to 20pt, the fontdimen's value is scaled to 15pt.
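The scaling rule can be written out as a short calculation. This is a sketch of the rule as described here, not of TeX's internal fixed-point arithmetic; the function name is hypothetical. It also reflects the one exception noted in this section, the slant parameter, which is not scaled.

```python
def scaled_fontdimen(value, design_size, used_size, name=""):
    """Scale a fontdimen value in proportion to the design size."""
    if name == "slant":
        return value                      # slant is a ratio, not a length
    return value * used_size / design_size


# The example from the text: 7.5pt in a 10pt font used at 20pt.
print(scaled_fontdimen(7.5, 10, 20))
```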
You can get the table of fontdimen values in a particular TFM file by running the standard TeX utility program TFtoPL and inspecting its (human-readable text) output.
In our programs and in TFtoPL, fontdimens are typically shown by their names. But each also has a number, starting at 1. You can use either the number or the name on the command line (in the argument to the `-fontdimens' option). The numbers are given in parentheses after the name in the table below.
In a few cases (fontdimens 8--13), the same number fontdimen has two different names, and two different meanings. This does not cause problems in practice, because these fontdimens are used only in the TeX math symbol and math extension fonts, which TeX can distinguish via its "math families" (see The TeXbook for the details).
slant (1)
The slant parameter is not scaled
with the font when it is loaded. It defines the "slant per pt" of the
font; for example, a slant of 0.2 means a 1pt-high
character stem would end 0.2pt to the right of where it began.
This value is typical for slanted or italic fonts; for normal upright
fonts, slant is zero, naturally. TeX uses this to position
accents.
space (2)
The space parameter defines the normal interword space of the
font. This is typically about one-third of the design size, but it
varies according to the type design: a narrow, spiky typeface will
have a small interword space relative to a wide, regular one.
Exception: in math fonts, the interword space is zero.
stretch (3)
The stretch parameter defines the interword stretch of the font.
This is typically about one-half of the space parameter. TeX
is reluctant to increase interword spacing beyond the width
space + stretch. In monospaced fonts, the stretch
is typically zero.
shrink (4)
The shrink parameter defines the interword shrink of the font.
This is typically about one-third of the space parameter. TeX
does not decrease interword spacing beyond the width
space - shrink. In monospaced fonts, the shrink is typically zero.
xheight (5)
The xheight parameter defines the x-height of the font, i.e., the
main body size. The height of the lowercase `x' is often used for this,
since neither the top nor the bottom of `x' is curved. There is no
hard-and-fast rule in TeX that the x-height must equal the height of
`x', however.
This fontdimen defines the value of the ex dimension in TeX.
TeX also uses this to position accents: it assumes the accents in the font
are properly positioned over a character that is exactly 1ex high.
quad (6)
The quad fontdimen defines the value of the em dimension
in TeX. This is often the same as the design size of the font, but
as usual, that's not an absolute requirement.
Typesetters often use ems and exs instead of hardwiring
dimensions in terms of (say) points; that way, experimenting with
different fonts for a particular job does not require changing the
dimensions.
extraspace (7)
The extraspace fontdimen defines the space TeX puts at the end
of a sentence. (Technically, when the \spacefactor is 2000 or
more.) This is typically about one-sixth of the normal interword space.
num1 (8)
num2 (9)
num3 (10)
denom1 (11)
denom2 (12)
sup1 (13)
sup2 (14)
sup3 (15)
sub1 (16)
sub2 (17)
supdrop (18)
subdrop (19)
delim1 (20)
delim2 (21)
axisheight (22)
defaultrulethickness (8)
bigopspacing1 (9)
bigopspacing2 (10)
bigopspacing3 (11)
bigopspacing4 (12)
bigopspacing5 (13)
leadingheight (23)
The leadingheight parameter defines the height component of the
recommended leading for this font. Leading is the
baseline-to-baseline distance when setting lines of type.
TeX does not automatically use this fontdimen, and the standard
TeX fonts do not define it, but you may wish to include it in new
fonts for the benefit of future TeX macros. This fontdimen is a GNU
extension.
leadingdepth (24)
The leadingdepth parameter defines the depth of the recommended
leading for this font. See leadingheight directly above. This
fontdimen is a GNU extension.
fontsize (25)
The fontsize parameter is the design size of the font. This is
needed for TeX macros to find the font's design size. This fontdimen
is a GNU extension.
version (26)
The version parameter identifies a particular version of the TFM
file. Whenever the character dimensions, kerns, or ligature table for a
font changes, it is good to increment the version number. It is also good
to keep such changes to a minimum, since they can change the line breaks
and page breaks in documents typeset with previous versions. This
fontdimen is a GNU extension.
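The rules of thumb given in the table above for the basic text fontdimens can be turned into a quick calculation. This is only a sketch of those "typical" proportions, with hypothetical function names; it is not the set of definitions in `common.cmi', and a real typeface may well deviate from it.

```python
def typical_text_fontdimens(design_size):
    """Rule-of-thumb values (same unit as DESIGN_SIZE) for a text font."""
    quad = design_size          # often equal to the design size
    space = design_size / 3     # about one-third of the design size
    stretch = space / 2         # about one-half of the space
    shrink = space / 3          # about one-third of the space
    return {"quad": quad, "space": space, "stretch": stretch, "shrink": shrink}


def interword_range(dimens):
    """Narrowest space allowed, and the width TeX is reluctant to exceed."""
    return (dimens["space"] - dimens["shrink"],
            dimens["space"] + dimens["stretch"])


dimens = typical_text_fontdimens(10.0)
print(dimens)
print(interword_range(dimens))
```

For a 10pt design size this gives an interword space of about 3.33pt that can shrink to about 2.22pt and stretch to about 5pt.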
Here are some further details on how Charspace processes the CMI files:
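Identifiers may be used before they are defined. For example:
define foo bar
define bar 1.0
char A foo , bar
is valid, and defines both side bearings of `A' to be 1.0. (See the preceding sections for the definition of the various commands allowed in CMI files.)
An undefined identifier matters only if its value is needed. For example:
define foo bar
will elicit no complaint, if `foo' is not needed to make the output files.
Only the last definition of an identifier counts. For example:
define bar 100
define foo 2 bar
define bar 1
char A foo , foo
defines both side bearings of `A' to be 2, not 200.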
Charspace predefines an identifier, designsize, to be the
design size of the input font (in pixels). It can be redefined like any
other identifier.
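The processing rules above (read everything first, let later definitions replace earlier ones, resolve only what is needed) can be modelled with a small Python sketch. It is an illustration only and is not taken from `charspace/char.c' or `charspace/cmi.y'. Expressions are represented as hypothetical (factor, identifier-or-None) pairs, and the check for circular definitions is an assumption of this sketch, since that case is not described here.

```python
def build_table(commands):
    """Collect define commands; a later definition replaces an earlier one."""
    table = {}
    for name, expr in commands:
        table[name] = expr
    return table


def resolve(table, name, active=()):
    """Resolve NAME to a number, only when asked and after all input is read."""
    if name in active:
        raise ValueError("circular definition of " + name)  # assumed behavior
    if name not in table:
        raise KeyError("undefined identifier: " + name)
    factor, ident = table[name]
    if ident is None:
        return factor
    return factor * resolve(table, ident, active + (name,))


# Forward reference: foo is defined in terms of bar before bar exists.
forward = build_table([("foo", (1.0, "bar")), ("bar", (1.0, None))])
# Redefinition: only the last definition of bar is seen at resolution time.
redefined = build_table([("bar", (100.0, None)),
                         ("foo", (2.0, "bar")),
                         ("bar", (1.0, None))])
# Dangling reference: harmless unless foo is actually resolved.
dangling = build_table([("foo", (1.0, "bar"))])

print(resolve(forward, "foo"), resolve(redefined, "foo"))
```

Because nothing is resolved while the table is being built, the dangling definition causes no error until something asks for the value of foo.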
If you can read programs in the C language, you may find it instructive to examine the implementation of CMI file processing in the source files `charspace/char.c' and `charspace/cmi.y'. The source provides the full details of CMI processing.
This section describes the options that Charspace accepts. See section Command-line options, for general option syntax.
The root of the main input fontname is called font-name below.
codingscheme command (see section codingscheme command).
If a TFM file `font-name.tfm' exists, it is also read for
default ligature, headerbyte, and fontdimen information. Definitions in
the CMI files override those in such a TFM file.
xheight fontdimen (see section TFM fontdimens); default is 120
(ASCII `x'). (It is
reasonable to use 120 instead of whatever `x' is in the underlying
character set because most font encoding schemes are based on ASCII
regardless of the host computer's character set.)