HTML-PRETTY 1 "04 December 1997" "Version 1.00" [section 4 of 14]

OPTIONS

Command-line options affect all following filenames. Option values are always provided as separate arguments following the option name. Letter case in option names is not significant, although it may be in option values. Option names may be abbreviated to any unique leading prefix, or to a shorter prefix where one is documented for that option.

Any argument that begins with a hyphen is expected to be an option, and will raise an error if it is not recognized. If a filename begins with a hyphen, you therefore need to disguise it by supplying a leading directory path. For example, on UNIX, ./-foo represents the file named -foo in the current directory.

GNU- and POSIX-style options of the form --name are also recognized: they begin with two option prefix characters.
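The matching rules above (case-insensitive names, unique leading prefixes, documented short forms, and one or two leading hyphens) can be sketched in a few lines of Python. This is only an illustration, not html-pretty's own code: the function name resolve_option is hypothetical, and the option table is a subset of the names documented in this section, so a prefix that is unique here may be ambiguous in the real program.

```python
# Illustrative option table: a subset of the names documented in this section.
OPTIONS = [
    "author", "blank-line-warning", "brief", "catalogfile",
    "check-tag-nesting", "comment-banner", "extend-style", "file",
    "grammar-level", "help", "indent", "no-comment-banner",
    "print-stylefile", "quiet", "stylefile", "version", "width",
]

# Shorter abbreviations that this section documents explicitly.
DOCUMENTED_SHORT = {
    "b": "brief", "c": "catalogfile", "e": "extend-style",
    "n": "no-comment-banner", "p": "print-stylefile", "w": "width",
}


def resolve_option(arg):
    """Return the full option name for arg, or raise ValueError."""
    if not arg.startswith("-"):
        raise ValueError("not an option: " + arg)
    # Accept one or two leading hyphens; letter case is not significant.
    name = arg[2:] if arg.startswith("--") else arg[1:]
    name = name.lower()
    if name in DOCUMENTED_SHORT:
        return DOCUMENTED_SHORT[name]
    if name in OPTIONS:
        return name
    # Otherwise the name must be a leading prefix of exactly one option.
    matches = [opt for opt in OPTIONS if opt.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unrecognized option: " + arg)
    raise ValueError("ambiguous option: " + arg)
```

With this table, resolve_option("-W") and resolve_option("--wid") both yield width, while -b selects brief only because that short form is documented; the longer prefix -bl is needed to reach blank-line-warning.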

For every option -xxx that sets a Boolean flag to be acted upon later, there is a corresponding -no-xxx option to override it. The negative forms are normally not required, but are necessary to counteract options set in a style file.
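The interplay of positive and negative Boolean options can be pictured as a left-to-right update of a flag table, in which the last setting seen wins. The Python sketch below assumes that model; apply_flags and DEFAULTS are hypothetical names, only full option names are handled, and the defaults shown are the ones documented for -brief, -comment-banner, and -quiet.

```python
# Documented defaults: -no-brief, -comment-banner, -no-quiet.
DEFAULTS = {"brief": False, "comment-banner": True, "quiet": False}


def apply_flags(args, defaults):
    """Return a copy of defaults updated by Boolean options, left to right."""
    flags = dict(defaults)
    for arg in args:
        name = arg.lstrip("-").lower()
        if name.startswith("no-") and name[3:] in flags:
            flags[name[3:]] = False      # -no-xxx clears the flag
        elif name in flags:
            flags[name] = True           # -xxx sets the flag
        else:
            raise ValueError("unrecognized option: " + arg)
    return flags
```

If a style file had effectively supplied -brief, a later -no-brief on the command line would restore the default: apply_flags(["-brief", "-no-brief"], DEFAULTS) leaves brief set to False.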

-author

Show author information on stderr and quit after processing any remaining command-line options that precede the next command-line HTML file.

-blank-line-warning

Unlike in most word processing and typesetting systems, and in ordinary typewritten text, blank lines in HTML and SGML do not imply a paragraph break, except in verbatim environments. You can select this option to get warnings about such blank lines. They will be reduced to a single blank line, leaving space for later manual insertion of a <P> tag, if that was indeed what was intended [default: -no-blank-line-warning].

If you specify the -convert-paragraph-breaks option, then this option will have no effect.

-brief

Brief mode: exclude the standard boilerplate wrapper of !DOCTYPE, HTML, HEAD, and BODY tags [default: -no-brief].

This option, in conjunction with -no-comment-banner, is convenient for prettyprinting small fragments of text, such as from inside a text editor. It can also be used for portions of larger documents (e.g., chapters of a book) that are not in themselves complete HTML files.

This option can be abbreviated to -b.

-catalogfile filename

Specify an alternate catalog file to override the default one. See the CATALOG DIRECTORY section for more details.

This option can be abbreviated to -c.

-check-tag-nesting

Use built-in tables, or tables from style files, of tag-is-contained-in and tag-cannot-contain relations to check that tags are properly nested. The rules that govern this are very complex, and depend on the grammar level chosen. The precise relations are tabulated in the HTML GRAMMAR CONSTRAINTS section, and the way they are provided to html-pretty is described in the STYLE FILES section.

Because this option can lead to a large number of warning messages when optional end tags are omitted, the default is -no-check-tag-nesting.

Nevertheless, when a proper SGML parser, such as html-check(1) or html-ncheck(1), is not available, this option can be very helpful in diagnosing errors of tag usage.

-comment-banner

Generate a leading comment banner containing the prettyprinter version number and date, the current date and time, and the personal name and email address of the user who ran the prettyprinter [default: -comment-banner].

-convert-paragraph-breaks

Convert paragraph breaks (sequences of one or more blank or empty lines) to HTML <P> tags [default: -no-convert-paragraph-breaks].

This option is intended for converting text files to HTML; it should not be used when the input file is already HTML, unless the intent really is to insert explicit paragraph tags.

This option has no effect on blank lines inside verbatim environments.
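As a rough picture of what paragraph-break conversion means for plain text, the Python sketch below replaces every run of blank or whitespace-only lines with a line holding a <P> tag. It is an assumption-laden illustration: the function name is hypothetical, the exact layout that html-pretty emits around the tag may differ, and verbatim environments (which the real option leaves alone) are not handled.

```python
import re


def convert_paragraph_breaks(text):
    """Replace each run of blank or whitespace-only lines with a <P> line."""
    # A newline followed by one or more lines containing only spaces or tabs.
    return re.sub(r"\n(?:[ \t]*\n)+", "\n<P>\n", text)
```

Single newlines between adjacent text lines are left untouched, so only genuine paragraph gaps receive a tag.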

-copyright

-copyleft

Show copyright information on stderr and quit after processing any remaining command-line options that precede the next command-line HTML file.

-email-address user@hostname

Supply an alternate electronic mail address to be used in the comment banner and the LINK tag. Otherwise, a default address is constructed from the current user name and host name, when that information is available from the operating system.

This option may also be spelled -e-mail-address, since both e-mail and email are about equally common abbreviations on the World-Wide Web.

-extend-style 'style: TAG TAG ...'

-extend-style 'TAG: TAG TAG ...'

These options, whose name can be abbreviated to just -e, provide a quick command-line alternative to the style file feature of the -stylefile option. They permit you to augment the list of tags associated with a particular formatting style class, or to add a list of tags to the same class as an existing tag (to avoid the need to know the style class names), but only for the duration of the current process.

Since a tag can belong to only a single style class at one time, specification of existing tags with these options effectively removes their old style class association.

html-pretty distinguishes between the two cases by looking up the first name in its tag table. A name written entirely in uppercase is a candidate tag name and might be found there. Any other name is taken to be a style class name.

If the style class is not recognized, an error message is issued with a list of the recognized class names, and execution is terminated after processing any remaining command-line options that precede the next command-line HTML file. Thus, an option -extend-style 'foo:bar' or -extend-style 'help=me' with obviously incorrect class and tag names can be used to coax a list of the valid ones from html-pretty; it will respond with something like this:

:1:unknown style [foo] command [BAR] at line 0
in style source [command-line]

The recognized style classes are:
        body doctype font head html
        line-break link list list-header
        list-item markup-declaration math
        math-pair pair paragraph plaintext
        public section short standalone
        standalone-nocheck title verbatim

The output line width in this list is governed by the -width option.

You can assign tags to the special style class name default to make them unknown (and also warned about, if you supply the -unknown-tag-warning option), e.g., default:blink.

Because some operating systems do not allow embedded spaces in command-line arguments, you can separate the tags in the value string with a comma instead of whitespace. Also, you can use an equals sign instead of a colon after the first word. For example, the value strings

font:A,MEDIUM
font=A,MEDIUM
'font   =   A    MEDIUM'
'font :     A,,,,MEDIUM'

all assign tags A and MEDIUM to the font style class. Incidentally, some people prefer A (anchor) tag pairs to be treated this way, rather than their default of being placed on separate lines, even though long hypertext references in <A HREF="..."> make this style hard to read.
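The equivalence of those four value strings can be checked with a small parser. The Python sketch below is one plausible reading of the syntax described here (a first word, a colon or equals sign, then tags separated by any mix of commas and whitespace); parse_extend_style is a hypothetical name and this is not html-pretty's actual parser.

```python
import re


def parse_extend_style(value):
    """Split 'name: TAG TAG ...' into (name, [TAG, ...])."""
    # First word, then ':' or '=', each optionally surrounded by whitespace.
    match = re.match(r"\s*([^\s:=]+)\s*[:=]\s*(.*)$", value, re.DOTALL)
    if match is None:
        raise ValueError("expected 'name: TAG ...' or 'name=TAG,...'")
    name, rest = match.groups()
    # Tags may be separated by commas, whitespace, or runs of both.
    tags = [tag for tag in re.split(r"[\s,]+", rest) if tag]
    return name, tags
```

Each of the four strings shown above parses to ("font", ["A", "MEDIUM"]). The letter case of the first word is deliberately preserved, since it is what separates tag names from style class names.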

For more information on this topic, see the STYLE FILES section below.

-file filename

Supply an alternate input filename for use in the output comment banner. This overrides the actual filename(s), and provides a way to name the output, even when no named input file is available, because standard input is redirected, or comes from a pipe.

-grammar-level grammar

Select a grammar level that in turn will select a suitable style file. See the CATALOG DIRECTORY and STYLE FILES sections for more details.

Letter case is not significant in grammar-level names.

You can use this option, together with the -unknown-tag-warning option, to help detect use of HTML tags that are not part of the grammar. However, a better, and grammatically rigorous, way to do this is to use a validating SGML parser, such as those accessible via the html-check(1) and html-ncheck(1) scripts.

The built-in default grammar level is a union of the HTML grammar levels 1.0, 2.0, 3.0, 3.2, and proposed 4.0, plus selected browser-vendor extensions.

A properly-installed html-pretty always supports at least grammar levels 1.0, 2.0, 3.0, 3.2, 4.0, plus one named all which contains a union of all of the grammar styles in the catalog directory. This will be similar to the built-in default style, but may differ from it in minor details, since it is easier to augment the style file than it is to rebuild and reinstall the software.

There is also a `grammar level' named dtd; it augments the normal style rules with additional ones for extra declarations and tags found in SGML and HTML Document Type Definition (DTD) files.

If you specify an incorrect grammar level, html-pretty will display a message showing the available levels, and then quit:

html-pretty -g foo
:1:could not find grammar level [foo] in
catalog file [/usr/local/share/lib/html-pretty/catalog]:
    levels available:  all dtd 2 2.0 3 3.0 3.2 4 4.0

If the style file has a public style class entry, the output DOCTYPE declaration will use its value, instead of a built-in default; see the STYLE FILES section below for details.

Tip: if you want to find out the differences in the sets of HTML tags accepted by two different grammar versions, use this option with the -print-stylefile and -width options like this:

html-pretty -w 0 -g 3.2 -p >foo.3.2
html-pretty -w 0 -g 4.0 -p >foo.4.0
diff foo.3.2 foo.4.0

The difference listing will show just those style classes where the tag lists differ, since each class will be complete on one (possibly long) line.

-help or -?

Display brief usage information on stderr and quit after processing any remaining command-line options that precede the next command-line HTML file.

-indent nnn

Set the number of spaces for each indentation level [default: 4].

-keep-format

Treat the input stream as verbatim text whose visual format is to be exactly preserved. It will be converted to the body of a preformatted HTML environment, <PRE> ... </PRE>, and all characters will be output unchanged, except for the four characters `<', `>', `&', and `"', which will be translated to the SGML entities `&lt;', `&gt;', `&amp;', and either `&quot;' or `&#34;' (depending on the grammar level), respectively.
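The character translation amounts to four substitutions: less-than to &lt;, greater-than to &gt;, ampersand to &amp;, and double quote to either &quot; or &#34;. A minimal Python sketch follows; escape_verbatim is a hypothetical name, and because this section does not say which grammar levels use which quote entity, that choice is left as a parameter.

```python
def escape_verbatim(text, quote_entity="&quot;"):
    """Translate the four special characters to SGML entities."""
    return (
        text.replace("&", "&amp;")      # first, so later entities survive
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace('"', quote_entity)
    )
```

The ampersand must be handled before the others; otherwise the ampersands introduced by the remaining replacements would themselves be escaped a second time.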

This option is useful for converting non-HTML text to HTML when the linebreaking and indentation are already decided, such as preformatted tables, samples of computer program input and output, and programming language fragments.

Horizontal tab characters are treated like spaces in HTML, but most Web browsers will display text containing tabs as if each tab caused blank fill up to, and including, the next column which is a multiple of eight, which is the conventional behavior on many systems. The visual appearance is correct if the text is displayed with a fixed-width font, but is usually wrong with a proportionally-spaced font.

However, when asked to save the file as plain text, some browsers will save text with tabs, and others will save text with tabs expanded to spaces, thereby preserving the original appearance. When the same text is cut with the mouse and pasted into another window, some browsers preserve tabs, and others convert them to spaces. And even more confusingly, a browser that converts tabs to spaces in a saved file may leave tabs intact in cut-and-pasted text!

In most cases, tabs are not significant, but they are for UNIX Makefiles, so if your text has significant tabs, you should incorporate a warning that they will likely be destroyed by the Web browser, and will need to be restored manually, or with a pass through a filter such as unexpand(1).

html-pretty always treats tabs as ordinary characters, but if you want to be sure of correct display of tabs according to the conventional blank-fill rule, you should filter the input to html-pretty using a tab expander such as expand(1).
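The conventional blank-fill rule, in which each tab pads with spaces through the next column that is a multiple of eight, is what a tab expander applies. Where expand(1) is not at hand, the rule is easy to state in Python; the sketch below is an illustration only, expand_tabs is a hypothetical name, and for a single line it gives the same result as the built-in str.expandtabs(8).

```python
def expand_tabs(line, tabstop=8):
    """Expand tabs in one line using the conventional blank-fill rule."""
    out = []
    column = 0                      # characters emitted so far on this line
    for ch in line:
        if ch == "\t":
            # Pad with spaces up to the next multiple of tabstop.
            fill = tabstop - (column % tabstop)
            out.append(" " * fill)
            column += fill
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

A tab that follows exactly eight characters therefore expands to a full eight spaces, not zero, which is a common source of off-by-one surprises in hand-written expanders.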

For Fortran code containing tab characters, the meaning of tabs is context sensitive, and you need to use a Fortran-tab aware converter. There is no standard UNIX utility to do this, although a local emacs(1) editor function M-x detab-fortran will do the job.

-logfile filename

Redirect warning and error messages from stderr to the indicated filename. This option is provided for user convenience on poorly-designed operating systems (e.g., IBM PC DOS) that fail to provide for redirection of stderr to a specified file.

This option can also be used to discard messages; on UNIX systems, for example, use -logfile /dev/null.

If the file cannot be opened for output, html-pretty will terminate silently with a non-zero exit code, because the internal attempted redirection required the closure of stderr, making it unavailable for printing further error messages.

-no-blank-line-warning

Override an earlier -blank-line-warning option. Unlike in most word processing and typesetting systems, and in ordinary typewritten text, blank lines in HTML and SGML do not imply a paragraph break, except in verbatim environments. html-pretty will therefore normally silently reduce such blank lines to a single space [default: -no-blank-line-warning].

If you specify the -convert-paragraph-breaks option, then this option will have no effect.

-no-brief

Override an earlier -brief option, so that the standard wrapper !DOCTYPE, HTML, HEAD, and BODY tags are generated [default: -no-brief].

-no-check-tag-nesting

Suppress checks for correct tag nesting [default: -no-check-tag-nesting].

-no-comment-banner

Suppress generation of the default leading comment banner [default: -comment-banner].

This option can be abbreviated to -n, since it may be used frequently.

-no-convert-paragraph-breaks

Do not convert paragraph breaks (sequences of one or more blank or empty lines) to HTML <P> tags [default: -no-convert-paragraph-breaks]. Instead, unless the -blank-line-warning option has been given, the empty lines will be reduced to a single space.

-no-keep-format

Override an earlier -keep-format option, and thus permit reformatting of the input stream [default: -no-keep-format].

-no-print-stylefile

Override an earlier -print-stylefile option, and therefore, do not terminate execution before processing style files [default: -no-print-stylefile].

-no-quiet

Override an earlier -quiet option, to restore output of warning messages on stderr [default: -no-quiet].

-no-read-stylefiles

Suppress reading of the three default style files (see the STYLE FILES section below) [default: -read-stylefiles].

Style files implicitly specified with -grammar-level options, or explicitly specified with -stylefile options, will still be processed.

-no-trace-opens

Override an earlier -trace-opens option, so that file opening attempts are not traced on stderr [default: -no-trace-opens].

-no-unknown-tag-warning

Override an earlier -unknown-tag-warning option, so that unknown HTML tags will not elicit a warning message [default: -no-unknown-tag-warning].

-no-warnings-in-comments

Override an earlier -warnings-in-comments option, so that warning and error messages are written only to stderr (which might have been redirected with the -logfile option) [default: -no-warnings-in-comments].

-outfile filename

Redirect output from stdout to the indicated filename. This option is provided for user convenience on operating systems that fail to provide for redirection of stdout to a specified file.

-personal-name name

Supply an alternate personal name string to be used in the comment banner. Normally, the personal name is determined via the current user name, when that information is available from the operating system.

-print-stylefile

Print a style file on stdout and quit after processing any remaining command-line options that precede the next command-line HTML file.

This option can be abbreviated to -p.

The output line width in the style file is governed by the -width option described later.

Startup option files, and any preceding -extend-style, -grammar-level, and -stylefile options will already have been processed, and their changes will be reflected in the output produced by this option.

Use this option to find out how each recognized tag is processed, and also to get a template style file that you can customize. The style class names and tag lists are ordered alphabetically for improved readability. For more details, see the STYLE FILES section below.

-quiet

Suppress output of warning messages on stderr. Error messages will, however, not be suppressed.

If you really want to discard error messages too, then use the -logfile /dev/null option instead.

-read-stylefiles

Override an earlier -no-read-stylefiles option, so that the three standard style files (see the STYLE FILES section below) are read [default: -read-stylefiles].

-stylefile filename

Name a style file (see the STYLE FILES section below) to augment html-pretty's built-in knowledge of HTML tags. This option may be given more than once, if multiple initialization files are needed.

-trace-opens

Trace all file-opening attempts, with a one-line message on stderr for each [default: -no-trace-opens].

This option can be helpful in analyzing file access failures, and in uncovering the location of default style files and the catalog directory.

-unknown-tag-warning

Unknown HTML tags are always treated as ordinary text, but with this option, you can request that a warning be raised for each of them [default: -no-unknown-tag-warning].

While this option is not a substitute for a validating HTML parser, such as html-check(1) or html-ncheck(1), it does provide a simple way to catch use of non-standard tags. See the -extend-style option for a way to catch specific tags.

-version

Show version information on stderr and quit after processing any remaining command-line options that precede the next command-line HTML file.

-warnings-in-comments

Write warning and error messages as easily-identifiable single-line HTML comments on stdout [default: -no-warnings-in-comments]. This may facilitate correction of problems in the output file, since the messages can be located by context, rather than by line number in the input file.

Messages are always written to stderr as well (which might have been redirected with the -logfile option, or suppressed with the -quiet option).

-width nnn

Set the maximum output line width [default: 72]. This limit may be exceeded if an excessively long string without embedded spaces is encountered, and it is ignored completely inside preformatted or verbatim text.

If you set the line width to 0, it will be treated as `infinite' (i.e., the largest representable integer). If you set it to a very small value, e.g., 1, then the output will have one word per line, which might actually be useful on occasion!

This option can be abbreviated to -w.
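The effect of the width limit, including the two special cases of 0 and very small values, can be illustrated with a simple greedy line filler. The Python sketch below is not html-pretty's formatting algorithm; fill is a hypothetical name, and indentation and tag handling are ignored.

```python
import sys


def fill(words, width=72):
    """Greedily pack words into lines no longer than width where possible."""
    # A width of 0 is treated as 'infinite' (the largest representable integer).
    limit = sys.maxsize if width == 0 else width
    lines = []
    current = ""
    for word in words:
        if not current:
            current = word          # a lone over-long word may exceed the limit
        elif len(current) + 1 + len(word) <= limit:
            current += " " + word
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines
```

With width 0 all words land on one line; with width 1 every word gets its own line; and a single word longer than the limit is emitted whole, which is why the limit may occasionally be exceeded.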
