tar
Certain options let you change the file name of the archive, choose which files to work on according to their characteristics, or control how file names correspond to the names of archive members.
By default, tar uses an archive file name compiled in when tar was built. Usually this refers to some physical tape drive on the machine. Often, the installer of tar didn't set the default to anything meaningful at all. As a result, most uses of tar need to tell tar where to find (or create) the archive. The `--file=archive-name' (`-f archive-name') option selects another file to use as the archive.
If the archive file name includes a colon (`:'), then it is assumed to be a file on another machine. If the archive file is `user@host:file', then file is used on the host host. The remote host is accessed using the rsh program, with a username of user. If the username is omitted (along with the `@' sign), then your user name will be used. (This is the normal rsh behavior.) In addition to permitting your rsh access, the remote machine must have the `/usr/ucb/rmt' program installed. If you need to use a file whose name includes a colon, then the remote tape drive behavior can be inhibited by using the `--force-local' option.
If the file name you give to `--file=archive-name' (`-f archive-name') is a single dash (`-'), then tar will read the archive from (or write it to) standard input (or standard output).
An archive can be saved as a file in the file system, sent through a pipe or over a network, or written to an I/O device such as a tape or disk drive. To specify the name of the archive, use the `--file=archive-name' (`-f archive-name') option.
An archive name can be the name of an ordinary file or the name of an I/O device. tar always needs an archive name: if you do not specify one, the archive name comes from the environment variable TAPE or, if that variable is not set, from a default archive name, which is usually the name of tape unit zero (i.e., /dev/tu00).
If you use `-' as an archive-name, tar reads the archive from standard input (when listing or extracting files), or writes it to standard output (when creating an archive). If you use `-' as an archive-name when modifying an archive, tar reads the original archive from its standard input and writes the entire new archive to its standard output.
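The standard input and output behavior described above can be sketched as follows. This is only an illustration, assuming GNU tar and a POSIX shell; the scratch directory and the names `project' and `notes.txt' are made up for the example.

```shell
# Hypothetical scratch area; every name below is illustrative only.
work=$(mktemp -d)
mkdir "$work/project"
echo "hello" > "$work/project/notes.txt"

# --create with --file=- writes the archive to standard output;
# --list with --file=- reads the archive from standard input.
listing=$(cd "$work" && tar --create --file=- project | tar --list --file=-)
echo "$listing"

rm -rf "$work"
```

The archive never touches the disk here: the first tar writes it into the pipe and the second one lists it from the pipe.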
@FIXME{Does standard input and output redirection work with all operations? Need example for standard input and output (screen and keyboard?)}
To specify an archive file on a device attached to a remote machine, use the following:
--file=hostname:/dev/file name
tar will complete the remote connection, if possible, and prompt you for a username and password. If you use `--file=@hostname:/dev/file name', tar will complete the remote connection, if possible, using your username as the username on the remote machine.
@FIXME{is this clear?}
File Name arguments specify which files in the file system tar operates on, when creating or adding to an archive, or which archive members tar operates on, when reading or deleting from an archive. See section Basic tar Operations.
To specify file names, you can include them as the last arguments on the command line, as follows:
tar operation [option1 option2 ...] [file name-1 file name-2 ...]
If you specify a directory name as a file name argument, all the files in that directory are operated on by tar.
If you do not specify files when tar is invoked, tar operates on all the non-directory files in the working directory (if the operation is `--create' (`-c')), all the archive members in the archive (if a read operation is specified), or does nothing (if any other operation is specified).
By default, tar takes the names of files or members from the command line. There are other ways, however, to specify file or member names, or to modify the manner in which tar selects the files or members upon which to operate. In general, these methods work both for specifying the names of files and archive members.
GNU tar specially recognizes when the archive is being written to `/dev/null' and tries to minimize input and output operations in this case. The Amanda backup system, when used with GNU tar, has an initial sizing pass which uses this feature.
Instead of giving the names of files or archive members on the command line, you can put the names into a file, and then use the `--files-from=file-of-names' (`-T file-of-names') option to tar. Give the name of the file which contains the list as the argument to `--files-from=file-of-names' (`-T file-of-names'). The file names should be separated by newlines in the list. If you give a single dash as a file name for `--files-from=file-of-names' (`-T file-of-names'), that is, you specify `--files-from=-' (`-T -'), then the file names are read from standard input.
If you want to specify names that might contain newlines, use the `--null' option. Then, the file names should be separated by NUL characters (ASCII 000) instead of newlines. In addition, the `--null' option turns off the `--directory=directory' (`-C directory') option (see section Changing Directory).
Instead of taking the list of files to work on from the command line, the list of files to work on is read from the file filename. If filename is given as `-', the list is read from standard input. Note that using both `-T -' and `-f -' will not work unless you are using the `--create' (`-c') command.
This is typically useful when you have generated the list of files to archive with find.
The `--null' option causes `--files-from=file-of-names' (`-T file-of-names') to read file names terminated by a NUL instead of a newline, so files whose names contain newlines can be archived using `--files-from=file-of-names' (`-T file-of-names').
The `--null' option is just like the one in GNU xargs and cpio, and is useful with the `-print0' predicate of GNU find. In tar, `--null' also causes `--directory=directory' (`-C directory') options to be treated as file names to archive, in case there are any files out there called `-C'.
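As a rough sketch of how `--null' combines with `-print0', assuming GNU tar and GNU find; the directory `src' and the file names are invented for the example, and one of them deliberately contains a newline:

```shell
# Hypothetical scratch area; every name below is illustrative only.
work=$(mktemp -d)
mkdir "$work/src"
echo "a" > "$work/src/plain.txt"
# A file name containing a newline, which a newline-separated list cannot express.
echo "b" > "$work/src/odd
name.txt"

# find emits NUL-terminated names; --null (given before --files-from)
# makes tar read the list the same way.
(cd "$work" && find src -type f -print0 \
  | tar --create --file=names.tar --null --files-from=-)

members=$(tar --list --file="$work/names.tar")
echo "$members"

rm -rf "$work"
```

Note that `--null' is placed before `--files-from=-' on the command line, so that it is already in effect when the list is read.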
Reading a List of File Names from a File
To read file names from a file on the file system, instead of from the command line, use the `--files-from=file-of-names' (`-T file-of-names') option. If you specify `-' as file, the file names are read from standard input. Note that using both `--files-from=-' (`-T -') and `--file=-' (`-f -') in the same command will not work unless the operation is `--create' (`-c'). See section Changing the Archive Name, for an explanation of the `--file=archive-name' (`-f archive-name') option.
The `--exclude-from=file-of-patterns' (`-X file-of-patterns') option causes tar to read a list of patterns (in shell wildcard syntax), one per line, from file-of-patterns; tar will ignore files matching those patterns. Thus if tar is called as `tar -c -X foo .' and the file `foo' contains a single line `*.o', no files whose names end in `.o' will be added to the archive. Multiple `--exclude=pattern' options may be given.
The `--exclude=pattern' option will prevent any file or member which matches the shell wildcard pattern from being operated on. For example, if you want to create an archive with all the contents of `/tmp' except the file `/tmp/foo', you can use the command `tar --create --file=arch.tar --exclude=foo /tmp'.
If there are many files you want to exclude, you can use the `--exclude-from=file-of-patterns' (`-X file-of-patterns') option. This works just like the `--files-from=file-of-names' (`-T file-of-names') option: specify as file-of-patterns the name of a file which contains the list of patterns you want to exclude.
To avoid operating on files whose names match a particular pattern, use the `--exclude=pattern' or `--exclude-from=file-of-patterns' (`-X file-of-patterns') options. When you specify the `--exclude=pattern' option, tar ignores files which match the pattern, which can be a single file name or a more complex expression. Thus, if you invoke tar with `tar --create --exclude=*.o', no files whose names end in `.o' are included in the archive.
A pattern should be written according to shell syntax, using wildcard characters to effect globbing. Most characters in the pattern stand for themselves in a file name, and case is significant: `a' will match only `a', and not `A'. The character `?' in the pattern matches any single character in the file name. The character `*' in the pattern matches zero or more characters in the file name. The character `[', up to the matching `]', introduces a character class, and is described in the next paragraph. The character `\' in a pattern makes the following character of the pattern match itself literally in the file name; it is useful when one needs to match `?', `*', `[' or `\' themselves.
A character class is a list of acceptable characters for the next single character of the file name. However, if the first character of the class, just after the opening `[', is `!' or `^', then the meaning of the class is reversed, and it instead lists those characters which are forbidden as the next single character of the file name. Other characters of the class stand for themselves. The special construction `l-m', using a hyphen between two letters, represents all characters from l to m inclusive.
Periods (`.') or slashes (`/') are not considered special for wildcard matches. However, if a pattern completely matches a directory prefix of a file name, then it matches the full file name: that is to say, excluding a directory also excludes all the files beneath it.
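The pattern rules above can be tried out with a small sketch like the following. It assumes GNU tar and a POSIX shell; the `build' tree and its file names are made up for the example.

```shell
# Hypothetical scratch area; every name below is illustrative only.
work=$(mktemp -d)
mkdir -p "$work/build/sub"
echo "int main(void) { return 0; }" > "$work/build/main.c"
echo "object" > "$work/build/main.o"
echo "object" > "$work/build/sub/util.o"

# The single quotes keep the shell from expanding the * wildcard,
# so tar receives the pattern itself and applies it at every depth.
kept=$(cd "$work" && tar --create --file=- --exclude='*.o' build \
  | tar --list --file=-)
echo "$kept"

rm -rf "$work"
```

Because slashes are not special in the pattern, the one pattern is enough to drop the object file in the subdirectory as well.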
`--exclude-from=file-of-patterns' (`-X file-of-patterns') acts like `--exclude=pattern', but specifies a file file-of-patterns containing a list of patterns. tar ignores files with names that fit any of these patterns.
You can use either option more than once in a single command.
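`--exclude=pattern'
Causes tar to ignore files that match the pattern.
`--exclude-from=file-of-patterns' (`-X file-of-patterns')
Causes tar to ignore files that match the patterns listed in file-of-patterns.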
Although the exclude options are fairly straightforward, some users find them confusing. Here is a list of the more common pitfalls, collected from the reports we have received.
On the command line of tar, after all options, there is an explicit list of files or directories to handle. If any such file is directly subject to an exclusion due to `--exclude=pattern' or `--exclude-from=file-of-patterns' (`-X file-of-patterns'), then the explicit file prevails over the exclusion. You may consider that exclusion is effected while directories are recursively traversed, but not at the top level.
You must use proper shell quoting so that tar, rather than the shell, sees wildcard characters like `*'. If you do not do this, the shell might expand the `*' itself using files at hand, so tar might receive a list of files instead of one pattern, or none at all, making the command invalid. This might not correspond to what one wants. For example, write:
tar --create --file=archive.tar --exclude='*/tmp/*' directory
rather than:
tar --create --file=archive.tar --exclude=*/tmp/* directory
Exclude options for tar use shell syntax, or globbing, rather than regexp syntax. Using regexp syntax to describe files to be excluded will generally not yield the expected behavior.
In versions of tar prior to 1.11.8, long options and old options could not be safely mixed on a single tar invocation, so the `--exclude' option was not recognized as such in the common case where old options were used.
tar
, at a time before the current `--exclude'
option functionality existed.
The `--after-date=date' (`-N date') option causes tar to work only on files whose modification or inode-changed times are newer than the date given. The main use is for creating an archive; then only new files are written. If extracting, only newer files are extracted.
See section Date input formats, for what is an acceptable date. Remember that the entire date argument must be quoted if it contains any spaces.
To operate only on files with modification or status-change times after a particular date, use `--after-date=date' (`-N date'). You can use this option with `--create' (`-c') or `--append' (`-r') to ensure only new files are archived, or with `--extract' (`-x') to ensure only recent files are resurrected.
`--newer-mtime=date' acts like `--after-date=date' (`-N date') but tests just the modification times of the files, ignoring status-change times. @FIXME{Need example of --newer-mtime with quoted argument.}
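A possible example of `--newer-mtime=date' with a quoted argument is sketched below. It assumes GNU tar and a POSIX shell; the `docs' directory, the file names, and the dates are arbitrary choices made for illustration.

```shell
# Hypothetical scratch area; every name and date below is illustrative only.
work=$(mktemp -d)
mkdir "$work/docs"
echo "old" > "$work/docs/old.txt"
echo "new" > "$work/docs/new.txt"
# Give the two files known modification times (format: CCYYMMDDhhmm).
touch -t 200101011200 "$work/docs/old.txt"
touch -t 202006151200 "$work/docs/new.txt"

# The date contains a space, so the whole argument is quoted.
# Messages about files that were skipped go to standard error, discarded here.
recent=$(cd "$work" && tar --create --file=- \
    --newer-mtime='2010-01-01 00:00:00' docs 2>/dev/null \
  | tar --list --file=-)
echo "$recent"

rm -rf "$work"
```

Only the file modified after the given date should appear in the listing, along with the directory that contains it.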
Please Note: `--after-date=date' (`-N date') and `--newer-mtime=date' should not be used for incremental backups. Some files (such as those in renamed directories) are not selected properly by these options. @FIXME-xref{to incremental backup chapter when node name is decided.}
The `--after-date=date' (`-N date') or `--newer=date' option limits tar to operating only on files which have been modified after the date specified. For more information on how to specify a date, see section Date input formats. A file is considered to have changed if the contents have been modified, or if the owner, permissions, and so forth, have been changed.
If you want tar to make the date comparison based only on when the contents of the file were last modified, then use the `--newer-mtime=date' option.
You should never use this option for making incremental dumps. To learn how to use tar to make backups, see section Performing Backups and Restoring Files.
To select only files newer than the modification time of an already existing file, one can use the `--reference' (`-r') option of GNU date. This option has existed for a while now, but might not have been officially published yet. It returns the timestamp of a file. So, one could use one of:
tar --create --file=archive --newer="`date --reference=file`" ...
tar cf archive --newer="`date -r file`" ...
Or else, slightly less easily, one may use `find -newer file' to get a list of files newer than the reference file, and pipe the result to tar using the `--files-from=file-of-names' (`-T file-of-names') option. For example:
find ... -depth -newer file \
  | tar --create --file=archive --no-recursion --files-from=-
find ... -depth -newer file | tar cfT archive - --no-recursion
Descending into Directories
Usually, all directories given on the command line, or through the `--files-from=file-of-names' (`-T file-of-names') option, will be recursively explored for the various files they contain. This is not always convenient. One may use find to recurse through directories and construct a list of file names for tar, without having tar do recursion on its own. The `! -type d' option to find could help, but would avoid saving empty directories. So, GNU tar offers a `--no-recursion' option meant to inhibit its automatic descent into directories.
`--no-recursion'
Prevents tar from recursively descending directories.
The option `--no-recursion' inhibits the automatic action of tar, which recursively descends into specified directories. When this option is selected, GNU tar grabs directory entries themselves, but does not recurse into them. Many people use find for locating files they want to back up, and since tar usually recurses on directories, they have to use the `! -type d' option to find as they usually do not want all the files in a directory. They then use the `--files-from=file-of-names' (`-T file-of-names') option to archive the files located via find.
When restoring files archived in this manner, the problem is that the directories themselves are not in the archive, so the `--same-permissions' (`-p') option does not affect them, while users might really like it to. So, `--no-recursion' is a way to tell tar to grab only the directory entries given to it, adding no new files on its own.
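The difference can be seen in a small sketch such as the one below. It assumes GNU tar, where the long option is spelled `--no-recursion', and a POSIX shell; the `tree' directory and its contents are invented for the example.

```shell
# Hypothetical scratch area; every name below is illustrative only.
work=$(mktemp -d)
mkdir -p "$work/tree/inner"
echo "data" > "$work/tree/inner/file.txt"

# Without recursion, only the directory entry named on the command line is stored.
shallow=$(cd "$work" && tar --create --file=- --no-recursion tree \
  | tar --list --file=-)
# With the default behavior, everything beneath it is stored as well.
deep=$(cd "$work" && tar --create --file=- tree | tar --list --file=-)

echo "shallow: $shallow"
echo "deep:"
echo "$deep"

rm -rf "$work"
```

When the names come from find, each directory and file is listed explicitly, so the directory entries are stored once and nothing is added behind your back.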
The `--one-file-system' (`-l') option causes tar to not cross filesystem boundaries when archiving parts of a directory tree. This option only affects files that are archived because they are in a directory that is archived; files named on the command line are archived regardless, and they can be from various file systems.
This option is useful for making full or incremental archival backups of a filesystem, as with the Unix dump command.
Files skipped due to this option are mentioned on standard error.
To avoid crossing file system boundaries when archiving parts of a directory tree, use `--one-file-system' (`-l'). This option only affects files that are archived because they are in a directory that is being archived; files explicitly named on the command line are archived regardless of where they reside. This option is useful for making full or incremental archival backups of a file system. If this option is used in conjunction with `--verbose' (`-v'), files that are excluded are mentioned by name on the standard error.
`--one-file-system' (`-l')
Prevents tar from crossing file system boundaries when archiving. Use in conjunction with any write operation.
The `--one-file-system' (`-l') option causes tar to modify its normal behavior in archiving the contents of directories. If a file in a directory is not on the same filesystem as the directory itself (because it is a mounted filesystem in its own right), then tar will not archive that file, or (if it is a directory itself) anything beneath it.
This does not necessarily limit tar to only archiving the contents of a single filesystem, because all files named on the command line, or through the `--files-from=file-of-names' (`-T file-of-names') option, will always be archived.
The `--directory=dir' (`-C dir') option causes tar to change into the directory dir before continuing. This option can be interspersed with the files tar is to work on. For example,
tar -c iggy ziggy -C baz melvin
will place the files `iggy' and `ziggy' from the current directory on the tape, followed by the file `melvin' from the directory `baz'. This option is especially useful when you have several widely separated files that you want to store in the same directory in the archive.
Here, the file `melvin' is recorded in the archive under the precise name `melvin', not `baz/melvin'. Thus, the archive will contain three files that all appear to have come from the same directory; if the archive is extracted with plain `tar -x', all three files will be created in the current directory.
Contrast this with the command:
tar -c iggy ziggy bar/melvin
which records the third file in the archive under the name `bar/melvin' so that, if plain `tar -x' is used, the third file will be created in a subdirectory named `bar'.
Suppose that, without changing your current directory, you want to call tar to dump files from, say, `/users/ctd/dipp'. Then `--directory=directory' (`-C directory') is for you. You could do things like:
tar cfC archive.tar /users/ctd/dipp .
(here `.' means the current directory once the `--directory=directory' (`-C directory') option has been obeyed).
Some people might want some option to extract everything from an archive into the current directory, ignoring the directory structure in the archive. This is so rarely proper that I doubt such an option would be really useful. It would only help getting around improper tar usage, and it might even encourage improper usage. In general, `--directory=directory' (`-C directory') might be used to produce archives with a cleaner structure in the first place.
The `--directory=directory' (`-C directory') option causes tar to change its current working directory to directory. Unlike most options, this one is processed at the point it occurs within the list of files to be processed. Consider the following command:
tar --create --file=foo.tar -C /etc passwd hosts -C /lib libc.a
This command will place the files `/etc/passwd', `/etc/hosts', and `/lib/libc.a' into the archive. However, the names of the archive members will be exactly what they were on the command line: `passwd', `hosts', and `libc.a'. The `--directory=directory' (`-C directory') option is frequently used to make the archive independent of the original name of the directory holding the files.
Note that `--directory=directory' (`-C directory') options are interpreted consecutively. If a `--directory=directory' (`-C directory') option specifies a relative file name, it is interpreted relative to the then current directory, which might not be the same as the original current working directory of tar, due to a previous `--directory=directory' (`-C directory') option.
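The consecutive interpretation of `-C' can be checked with a sketch like this one. It assumes GNU tar and a POSIX shell; the scratch directories named `etc' and `lib' merely stand in for the real ones and are created for the example.

```shell
# Hypothetical scratch area standing in for /etc and /lib.
work=$(mktemp -d)
mkdir "$work/etc" "$work/lib"
echo "hosts" > "$work/etc/hosts"
echo "libc" > "$work/lib/libc.a"

# Each -C takes effect for the file names that follow it. The second -C is
# relative, so it is resolved from the directory the first -C switched to.
(cd "$work" && tar --create --file="$work/foo.tar" -C etc hosts -C ../lib libc.a)

# The member names carry no trace of the directories they came from.
names=$(tar --list --file="$work/foo.tar")
echo "$names"

rm -rf "$work"
```

Writing the second option as `-C lib' instead would make tar look for `etc/lib', since by then the working directory is already `etc'.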
When using `--files-from=file-of-names' (`-T file-of-names') (see section Reading Names from a File), you can put `-C' options in the file list. Unfortunately, you cannot put `--directory' options in the file list. (This interpretation can be disabled by using the `--null' option.)
Changing the Working Directory Within a List of File Names
To change working directory in the middle of a list of file names, either on the command line or in a file specified using `--files-from=file-of-names' (`-T file-of-names'), use `--directory=directory' (`-C directory'). This will change the working directory to the directory directory after that point in the list. For example,
tar --create iggy ziggy --directory=baz melvin
will place the files `iggy' and `ziggy' from the current directory into the archive, followed by the file `melvin' from the directory `baz'. This option is especially useful when you have several widely separated files that you want to store in the same directory in the archive.
Note that the file `melvin' is recorded in the archive under the precise name `melvin', not `baz/melvin'. Thus, the archive will contain three files that all appear to have come from the same directory; if the archive is extracted with plain `tar --extract', all three files will be written in the current directory.
Contrast this with the command
tar -c iggy ziggy bar/melvin
which records the third file in the archive under the name `bar/melvin' so that, if the archive is extracted using `tar --extract', the third file will be written in a subdirectory named `bar'.
@FIXME{Need to test how extract deals with this, and add an example.}
By default, GNU tar drops a leading `/' on input or output. The `--absolute-names' (`-P') option turns off this behavior; it's equivalent to changing to the root directory before running tar (except it also turns off the usual warning message).
When tar extracts archive members from an archive, it strips any leading slashes (`/') from the member name. This causes absolute member names in the archive to be treated as relative file names. This allows you to have such members extracted wherever you want, instead of being restricted to extracting the member in the exact directory named in the archive. For example, if the archive member has the name `/etc/passwd', tar will extract it as if the name were really `etc/passwd'.
Other tar programs do not do this. As a result, if you create an archive whose member names start with a slash, they will be difficult for other people with an inferior tar program to use. Therefore, GNU tar also strips leading slashes from member names when putting members into the archive. For example, if you ask tar to add the file `/bin/ls' to an archive, it will do so, but the member name will be `bin/ls'.
If you use the `--absolute-names' (`-P') option, tar will do neither of these transformations.
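The default stripping of the leading slash at archive creation can be observed with a short sketch. It assumes GNU tar and a POSIX shell; the scratch directory and the file name `passwd' are made up for the example and have nothing to do with the real `/etc/passwd'.

```shell
# Hypothetical scratch area; mktemp -d returns an absolute path.
work=$(mktemp -d)
echo "sample" > "$work/passwd"

# The name handed to tar starts with a slash. tar reports the removal of
# the slash on standard error, which is discarded here.
member=$(tar --create --file=- "$work/passwd" 2>/dev/null | tar --list --file=-)

echo "given:  $work/passwd"
echo "stored: $member"

rm -rf "$work"
```

The stored member name is the given name minus its leading slash, so extracting the archive elsewhere recreates the file beneath the directory where tar is run.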
@FIXME{Is this what this does, or does it just preserve the slash?}
To archive or extract files relative to the root directory, specify the `--absolute-names' (`-P') option.
Normally, tar acts on files relative to the working directory, ignoring superior directory names when archiving, and ignoring leading slashes when extracting.
When you specify `--absolute-names' (`-P'), tar stores file names including all superior directory names, and preserves leading slashes.
If you only invoked tar from the root directory, you would never need the `--absolute-names' (`-P') option, but using this option may be more convenient than switching to root.
@FIXME{Should be an example in the tutorial/wizardry section using this to transfer files between systems.}
@FIXME{Is write access an issue?}
tar prints out a message about removing the `/' from file names. This message should appear once per GNU tar invocation. It represents something which ought to be told; ignoring what it means can cause very serious surprises later (as some angry users told us).
Some people, nevertheless, do not want to see this message. Wanting to play really dangerously, one may of course redirect the standard error of tar to the sink. For example, under sh:
tar cf archive.tar /home 2> /dev/null
Another solution, both nicer and simpler, would be to change to the `/' directory first, and then avoid absolute notation. For example:
(cd /; tar cf archive.tar home)
tar cfC archive.tar / home