Reading and writing .tar and .tar.gz files
The tarfile module enables read and write access to both plain and gzipped TAR files.
TAR is a widely used file archive format, found mostly in the *NIX world. It does not implement
data compression itself, so in most cases a TAR archive is filtered through GZIP to reduce
its size. TAR files have the suffix .tar; gzipped TAR files end with .tar.gz or .tgz.
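The effect of filtering a TAR archive through GZIP can be sketched with the tarfile module from the current Python standard library. Its API differs from the module described here; the helper name archive_sizes and the file names are made up for illustration.

```python
import os
import tarfile
import tempfile


def archive_sizes(payload):
    """Return (plain_tar_size, gzipped_tar_size) for one member holding payload."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "data.txt")
        with open(src, "wb") as fh:
            fh.write(payload)

        sizes = []
        # "w" writes an uncompressed archive, "w:gz" filters it through gzip.
        for name, mode in (("plain.tar", "w"), ("packed.tar.gz", "w:gz")):
            target = os.path.join(tmp, name)
            with tarfile.open(target, mode) as tar:
                tar.add(src, arcname="data.txt")
            sizes.append(os.path.getsize(target))
        return sizes[0], sizes[1]


# Highly repetitive input, so the gzipped archive ends up much smaller.
plain_size, gz_size = archive_sizes(b"tar does not compress " * 2000)
```

The plain archive is at least as large as its payload, because TAR only adds headers and padding; the size reduction comes entirely from the gzip step.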
An earlier Python module for TAR files, also called tarfile, was written by Jason Petrone,
but it could only read archives.
This module was modeled on the zipfile standard module. All basic methods and functions are compatible
with zipfile. This should make it easy to add TAR file support to existing projects or to offer two archive formats in a
program.
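A program that offers both archive formats might look roughly like the following sketch. It uses the zipfile and tarfile modules of the current Python standard library, where the method names are similar but not identical (namelist() versus getnames()); the helper names list_members, make_zip and make_tar are hypothetical.

```python
import io
import tarfile
import zipfile


def list_members(data):
    """Return the member names of an in-memory ZIP or TAR archive."""
    buf = io.BytesIO(data)
    if zipfile.is_zipfile(buf):
        buf.seek(0)
        with zipfile.ZipFile(buf) as zf:
            return zf.namelist()
    buf.seek(0)
    # "r:*" lets tarfile detect plain or compressed TAR data on its own.
    with tarfile.open(fileobj=buf, mode="r:*") as tf:
        return tf.getnames()


def make_zip():
    """Build a tiny ZIP archive in memory (sample data for the sketch)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hi")
    return buf.getvalue()


def make_tar():
    """Build a tiny TAR archive in memory (sample data for the sketch)."""
    buf = io.BytesIO()
    payload = b"hi"
    with tarfile.open(fileobj=buf, mode="w") as tf:
        info = tarfile.TarInfo("hello.txt")
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


zip_names = list_members(make_zip())
tar_names = list_members(make_tar())
```

Hiding the format check inside one function keeps the rest of the program independent of which archive type it was given.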
However, the TAR format is much more sophisticated than ZIP in handling different file types, file permissions and file
ownership. Because of that, tarfile offers some additional methods. For detailed information on the TAR format, consult the
section "The Standard Format" in the GNU tar manual.
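As an illustration of the permission and ownership data a TAR member carries, here is a minimal sketch using the tarfile module of the current Python standard library (not necessarily the same attribute names as in the module described here). The member name, ids and user names are invented sample values.

```python
import io
import tarfile

payload = b"#!/bin/sh\necho hello\n"

# Write one member with explicit permission bits and ownership.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    info = tarfile.TarInfo(name="bin/run.sh")
    info.size = len(payload)
    info.mode = 0o755                      # rwxr-xr-x
    info.uid, info.gid = 1000, 1000        # numeric owner and group
    info.uname, info.gname = "builder", "staff"
    tar.addfile(info, io.BytesIO(payload))

# Read the archive back and inspect the stored metadata.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    member = tar.getmember("bin/run.sh")
    stored = (member.mode, member.uid, member.gid, member.uname, member.gname)
```

None of these fields has a direct counterpart in a basic ZIP entry, which is why a TAR interface needs methods beyond those of zipfile.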
Please note that the tarfile module is not intended to be a replacement for a full-blown command-line tar program!
See TarFile Objects for more details.
See TarInfo Objects.
Returns true if file is a valid TAR archive, otherwise false. It looks for the magic string in the first block.
file may be a filename or a file-like object. If file points to an empty file, is_tarfile returns true, too.
'.gz' is appended to file as target filename.
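The is_tarfile check described above can be tried out with the function of the same name in the current Python standard library. This is only a sketch: the standard library version may treat edge cases such as empty files differently from the module documented here, so only the two clear cases are shown. File names are sample values.

```python
import io
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    tar_path = os.path.join(tmp, "sample.tar")
    txt_path = os.path.join(tmp, "notes.txt")

    # A real archive with a single three-byte member.
    with tarfile.open(tar_path, "w") as tar:
        info = tarfile.TarInfo("a.txt")
        info.size = 3
        tar.addfile(info, io.BytesIO(b"abc"))

    # An ordinary text file that is not an archive.
    with open(txt_path, "w") as fh:
        fh.write("this is plain text, not an archive\n")

    tar_ok = tarfile.is_tarfile(tar_path)
    txt_ok = tarfile.is_tarfile(txt_path)
```

Checking before opening avoids having to catch a read error when a user passes the wrong kind of file.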
gzip(). If file is not given, it tries to convert the extension '.gz' or '.tgz' to '.tar'.
sys.stdout. You need this if you want to write a gzipped TarFile to sys.stdout. filename is the desired filename of
the TAR file in the gzip file (e.g. "sources.tar"). This is needed because gzip files contain the filenames
of the original files, and sys.stdout has no proper name.
Please note that Tarfile.debug is set to 0 (!).
sys.stdout is set to binary mode implicitly.
tarfile = TarFile(stdout("sources.tar"), "w", TAR_GZIPPED)
tarfile.debug = 0  # suppress debug messages
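For comparison, writing a gzipped TAR archive to a non-seekable stream such as standard output could be sketched as follows with the tarfile module of the current Python standard library. This is not the stdout() helper documented here: the function name write_gzipped_tar is hypothetical, and the sketch does not set a filename inside the gzip header. To keep it testable, it writes to an in-memory buffer; passing sys.stdout.buffer instead would send the archive to standard output.

```python
import io
import tarfile


def write_gzipped_tar(stream, files):
    """Write files (a dict of name -> bytes) as a gzipped TAR to a binary stream."""
    # "w|gz" is a stream mode: the target is written sequentially and never seeked,
    # which is what a pipe or terminal requires.
    with tarfile.open(fileobj=stream, mode="w|gz") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


# Demonstration against an in-memory stream, then read the result back.
buffer = io.BytesIO()
write_gzipped_tar(buffer, {"hello.txt": b"hello\n"})
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as tar:
    names = tar.getnames()
    content = tar.extractfile("hello.txt").read()
```

The stream must be a binary one, which matches the note above that sys.stdout has to be in binary mode.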
The module additionally defines some constants:
gzip standard module.
TAR-specific type constants:
Some GNU tar special types:
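The constants defined by the module described here are not reproduced in this text. Purely as an illustration of how such type constants are used, the sketch below relies on the names found in the tarfile module of the current Python standard library, which may not match this module's names. The TYPE_NAMES table and the describe helper are hypothetical.

```python
import tarfile

# Map a few standard and GNU-specific member types to readable labels.
TYPE_NAMES = {
    tarfile.REGTYPE: "regular file",
    tarfile.DIRTYPE: "directory",
    tarfile.SYMTYPE: "symbolic link",
    tarfile.LNKTYPE: "hard link",
    tarfile.GNUTYPE_LONGNAME: "GNU long name",
    tarfile.GNUTYPE_LONGLINK: "GNU long link name",
    tarfile.GNUTYPE_SPARSE: "GNU sparse file",
}


def describe(info):
    """Return a readable label for the type of a TarInfo object."""
    return TYPE_NAMES.get(info.type, "other")


regular = tarfile.TarInfo("a.txt")      # TarInfo objects default to REGTYPE
directory = tarfile.TarInfo("docs")
directory.type = tarfile.DIRTYPE

labels = (describe(regular), describe(directory))
```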