Archiving, compressing, and extracting files are some of the most common tasks for a Linux administrator. If you've ever worked with "tarball" files that have a .tar, .tar.gz, .tar.bz2, or .tar.xz extension, there's a good chance they were created using the tar utility.
In this article, we'll demonstrate how you can use the tar utility to archive, compress, and extract files on Linux systems. We'll use Ubuntu 20.04 for all examples, but you can follow along on any Linux system that uses tar.
What is tar?
tar, short for "tape archive", is a command line tool for creating and extracting archives. Most Linux distributions ship the GNU implementation.
An archive is a single file that includes multiple files or directories. In open source and Linux communities, tarballs are one of the most common methods for distributing source code and other important files.
In addition to creating archives, tar can perform compression and decompression using several different compression utilities such as gzip and bzip2.
tar vs. gzip
When dealing with Linux archives, you're likely to hear about tar and gzip, often in similar contexts.
The basic difference between the two tools is: tar creates archives from multiple files, while gzip compresses files.
However, these tools are not mutually exclusive. tar can use gzip to compress the files it archives. tar's z switch makes the tar command use gzip.
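To make the relationship concrete, here is a minimal sketch that builds the same archive two ways: tar followed by a separate gzip step, and tar with the z switch. The directory and file names are hypothetical, and the listing step assumes GNU tar, which detects the compression automatically when reading.

```shell
set -eu

# Work in a throwaway directory so nothing else on the system is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data.
mkdir pepper
echo "salt" > pepper/one.txt
echo "sugar" > pepper/two.txt

# Option 1: archive with tar, then compress with gzip as a separate step.
tar -cf two-step.tar pepper
gzip two-step.tar            # replaces two-step.tar with two-step.tar.gz

# Option 2: let tar call gzip for us with the z switch.
tar -czf one-step.tar.gz pepper

# Both archives contain the same members.
tar -tf two-step.tar.gz > two-step.list
tar -tf one-step.tar.gz > one-step.list
diff two-step.list one-step.list && echo "same contents"
```

Either route ends with a gzip-compressed tarball; the z switch simply saves a step.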
gzip vs. bzip2 vs. xz Compression
gzip isn't the only compression program tar can use. It also supports bzip2 and xz. The table below details some of the basic differences between these compression tools.
| | gzip | bzip2 | xz |
|---|---|---|---|
| Compression algorithm | DEFLATE | Burrows–Wheeler | LZMA |
| Common file extensions | .tar.gz, .tgz, .gz | .tar.bz2, .bz2 | .tar.xz, .xz |
| tar command switch | -z | -j | -J |
Generally, gzip is the fastest of the three and the most widely used, while bzip2 usually produces somewhat smaller files at the cost of speed. xz tends to give the best overall compression but also takes more time and system resources.
Note: In our examples, we'll focus on using gzip. Replacing -z with -j in the commands will use bzip2 instead of gzip. Using -J instead of -z will use xz instead of gzip.
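If you want to compare the three compressors on your own data, a sketch along these lines may help. The make_archive helper and the sample file are hypothetical, and the script skips any compressor that is not installed. Actual sizes depend heavily on the input, so no expected output is shown.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data: one repetitive text file.
mkdir pepper
for i in $(seq 1 5000); do
    echo "line $i of a very repetitive log file"
done > pepper/sample.log

# Illustrative helper: program name, tar switch, file extension.
make_archive() {
    program=$1
    switch=$2
    ext=$3
    if command -v "$program" >/dev/null 2>&1; then
        tar "-c${switch}f" "egg.tar.$ext" pepper
    else
        echo "skipping $program (not installed)"
    fi
}

make_archive gzip z gz
make_archive bzip2 j bz2
make_archive xz J xz

# Compare the resulting file sizes.
ls -l egg.tar.*
```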
How to Compress a Single File or Directory
The general command to compress a single file or directory in Linux is:
tar -czvf <archive name> </path/to/file/or/directory>
Here is what each of those switches means:
- c: Create an archive.
- z: Run the archive through gzip.
- v: Verbosely list files.
- f: Use the archive file named next on the command line.
For example, to compress the /pepper directory to an archive named egg.tar.gz, run this command:
tar -czvf egg.tar.gz /pepper
The output will look similar to:
tar: Removing leading `/' from member names
/pepper/
/pepper/pepperAndegg.log
/pepper/pepperAndEgg.txt
/pepper/pepperandegg.log
If we omitted the v switch and instead used the command tar -czf egg.tar.gz /pepper, the output would not include each file. Instead, it would look similar to this:
tar: Removing leading `/' from member names
And if there were no errors or characters that needed to be removed from the member names (for example, if we were compressing files in our current working directory), there would be no output.
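The difference between absolute and relative paths can be seen in a small experiment. This is a sketch with hypothetical file names: it captures tar's messages (written to standard error) in text files so they can be inspected afterward.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample directory.
mkdir pepper
echo "log entry" > pepper/pepperAndEgg.log
echo "notes" > pepper/pepperAndEgg.txt

# Relative path: there is no leading "/" to strip, so tar stays silent.
tar -czf egg.tar.gz pepper 2> messages-relative.txt

# Absolute path: tar strips the leading "/" and reports it on stderr.
tar -czf egg-abs.tar.gz "$workdir/pepper" 2> messages-absolute.txt

echo "relative path messages:"
cat messages-relative.txt
echo "absolute path messages:"
cat messages-absolute.txt
```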
Note: There's more than one way to specify tar switches. You'll notice that we're using - before specifying our tar switches. While that is a common convention, it is not generally required. tar czvf <archive name> </path/to/file/or/directory> would work too, as would tar -cf <archive name> -vz </path/to/file/or/directory>. We'll stick to the convention we used here in the rest of our examples, but keep in mind there is more than one way to specify tar options.
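As a quick check, the sketch below creates the same archive with several option styles and confirms that the contents match. The file names are hypothetical, and the last form uses GNU tar's long option names (--create, --gzip, --file), which other tar implementations may not accept.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"

mkdir pepper
echo "hello" > pepper/one.txt

tar -czf dash.tar.gz pepper                     # switches bundled behind a dash
tar czf nodash.tar.gz pepper                    # traditional style, no dash
tar -cf split.tar.gz -z pepper                  # f is followed directly by its archive name
tar --create --gzip --file=long.tar.gz pepper   # GNU long option names

# List each archive's members so the results can be compared.
for archive in dash nodash split long; do
    tar -tf "$archive.tar.gz" > "$archive.list"
done

cat dash.list
```

Whichever style you pick, keep f immediately before the archive name, since tar treats the next argument as the file to use.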
How to Compress Multiple Files or Directories to a Single Archive
The general command to compress multiple files or directories into a single archive is:
tar -czvf <archive name> </path/to/file/or/directory1> </path/to/file/or/directory2> ... </path/to/file/or/directoryN>
For example, to compress the files one.txt, two.mp4, and three.iso to an archive named egg.tar.gz, run this command:
tar -czvf egg.tar.gz one.txt two.mp4 three.iso
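To try this without real media files, the following sketch creates small placeholder files with the same names in a scratch directory, archives them, and lists the result. The placeholder contents are assumptions made purely for illustration.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Small placeholder files standing in for the files named above.
echo "text" > one.txt
echo "video" > two.mp4
echo "image" > three.iso

# Archive and compress all three into one tarball.
tar -czvf egg.tar.gz one.txt two.mp4 three.iso

# Confirm what ended up in the archive.
tar -tf egg.tar.gz
```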
How to Exclude Directories and Files when Archiving
If you specify a directory to create an archive, there may be some files you want to exclude from the archive. The --exclude option lets you specify patterns to exclude from your archive.
Any file that matches the patterns passed to the --exclude option will NOT be included in the archive tar creates.
The general command to exclude files from a tar archive is:
tar --exclude=<PATTERN> <Options> <archive name> </path/to/directory>
For example, suppose we had these files in our /pepper directory:
- one.txt
- two.mp4
- three.iso
- four.log
- output.log
And we want to compress everything except the .log files to an egg.tar.gz archive. We can use this command:
tar --exclude='*.log' -czvf egg.tar.gz /pepper
If needed, you can specify multiple --exclude patterns in a single command.
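Here is a minimal sketch of multiple --exclude patterns in one command. It recreates the example files as empty placeholders in a scratch directory (a relative pepper directory rather than /pepper) and skips both the .log and .iso files.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Empty placeholder files matching the example listing.
mkdir pepper
touch pepper/one.txt pepper/two.mp4 pepper/three.iso pepper/four.log pepper/output.log

# Each --exclude adds another pattern; quote them so the shell does not expand them.
tar --exclude='*.log' --exclude='*.iso' -czvf egg.tar.gz pepper

# Only one.txt and two.mp4 (plus the directory itself) should remain.
tar -tf egg.tar.gz
```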
How to Add Files to an Existing Archive
If you have an existing archive and you want to add files to it, you can use the -r or --append option. A general command to append to .tar archives is:
tar -rf <tar archive> </path/to/file>
However, -r and --append are incompatible with compressed archives. That means you can only use them with tarballs you have not run through compression programs like gzip, bzip2, or xz.
If you attempt to use -r or --append on a compressed archive, you may see an error similar to:
tar: Cannot update compressed archives
tar: Error is not recoverable: exiting now
Because of this limitation and some of the other nuances of -r and --append, in many cases it's easier to create a new archive with the additional files.
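If you do need to add a file to a gzip-compressed tarball, one possible workaround is to decompress it, append, and compress again. The sketch below shows both the plain append and that round trip, using hypothetical file names in a scratch directory.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"

echo "first" > one.txt
echo "second" > two.txt
echo "third" > three.txt

# Appending works directly on an uncompressed archive.
tar -cf egg.tar one.txt
tar -rf egg.tar two.txt

# Compress it, as if it had been delivered as egg.tar.gz.
gzip egg.tar                 # egg.tar -> egg.tar.gz

# Workaround for a compressed archive: decompress, append, recompress.
gunzip egg.tar.gz            # egg.tar.gz -> egg.tar
tar -rf egg.tar three.txt
gzip egg.tar                 # egg.tar -> egg.tar.gz

tar -tf egg.tar.gz
```

For large archives the round trip rewrites the whole file, which is one more reason rebuilding the archive from scratch is often simpler.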
How to List the Contents of an Archive
You can list the contents of an archive using the -t or --list option. The general command to list the contents of an archive is:
tar -tvf <archive>
The -t and --list options work on compressed and uncompressed archives.
For example, to list the contents of an egg.tar.xz archive in your current working directory, run this command:
tar -tvf egg.tar.xz
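Listing is also handy for checking whether a particular file is in an archive before you extract it. The sketch below builds a small gzip-compressed archive (gzip rather than xz, so it runs even where xz is not installed) with hypothetical file names, then lists it in a few ways. It assumes GNU tar.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"

mkdir pepper
echo "a" > pepper/one.txt
echo "b" > pepper/four.log
tar -czf egg.tar.gz pepper

# Names only.
tar -tf egg.tar.gz

# With v: permissions, owner, size, and timestamp for each member.
tar -tvf egg.tar.gz

# Search the listing for a specific member.
if tar -tf egg.tar.gz | grep -q '^pepper/four\.log$'; then
    echo "four.log is in the archive"
fi

# Or pass the member name to tar and let it list just that entry.
tar -tf egg.tar.gz pepper/one.txt
```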
How to Extract an Archive
tar's -x switch is for extracting archives. The general command to extract an archive in Linux is:
tar -xf <archive>
The tar -xf command works with both compressed and uncompressed archives.
For example, to extract an egg.tar.gz archive in our current working directory, we can use this command:
tar -xf egg.tar.gz
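The following sketch walks through a full round trip in a scratch directory: create an archive, delete the originals, and extract them again. It also shows extracting a single member by naming it after the archive, which tar supports. The file names are hypothetical.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"

mkdir pepper
echo "keep me" > pepper/one.txt
echo "noise" > pepper/four.log
tar -czf egg.tar.gz pepper
rm -r pepper

# Extract everything into the current working directory.
tar -xf egg.tar.gz
ls pepper

# Extract only one member by naming it after the archive.
rm -r pepper
tar -xf egg.tar.gz pepper/one.txt
ls pepper
```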
How to Extract an Archive to a Specific Directory
In some cases, you may want to extract files to a directory other than your current working directory. tar's -C switch is useful in this case.
The general command to extract an archive to a specific directory is:
tar -xf <archive> -C </path/to/destination>
For example, to extract our egg.tar.gz archive to /tmp/cherry (the directory must already exist), we can use this command:
tar -xf egg.tar.gz -C /tmp/cherry
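Because tar does not create the directory passed to -C, a script will usually create it first. Here is a minimal sketch that uses a hypothetical cherry directory inside a scratch location instead of /tmp/cherry.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"

mkdir pepper
echo "hello" > pepper/one.txt
tar -czf egg.tar.gz pepper

# Hypothetical destination, standing in for /tmp/cherry.
destination="$workdir/cherry"

# Create the destination first; -C fails if the directory is missing.
mkdir -p "$destination"
tar -xf egg.tar.gz -C "$destination"

ls "$destination/pepper"
```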
Conclusion
Now that you know the basics of working with tar, you can work with "tarballs" like a pro. Keep in mind, tar is flexible, and you can combine different switches to produce different results and tweak output. To take a deeper dive on tar, check out the official GNU tar manual.