Tuesday, October 02, 2012

File Archiving and Compressing Using tar and gzip


UNIX professionals commonly use the tar utility to archive files and directories. The tar command bundles files and directories into a single archive file, commonly called a tarball or simply a tar. tar itself does not compress anything; compression is handled by separate utilities such as gzip and bzip2, which differ in compression ratio and speed and which tar can invoke to produce a single compressed archive.

The main purpose of this article is to compare these archive types through the commands used to create, read and extract each kind of tarball.

tar Archive

Archive
$ tar cvf </path/to/destination/archivename.tar> </path/to/source/dir/>

To estimate the size in bytes before creating the tar file
$ tar cf - </path/to/source/dir/> | wc -c
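
The size estimate above can be checked against a real archive. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the scratch directory and the file names a.txt and b.txt are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: compare the piped size estimate with the size of an archive on disk.
set -e
workdir=$(mktemp -d)                      # hypothetical scratch directory
mkdir "$workdir/src"
printf 'hello\n' > "$workdir/src/a.txt"
printf 'world\n' > "$workdir/src/b.txt"

# Estimate: write the archive to stdout and count the bytes.
estimate=$(tar cf - -C "$workdir" src | wc -c)

# Actual: create the archive on disk and measure it.
tar cf "$workdir/src.tar" -C "$workdir" src
actual=$(wc -c < "$workdir/src.tar")

echo "estimated: $estimate bytes, actual: $actual bytes"
rm -rf "$workdir"
```

Because the estimate is produced by building the whole archive and discarding it, it takes about as long as creating the archive itself; it only saves the disk space.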

Read
$ tar tvf </path/to/destination/archivename.tar>

Extract
To extract in the current directory
$ tar xvf </path/to/archivename.tar>

To extract to a specific directory
$ tar xvf </path/to/archivename.tar> -C </path/to/destination/dir>

To extract a particular directory
$ tar xvf </path/to/archivename.tar> <path/to/particular/dir/to/extract/>

To extract multiple directories
$ tar xvf </path/to/archivename.tar> <path/to/dir1> <path/to/dir2/>

To extract a particular file
$ tar xvf </path/to/archivename.tar> <file/to/be/extracted>

To extract all files with '.pl' extension
$ tar xvf </path/to/archivename.tar> --wildcards '*.pl'
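
To see the wildcard extraction in action, here is a small sketch, assuming GNU tar (the --wildcards option is a GNU extension and may not exist in other tar implementations). The directory layout and file names are hypothetical.

```shell
#!/bin/sh
# Sketch: extract only the '.pl' members from an archive (GNU tar assumed).
set -e
workdir=$(mktemp -d)                      # hypothetical scratch directory
mkdir -p "$workdir/src/lib" "$workdir/out"
printf 'print "a";\n' > "$workdir/src/a.pl"
printf 'print "b";\n' > "$workdir/src/lib/b.pl"
printf 'notes\n'      > "$workdir/src/readme.txt"
tar cf "$workdir/src.tar" -C "$workdir" src

# Quote the pattern so the shell does not expand it before tar sees it.
tar xf "$workdir/src.tar" -C "$workdir/out" --wildcards '*.pl'

# List what was extracted; readme.txt should be absent.
extracted=$(cd "$workdir/out" && find . -type f | LC_ALL=C sort)
echo "$extracted"
rm -rf "$workdir"
```

The pattern is matched against the member names stored in the archive, not against files on disk, which is why the quoting matters.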

Append
$ tar rvf </path/to/archivename.tar> <path/to/newfile>

$ tar rvf </path/to/archivename.tar> <path/to/new_directory>

Note: You cannot add a file or directory to a compressed archive. If you try to do so, you will get the error “tar: Cannot update compressed archives”.
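
One possible way around this limitation is to decompress the archive, append to the plain tar file, and compress it again. The sketch below assumes gzip and gunzip are installed; the archive name demo.tar.gz and the two text files are hypothetical.

```shell
#!/bin/sh
# Sketch: append to a gzip-compressed archive by round-tripping through plain tar.
set -e
workdir=$(mktemp -d)                      # hypothetical scratch directory
printf 'one\n' > "$workdir/first.txt"
printf 'two\n' > "$workdir/second.txt"
tar czf "$workdir/demo.tar.gz" -C "$workdir" first.txt

gunzip "$workdir/demo.tar.gz"                          # leaves demo.tar
tar rf "$workdir/demo.tar" -C "$workdir" second.txt    # append works on the plain archive
gzip "$workdir/demo.tar"                               # back to demo.tar.gz

# Confirm that both members are now in the compressed archive.
members=$(tar tzf "$workdir/demo.tar.gz")
echo "$members"
rm -rf "$workdir"
```

For large archives this round trip can be slow, since the whole archive is decompressed and recompressed just to add a few files.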

tar.gz Compressed Archive

Archive
$ tar cvzf </path/to/destination/archivename.tar.gz> </path/to/source/dir/>
Note: the 'z' switch filters the archive through gzip, and the .tar.gz extension is the same as .tgz

To estimate the size in bytes before creating the tar.gz file
$ tar czf - </path/to/source/dir/> | wc -c

Read
$ tar tvf </path/to/destination/archivename.tar.gz>

Extract
To extract in the current directory
$ tar xvf </path/to/archivename.tar.gz>

To extract to a specific directory
$ tar xvf </path/to/archivename.tar.gz> -C </path/to/destination/dir>
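Note: 'xvzf' can be used in place of 'xvf'. GNU tar detects the compression automatically when reading an archive, so the 'z' switch is optional when extracting.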

tar.bz2 Compressed Archive
bzip2 generally takes more time to compress and decompress than gzip, but the resulting archive is usually smaller.

Archive
$ tar cvjf </path/to/destination/archivename.tar.bz2> </path/to/source/dir/>
Note: the 'j' switch filters the archive through bzip2, and the .tar.bz2 extension is the same as .tbz2

To estimate the size in bytes before creating the tar.bz2 file
$ tar cjf - </path/to/source/dir/> | wc -c

Read
$ tar tvf </path/to/destination/archivename.tar.bz2>

Extract
To extract in the current directory
$ tar xvf </path/to/archivename.tar.bz2>

To extract to a specific directory
$ tar xvf </path/to/archivename.tar.bz2> -C </path/to/destination/dir>
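
To compare the three archive types side by side, the size estimates from each section can be run against the same input. This is a rough sketch, assuming gzip and bzip2 are both installed; the generated sample.txt is hypothetical test data, and the actual numbers depend heavily on what is being archived.

```shell
#!/bin/sh
# Sketch: compare plain, gzip and bzip2 archive sizes for the same directory.
set -e
workdir=$(mktemp -d)                      # hypothetical scratch directory
mkdir "$workdir/src"

# Generate some repetitive text so that compression has something to work with.
i=0
while [ "$i" -lt 2000 ]; do
  echo "line $i: the quick brown fox jumps over the lazy dog"
  i=$((i + 1))
done > "$workdir/src/sample.txt"

# Same pipe-to-wc technique as in the sections above, once per format.
plain=$(tar cf - -C "$workdir" src | wc -c)
gz=$(tar czf - -C "$workdir" src | wc -c)
bz=$(tar cjf - -C "$workdir" src | wc -c)

echo "tar: $plain bytes, tar.gz: $gz bytes, tar.bz2: $bz bytes"
rm -rf "$workdir"
```

Prefixing each tar command with the shell's time keyword gives a similar comparison for speed.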
