Data Compression and Archiving Using Python

bzip2 compression
bzip2 is a freely available, patent-free, high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as fast at compression and six times faster at decompression than those techniques.
Compressing a file:
    import bz2

    # Read the source as bytes and write it through the bzip2 compressor.
    with open('a.txt', 'rb') as source, bz2.BZ2File('a.txt.bz2', 'wb') as output:
        for line in source:
            output.write(line)
This compresses a.txt to a.txt.bz2.
Decompressing a file:
    import bz2

    input_file = bz2.BZ2File('a.txt.bz2', 'rb')
    try:
        # read() returns the whole decompressed content
        print(input_file.read())
    finally:
        input_file.close()
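For data that already sits in memory, the bz2 module also offers the one-shot functions bz2.compress() and bz2.decompress(). The following is a minimal sketch of a round trip; the sample bytes are made up purely for illustration.

```python
import bz2

# Hypothetical sample data; any bytes object works here.
original = b"hello bzip2 " * 1000

compressed = bz2.compress(original)      # one-shot compression
restored = bz2.decompress(compressed)    # one-shot decompression

# Repetitive input like this should shrink considerably.
print(len(original), len(compressed))
```

This avoids temporary files, but the whole input and output are held in memory, so the file-based approach is usually the better fit for large data.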
gzip compression
gzip (GNU zip) is a compression utility designed to be a replacement for compress. Its main advantages over compress are much better compression and freedom from patented algorithms.
Compressing a file using gzip:
    import gzip

    # Read the source as bytes and write it through the gzip compressor.
    with open('a.txt', 'rb') as source, gzip.open('a.txt.gz', 'wb') as output:
        for line in source:
            output.write(line)
Decompressing the file:
    import gzip

    input_file = gzip.open('a.txt.gz', 'rb')
    try:
        # read() returns the whole decompressed content
        print(input_file.read())
    finally:
        input_file.close()
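Reading a whole file with read() can be costly for large inputs. One possible alternative is to copy the data in chunks with shutil.copyfileobj(). The sketch below assumes two hypothetical helpers, gzip_file and gunzip_file, and a made-up chunk size; it works in a temporary directory so that it does not touch existing files.

```python
import gzip
import os
import shutil
import tempfile

def gzip_file(src_path, dst_path, chunk_size=64 * 1024):
    """Compress src_path into dst_path, copying chunk_size bytes at a time."""
    with open(src_path, 'rb') as src, gzip.open(dst_path, 'wb') as dst:
        shutil.copyfileobj(src, dst, chunk_size)

def gunzip_file(src_path, dst_path, chunk_size=64 * 1024):
    """Decompress src_path into dst_path, copying chunk_size bytes at a time."""
    with gzip.open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        shutil.copyfileobj(src, dst, chunk_size)

# Demonstration with a throwaway file.
workdir = tempfile.mkdtemp()
plain_path = os.path.join(workdir, 'a.txt')
packed_path = os.path.join(workdir, 'a.txt.gz')
unpacked_path = os.path.join(workdir, 'a_copy.txt')

original = b"some sample text\n" * 500
with open(plain_path, 'wb') as handle:
    handle.write(original)

gzip_file(plain_path, packed_path)
gunzip_file(packed_path, unpacked_path)

with open(unpacked_path, 'rb') as handle:
    roundtrip = handle.read()
```

The same pattern should carry over to bz2.BZ2File, since it also exposes a file-like interface.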
Tar archive access
List the contents of a tar file.
    import tarfile

    tar = tarfile.open("sample.tar", "r")
    for tarinfo in tar:
        if tarinfo.isreg():
            kind = "a regular file."
        elif tarinfo.isdir():
            kind = "a directory."
        else:
            kind = "something else."
        print("%s is %d bytes in size and is %s" % (tarinfo.name, tarinfo.size, kind))
    tar.close()
Untar an archive file.
    import tarfile

    tar = tarfile.open("sample.tar")
    tar.extractall()  # extracts into the current working directory
    tar.close()
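extractall() also accepts a members argument, which makes it possible to unpack only part of an archive. The sketch below is one way to do that; extract_matching is a hypothetical helper, and the file names are invented for the demonstration. Where the running Python version provides extraction filters, the sketch passes filter='data', which rejects unsafe members such as absolute paths.

```python
import os
import tarfile
import tempfile

def extract_matching(archive_path, dest_dir, suffix):
    """Extract only regular files whose names end with suffix."""
    # Use the 'data' extraction filter when this Python version has it.
    kwargs = {'filter': 'data'} if hasattr(tarfile, 'data_filter') else {}
    with tarfile.open(archive_path) as tar:
        wanted = [member for member in tar.getmembers()
                  if member.isreg() and member.name.endswith(suffix)]
        tar.extractall(path=dest_dir, members=wanted, **kwargs)
    return [member.name for member in wanted]

# Build a small throwaway archive to demonstrate the helper.
workdir = tempfile.mkdtemp()
src_dir = os.path.join(workdir, 'src')
out_dir = os.path.join(workdir, 'out')
os.mkdir(src_dir)
os.mkdir(out_dir)
for name in ('notes.txt', 'debug.log'):
    with open(os.path.join(src_dir, name), 'w') as handle:
        handle.write('sample\n')

archive_path = os.path.join(workdir, 'sample.tar')
with tarfile.open(archive_path, 'w') as tar:
    for name in ('notes.txt', 'debug.log'):
        tar.add(os.path.join(src_dir, name), arcname=name)

extracted = extract_matching(archive_path, out_dir, '.txt')
print(extracted)
```

Archives from untrusted sources deserve this kind of inspection before extraction, because member names can point outside the destination directory.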
Create an archive file
    import fnmatch
    import os
    import tarfile

    tar = tarfile.open("sample.tar", "w")
    for name in os.listdir('.'):
        if os.path.isdir(name):
            print("%s is a dir." % name)
        # Add only the entries whose names match *.txt
        if fnmatch.fnmatch(name, '*.txt'):
            print(name)
            tar.add(name)
    tar.close()
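Using tar with gzip and bz2
The tarfile module can read compressed archives directly. For a gzip-compressed archive, open it with the mode "r:gz":
    tar = tarfile.open("sample.tar.gz", "r:gz")
For a bzip2-compressed archive, use the mode "r:bz2":
    tar = tarfile.open("sample.tar.bz2", "r:bz2")
The returned object is then used in the same way as for an uncompressed tar file.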
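The same mode suffixes work for writing: "w:gz" and "w:bz2" create compressed archives. When reading, the mode "r:*" lets tarfile detect the compression itself. A small sketch, using a temporary directory and an invented file name (note.txt):

```python
import os
import tarfile
import tempfile

workdir = tempfile.mkdtemp()

# Create a small file to archive (hypothetical name and content).
note_path = os.path.join(workdir, 'note.txt')
with open(note_path, 'w') as handle:
    handle.write('hello\n')

# Write a gzip-compressed tar archive.
archive_path = os.path.join(workdir, 'sample.tar.gz')
with tarfile.open(archive_path, 'w:gz') as tar:
    tar.add(note_path, arcname='note.txt')

# Read it back; "r:*" detects the compression transparently.
with tarfile.open(archive_path, 'r:*') as tar:
    names = tar.getnames()

print(names)
```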