gzip file format is one of the most common formats for compressing/decompressing files. gzip compression on text files greatly reduce the space used to store the text file. If you are working with a big data file, often the big text files is compressed with gzip or “gzipped” to save space. A naive way to work with compressed gzip file is to uncompress it and work with much bigger unzipped file line by line. Clearly, that is not the best solution.
In Python, you can directly work with gzip file. All you need is the Python library gzip.
import gzip
How to read a gzip file line by line in Python?
with gzip.open('big_file.txt.gz', 'rb') as f: for line in f: print(line)
How to Create a gzip File in Python
We can also use gzip library to create gzip (compressed) file by dumping the whole text content you have
all_of_of_your_content = "all the content of a big text file" with gzip.open('file.txt.gz', 'wb') as f: f.write(all_of_your_content)
How to create gzip (compressed file) from an existing file?
We can create gzip file from plain txt file (unzipped) without reading line by line using shutil library. The shutil module offers high-level operations on files copying and deletion. We will first open the unzipped file, then open the zipped file and use shutil to copy the unzipped file object to zipped file object.
import shutil # open the unzipped file with flie handler inp_f with open("test_file.txt","rb") as inp_f: # open the output zipped file with file handler out_f with gzip.open("test_file.txt.gz","wb") as out_f: # use shutil to copy the file objec shutil.copyfileobj(inp_f,out_f)