If you are working with a huge text file with millions of lines and want to see the content of one specific line, there are a few tricks for getting at it. The naive approach of reading the file line by line (say, in Python) while keeping a count of the lines is slow on very large files and inconvenient for a one-off lookup.
That is where good old AWK (or sed) comes in. AWK was created for processing text files, before the era of Perl and Python scripting. Its name comes from the initials of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan.
Extracting a specific line of interest takes just one line of AWK in the terminal, provided you know either the line number or a pattern unique to that line.
Say you want line 23482364 of a huge text file named myHugeTextFile.txt. Pass the line number to awk as a variable with -v, and the command to print that line on the terminal is
awk -v lineNumber=23482364 'NR == lineNumber' myHugeTextFile.txt
Here, NR is short for “Number of Records”: awk's running count of the records (by default, lines) it has read so far. With a single input file, that is simply the current line number. As you can see, the script above is easy to modify to do more.
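One modification worth considering on a very large file is to stop reading as soon as the wanted line has been printed, rather than scanning to the end of the file. The sketch below shows the idea on a small generated file. The temporary file, the variable name n, and the line number 4 are illustrative assumptions, not part of the original example. A sed equivalent is included, since sed was mentioned earlier as an alternative.

```shell
# Build a small stand-in file: 10 lines, "row 1" through "row 10".
# (The temp file and its contents are hypothetical, for illustration only.)
demo=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 10; i++) print "row " i }' > "$demo"

# Print line 4, then exit so awk does not read the rest of the file.
line_awk=$(awk -v n=4 'NR == n { print; exit }' "$demo")
echo "$line_awk"

# sed equivalent: -n suppresses default output, p prints line 4, q quits.
line_sed=$(sed -n '4{p;q;}' "$demo")
echo "$line_sed"

rm -f "$demo"
```

On the real file, the same pattern would read awk -v n=23482364 'NR == n { print; exit }' myHugeTextFile.txt. The early exit matters most when the wanted line is near the start of the file.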
For example, to print every line within a range of line numbers, add a second condition (the bounds 1000 and 2000 below are placeholder values):
awk -v lineNumber1=1000 -v lineNumber2=2000 'NR >= lineNumber1 && NR <= lineNumber2' myHugeTextFile.txt
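Two further variations may be useful, sketched here on a small generated file. The first prints a range and exits once the upper bound is passed. The second selects a line by a pattern unique to it instead of by number, as mentioned earlier. The variable names first and last, the sample data, and the pattern are assumptions made for illustration.

```shell
# Build a small stand-in file: 10 lines, "row 1" through "row 10".
# (The temp file and its contents are hypothetical, for illustration only.)
demo=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 10; i++) print "row " i }' > "$demo"

# Lines 3 through 5, stopping as soon as the range has been passed.
range=$(awk -v first=3 -v last=5 '
  NR > last  { exit }   # past the range: stop reading
  NR >= first           # inside the range: the default action prints the line
' "$demo")
echo "$range"

# Select by content instead of line number: the first line matching a
# regular expression, printed together with its line number.
match=$(awk '/^row 7$/ { print NR ": " $0; exit }' "$demo")
echo "$match"

rm -f "$demo"
```

A pattern with no action, such as NR >= first, relies on awk's default action of printing the current line, which is also why the one-liners above need no explicit print.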
If you are interested in learning more about awk, check out the e-book by catonmat.net and the blog post series on awk.