Faster way to write to a file in bash

I am reading a file in bash, extracting values from each line, and storing them in another file. There are ~100k lines in the file, and it takes about 25 minutes to read and rewrite them.

Is there some faster way to write to a file? Right now I am just iterating over the lines, parsing some values, and saving them like this:

while read -r line; do
   zip="$(echo "$line" | cut -c 1-8)"
   echo "$zip"
done < file_one.txt

Everything works fine and the values are processed correctly; I just want to know how I can optimize the process (if I even can).

Thanks!



3 answers


The bash loop only slows things down (especially the part where you call the external program cut once per iteration). You can do it all with a single cut:



cut -c 1-8 file_one.txt
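
If you also need to store the result in another file, as the question describes, you can redirect the output of that single command once (file_two.txt is an assumed name for the output file):

# One pass over the input and a single output redirection
# (file_two.txt is a hypothetical output filename)
cut -c 1-8 file_one.txt > file_two.txt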


Calling cut once for each line is a big bottleneck. Use substring expansion instead to capture the first 8 characters of each line.



# -r keeps backslashes literal; IFS= preserves leading whitespace
while IFS= read -r line; do
   zip=${line:0:8}
   echo "$zip"
done < file_one.txt
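
Since the question mentions saving the values to another file, it also helps to redirect the loop's output once rather than writing per iteration; a minimal sketch, assuming the results go to a hypothetical file_two.txt:

while IFS= read -r line; do
   zip=${line:0:8}
   echo "$zip"
done < file_one.txt > file_two.txt   # the output file is opened only once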


If you want to act on a substring of a line only when it meets some condition, Awk is designed for manipulating text files:

awk '{zip=substr($0, 1, 8)} zip == "my match" {print zip}' file_one.txt

In this example, substr($0, 1, 8) extracts characters 1 through 8 of each input line ($0) in file_one.txt. Each substring is assigned to the variable zip and printed only when it equals the text "my match".

If you are not familiar with Awk and often have large files to manipulate, I recommend taking some time to learn it. Awk starts up faster and processes text more efficiently than bash does. The blog post "Awk in 20 minutes" is a good, quick introduction.

To save even more time on large files, you can use a speed-optimized implementation of Awk called Mawk.
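
For this kind of script, mawk accepts the same syntax as awk; a minimal sketch, assuming mawk is installed and on your PATH:

# Same script as above, run with mawk for speed
mawk '{zip=substr($0, 1, 8)} zip == "my match" {print zip}' file_one.txt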
