Awk: How do you create a string variable with newlines?
I am trying to do the following. This is the result of ls -l in a folder:
-rw-rw-r-- 1 root root 100 May 23 09:45 filename1
-rw-rw-r-- 1 root root 200 May 23 09:45 filename2
-rw-rw-r-- 1 root root 500 May 23 09:46 filename3
Now I want to pipe this through awk to do the following:
800 bytes, files:
filename1
filename2
filename3
So far, I can get awk to add bytes:
output=`ls -l /some/folder/ | awk 'BEGIN {total = 0}; {total += $5} END{print total}'`
This simply prints: 800
Now I want to start building an output string, so I am trying to get a list of the filenames (column $9, I think). I tried it like this:
output=`ls -l /some/folder/ | awk 'BEGIN {total = 0; files=""}; {total += $5; files = files "\n" $9} END{print total "files:" files}'`
echo $output
gives the following:
800 filename1 filename2 filename3
I want it to show:
800
filename1
filename2
filename3
I don't understand why the lines are not breaking on new lines?
Whitespace, including newlines, is collapsed when you don't quote your variables in the shell, so a simple fix for what you did would be to use echo "$output".
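To see the difference, here is a minimal sketch. The contents of output are a made-up stand-in for the awk result, not taken from a real directory:

```shell
# Hypothetical stand-in for the awk result: three lines of text.
output=$(printf '800 bytes, files:\nfilename1\nfilename2')

# Unquoted: the shell splits the value on whitespace and echo
# rejoins the words with single spaces, so the newlines are lost.
unquoted=$(echo $output)

# Quoted: the value reaches echo as one argument, newlines intact.
quoted=$(echo "$output")

echo "$unquoted"
echo "$quoted"
```

The first echo prints everything on one line; the second prints three lines.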
However, I would recommend not using ls -l to get file names and sizes, since its output is not designed to be parsed. Any column-based approach will break if you have an interesting filename.
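As a small illustration of how a column-based approach breaks, assume a file named "my file.txt". The line below is a hand-written imitation of ls -l output, not real output:

```shell
# Hypothetical ls -l style line for a file whose name contains a space.
line='-rw-rw-r-- 1 root root 100 May 23 09:45 my file.txt'

# awk splits on whitespace, so field 9 is only the first word of the name.
name=$(printf '%s\n' "$line" | awk '{ print $9 }')

echo "$name"
```

Here $9 holds only "my", and the rest of the name ends up in $10.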
Using GNU stat allows you to get the file sizes and control the output format, using null bytes (\0) as separators to make parsing the names safe:
stat --printf '%s\0%n\0' * | awk -v RS='\0' '
NR % 2 { total += $0; next } # add to total on odd lines, skip to next line
{ files[++n] = $0 } # save file names on other (even) lines
END { print total, "bytes, files:"; for (i = 1; i <= n; ++i) print files[i] }'
If you can't use stat --printf, you can use stat -c and hope no one puts a newline in a filename:
stat -c '%s %n' * | awk '{ total += $1; files[NR] = substr($0, length($1) + 2) }
END { print total, "bytes, files:"; for (i = 1; i <= NR; ++i) print files[i] }'
The first field contains the size and the rest of the line is the file name, so substr is used to get that part.
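The substr call can be tried in isolation. This is only a sketch on a made-up input line in the same "size name" layout:

```shell
# Hypothetical line in the 'size name' layout produced by stat -c '%s %n'.
line='100 my file.txt'

# The size occupies positions 1..length($1) and the separating space is at
# length($1) + 1, so the file name starts at position length($1) + 2.
name=$(printf '%s\n' "$line" | awk '{ print substr($0, length($1) + 2) }')

echo "$name"
```

Unlike printing a single field, this keeps any spaces inside the name.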
The * passed as an argument to stat is expanded by the shell to the list of files in the current directory. You can get files in a different directory by using /path/to/dir/*, or by first cd-ing to the destination. You can also use a loop, for example:
for dir in dir1 dir2 dir3; do
( cd "$dir" && stat -c '%s %n' * | awk '...')
done
Here I used a ( subshell ) as a lazy way to return to the original directory after each iteration of the loop.
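A quick way to convince yourself that the cd stays inside the parentheses is the sketch below. The variable name start is made up, and / is used only because it exists everywhere:

```shell
# Remember where we are before entering the subshell.
start=$PWD

# The cd runs in a child shell, so it cannot change the parent's directory.
( cd / && pwd )

# Back in the parent shell, the working directory is unchanged.
echo "$PWD"
```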
ls -l | awk 'NR > 1 {s+=$5; f=f"\n"$NF} END{print s,f}'
The first line of the ls -l output is ignored (NR > 1). The 5th field (file size) of every remaining line is added to the variable s. The file names are appended to the variable f (separated by a newline character). In the END block, s and f are printed.
Example:
AMD$ ls -l
total 12
-rw-r--r-- 1 root root 165 May 24 08:23 ff
-rw-r--r-- 1 root root 165 May 24 08:23 gg
-rw-r--r-- 1 root root 165 May 24 08:23 hh
AMD$ ls -l | awk 'NR > 1 {s+=$5; f=f"\n"$NF} END{print s,f}'
495
ff
gg
hh
If you want to store this in a variable and print it later:
var=$(ls -l | awk 'NR > 1 {s+=$5; f=f"\n"$NF} END{print s,f}')
echo "$var"