Awk: How do you create a string variable with newlines?

I am trying to do the following. The result of ls -l in a folder:

-rw-rw-r--   1 root  root  100  May 23 09:45 filename1
-rw-rw-r--   1 root  root  200  May 23 09:45 filename2
-rw-rw-r--   1 root  root  500  May 23 09:46 filename3

      

Now I want to pipe this through awk to produce the following:

800 bytes, files:
filename1
filename2
filename3

      

So far, I can get awk to add bytes:

output=`ls -l /some/folder/ | awk 'BEGIN {total = 0} {total += $5} END {print total}'`

      

This simply prints: 800

Now I want to start building an output string, so I am trying to get a list of file names (column $9, I think). I tried this:

output=`ls -l /some/folder/ | awk 'BEGIN {total = 0; files = ""} {total += $5; files = files "\n" $9} END {print total "files:" files}'`

      

echo $output

gives the following:

800 filename1 filename2 filename3

I want it to show:

800
filename1
filename2
filename3

      

I don't understand why the output is not broken onto separate lines.

+3




3 answers


Whitespace, including newlines, is collapsed when you don't quote your variables in the shell: the unquoted expansion is split into separate words, and echo joins them with single spaces. A simple fix for what you did is to use echo "$output".
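To see what word splitting does to a multi-line value, here is a minimal sketch. The helper count_args is a made-up name used only for illustration, and the sketch assumes a POSIX-style shell (sh, bash) with the default IFS:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

output='800
filename1
filename2
filename3'

count_args $output     # unquoted: the value is split into four separate words
count_args "$output"   # quoted: a single argument, newlines intact
```

The unquoted call reports four arguments and the quoted call reports one, which is why echo $output prints everything on a single line.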

However, I would recommend not using ls -l to get file names and sizes, since its output is not designed to be parsed. Any column-based approach will break on unusual file names, for example names that contain spaces.

Using GNU stat allows you to get file sizes and to delimit the output with null bytes (\0), which makes parsing the names safe:

stat --printf '%s\0%n\0' * | awk -v RS='\0' '
NR % 2 { total += $0; next } # add to total on odd lines, skip to next line
{ files[++n] = $0 }          # save file names on other (even) lines
END { print total, "bytes, files:"; for (i = 1; i <= n; ++i) print files[i] }'

      

If you can't use stat --printf, you can use stat -c and hope no one puts a newline in a file name:



stat -c '%s %n' * | awk '{ total += $1; files[NR] = substr($0, length($1) + 2) } 
END { print total, "bytes, files:"; for (i = 1; i <= NR; ++i) print files[i] }'

      

The first field contains the size and the rest of the line is the file name, so substr is used to extract that part.
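As a rough illustration of the substr step, the following sketch feeds awk one canned line in place of real stat -c '%s %n' output. The file name is made up, and the variable name_line exists only for this example:

```shell
# Canned input standing in for `stat -c '%s %n'` output; the file name is invented.
name_line=$(printf '%s\n' '100 my file name.txt' |
    awk '{
        size = $1
        # Skip the size field and the single space that follows it.
        name = substr($0, length($1) + 2)
        print size ": [" name "]"
    }')
echo "$name_line"
```

Using $2 here would return only "my", whereas substr keeps the whole name, spaces included.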

The * passed as an argument to stat is expanded by the shell to the list of files in the current directory (by default, names starting with a dot are not included). You can process files in a different directory by passing /path/to/dir/*, or by first cd-ing to that directory. You can also use a loop, for example:

for dir in dir1 dir2 dir3; do
    ( cd "$dir" && stat -c '%s %n' * | awk '...')
done

      

Here I used a ( subshell ) as a lazy way to go back to the original directory after each iteration of the loop.
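The effect of the subshell can be checked with a small sketch like the one below. The variable name start_dir is an assumption made for the example, and / is used only because it exists on any Unix-like system:

```shell
start_dir=$PWD

# The cd runs inside the parentheses, in a child shell,
# so it cannot change the working directory of the calling shell.
( cd / && echo "inside subshell: $PWD" )

echo "after subshell: $PWD"
[ "$PWD" = "$start_dir" ] && echo "directory unchanged"
```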

+2




ls -l | awk 'NR > 1 {s+=$5; f=f"\n"$NF} END{print s,f}'

      

The first line of the ls -l output (the "total" line) is skipped (NR > 1). The 5th field (file size) of every remaining line is added to the variable s. The file names are appended to the variable f, each preceded by a newline character. In the END block, s and f are printed.

Example:



AMD$ ls -l
total 12
-rw-r--r-- 1 root root 165 May 24 08:23 ff
-rw-r--r-- 1 root root 165 May 24 08:23 gg
-rw-r--r-- 1 root root 165 May 24 08:23 hh

AMD$ ls -l | awk 'NR > 1 {s+=$5; f=f"\n"$NF} END{print s,f}'
495
ff
gg
hh

      

If you want to store this in a variable and print it later:

var=$(ls -l | awk 'NR > 1 {s+=$5; f=f"\n"$NF} END{print s,f}')
echo "$var"
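To get closer to the exact format asked for in the question ("800 bytes, files:"), the END block could use printf instead of print. This is only a sketch: sample_listing is a hypothetical function that replays the listing from the question, with an assumed "total" line, so the result does not depend on the contents of a real directory. Like the one-liner above, it still relies on file names without whitespace.

```shell
# sample_listing is a made-up stand-in for `ls -l`; the "total 12" line is assumed.
sample_listing() {
    printf '%s\n' \
        'total 12' \
        '-rw-rw-r-- 1 root root 100 May 23 09:45 filename1' \
        '-rw-rw-r-- 1 root root 200 May 23 09:45 filename2' \
        '-rw-rw-r-- 1 root root 500 May 23 09:46 filename3'
}

# f starts with a newline, so it follows "files:" directly.
var=$(sample_listing | awk 'NR > 1 { s += $5; f = f "\n" $NF }
    END { printf "%d bytes, files:%s\n", s, f }')
echo "$var"
```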

      

+2




To preserve the line structure of a variable's value, the expansion must be enclosed in double quotes.

Example:

Multi-line variable:

x='hey
there'

      

Without quoting:

echo $x
hey there

      

Double quotes:

echo "$x"
hey
there

      

+1

