Count file types per directory
I have a bash one-liner that counts files by extension, recursively through all directories, for files that were modified in the last 45 days:
find . -type f -mtime -45| rev | cut -d . -f1 | rev | sort | uniq -ic | sort -rn
I have a directory called \parent, and inside it I have:
\parent\a
\parent\b
\parent\c
At the moment I run the above script three times: once in folder a, once in b, and once in c.
Current output:
91 xls
85 xlsx
49 doc
46 db
31 docx
24 jpg
22 pub
10 pdf
4 msg
2 xml
2 txt
1 zip
1 thmx
1 htm
1 /ic
I would like to run a single script from \parent that covers all folders inside \parent and get output like this:
+-------+------+--------+
| count | ext  | folder |
+-------+------+--------+
|    91 | xls  | a      |
|    85 | xlsx | a      |
|    49 | doc  | a      |
|    46 | db   | a      |
|    31 | docx | a      |
|    24 | jpg  | a      |
|    22 | pub  | a      |
|    10 | pdf  | a      |
|     4 | msg  | a      |
|    98 | jpg  | b      |
|    92 | pub  | b      |
|    62 | pdf  | b      |
|     2 | xml  | b      |
|     2 | txt  | b      |
|     1 | zip  | b      |
|     1 | thmx | b      |
|     1 | htm  | b      |
|     1 | /ic  | b      |
|    66 | txt  | c      |
|    48 | msg  | c      |
|    44 | xml  | c      |
|    30 | zip  | c      |
|    12 | doc  | c      |
|     6 | db   | c      |
|     6 | docx | c      |
|     3 | jpg  | c      |
+-------+------+--------+
How can I accomplish this using bash?
Put it in a script, make it executable with chmod +x script.sh, and run it with ./script.sh:
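#!/bin/sh
# Count files modified in the last 45 days, grouped by top-level folder and extension.
find . -type f -mtime -45 2>/dev/null \
| sed 's|^\./\([^/]*\)/|\1/|; s|/.*/|/|; s|/.*.\.| |p; d' \
| sort | uniq -ic \
| sort -b -k2,2 -k1,1rn \
| awk '
BEGIN {
    sep = "+-------+------+--------+"
    # two spaces after "ext" keep the header aligned with the %-4s column
    print sep "\n| count | ext  | folder |\n" sep
}
# input fields: $1 = count, $2 = folder, $3 = extension
{ printf("| %5d | %-4s | %-6s |\n", $1, $3, $2) }
END { print sep }'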
- sed 's|^\./\([^/]*\)/|\1/|; s|/.*/|/|; s|/.*.\.| |p; d' reduces each path to "folder ext":
  - s|^\./\([^/]*\)/|\1/| replaces ./a/file.xls with a/file.xls.
  - s|/.*/|/| replaces b/some/dir/file.mp3 with b/file.mp3.
  - s|/.*.\.| |p replaces a/file.xls with a xls. If the s///p substitution succeeds, the line is also printed to stdout; files without an extension do not match, so they are skipped.
  - d deletes the line, so that matched lines are not printed a second time and unmatched lines are not printed at all.
- sort | uniq -ic counts each folder and extension group (-i ignores case, so XLS and xls are counted together).
- sort -b -k2,2 -k1,1rn sorts first by folder (field 2), small → large, and then by count (field 1), numerically and in reverse order (large → small). -b makes sort(1) ignore leading blanks (spaces / tabs).
- The final awk part pretty-prints the output; you may want to put it in a separate script.
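One possible way to separate the pretty-printing step is sketched below. The function name format_table is made up for this illustration, and the sample input lines are invented; the function only assumes that it receives "count folder ext" lines on stdin, which is what the earlier stages of the pipeline produce.

```shell
#!/bin/sh
# format_table is a hypothetical helper name, not part of the original script.
# It reads lines of the form "count folder ext" and prints them as a table.
format_table() {
    awk '
    BEGIN {
        sep = "+-------+------+--------+"
        print sep "\n| count | ext  | folder |\n" sep
    }
    # $1 = count, $2 = folder, $3 = extension
    { printf("| %5d | %-4s | %-6s |\n", $1, $3, $2) }
    END { print sep }'
}

# Invented sample input, just to show the call.
printf '%s\n' '91 a xls' '98 b jpg' | format_table
```

With a helper like this, the main pipeline would end in "| format_table" instead of the inline awk program, and the plain "count folder ext" lines stay available for other tools.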
If you want to see what each filter does, try removing the stages one at a time and compare the output.
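A small sketch of that kind of experiment, without touching real files: the paths below are invented and only mimic what find . -type f would print, and the helper names sample_paths and to_folder_ext are hypothetical. The sed expression is the one from the script.

```shell
#!/bin/sh
# Hypothetical sample paths, shaped like the output of: find . -type f
sample_paths() {
    printf '%s\n' \
        './a/report.xls' \
        './a/old/budget.xls' \
        './b/some/dir/song.mp3' \
        './c/README'
}

# The sed stage from the script, wrapped in a (hypothetical) helper.
to_folder_ext() {
    sed 's|^\./\([^/]*\)/|\1/|; s|/.*/|/|; s|/.*.\.| |p; d'
}

# Stage 1: sed alone. Each path becomes "folder ext";
# ./c/README has no extension and is dropped. Prints:
#   a xls
#   a xls
#   b mp3
sample_paths | to_folder_ext

# Stage 2: add the counting step to get "count folder ext" lines
# (the width of the count column depends on the uniq implementation).
sample_paths | to_folder_ext | sort | uniq -ic
```

Adding the remaining stages back one by one shows the effect of the second sort and of the awk formatting in the same way.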
You can find good tutorials here about sh / awk / sed etc.