Count file types in all directories

I have a bash one-liner that counts files by extension, recursively through all directories, for files that have been modified in the last 45 days:

 find . -type f -mtime -45| rev | cut -d . -f1 | rev | sort | uniq -ic | sort -rn

      

I have a directory called \parent, which contains:

\parent\a
\parent\b
\parent\c

      

At the moment I run the above command once in folder a, once in b, and once in c.

Current output:

     91 xls
     85 xlsx
     49 doc
     46 db
     31 docx
     24 jpg
     22 pub
     10 pdf
      4 msg
      2 xml
      2 txt
      1 zip
      1 thmx
      1 htm
      1 /ic

      

I would like to run a single script from \parent that covers every folder inside \parent and produces output like this:

+-------+------+--------+
| count | ext  | folder |
+-------+------+--------+
|    91 | xls  | a      |
|    85 | xlsx | a      |
|    49 | doc  | a      |
|    46 | db   | a      |
|    31 | docx | a      |
|    24 | jpg  | a      |
|    22 | pub  | a      |
|    10 | pdf  | a      |
|     4 | msg  | a      |
|    98 | jpg  | b      |
|    92 | pub  | b      |
|    62 | pdf  | b      |
|     2 | xml  | b      |
|     2 | txt  | b      |
|     1 | zip  | b      |
|     1 | thmx | b      |
|     1 | htm  | b      |
|     1 | /ic  | b      |
|    66 | txt  | c      |
|    48 | msg  | c      |
|    44 | xml  | c      |
|    30 | zip  | c      |
|    12 | doc  | c      |
|     6 | db   | c      |
|     6 | docx | c      |
|     3 | jpg  | c      |
+-------+------+--------+

      

How can I accomplish this using bash?



1 answer


Put this in a script, make it executable with chmod +x script.sh, and run it with ./script.sh

#!/bin/sh

find . -type f -mtime -45 2>/dev/null \
    | sed 's|^\./\([^/]*\)/|\1/|; s|/.*/|/|; s|/.*.\.| |p; d' \
    | sort | uniq -ic \
    | sort -b -k2,2 -k1,1rn \
    | awk '
BEGIN{ 
    sep = "+-------+------+--------+"
    print sep "\n| count | ext  | folder |\n" sep
}

{ printf("| %5d | %-4s | %-6s |\n", $1, $3, $2) }

END{ print sep }'

      

  • sed 's|^\./\([^/]*\)/|\1/|; s|/.*/|/|; s|/.*.\.| |p; d'

    • s|^\./\([^/]*\)/|\1/| replaces ./a/file.xls with a/file.xls.
    • s|/.*/|/| replaces b/some/dir/file.mp3 with b/file.mp3.
    • s|/.*.\.| |p replaces a/file.xls with a xls. If the substitution succeeds, the p flag prints the line to stdout; lines that do not match (files without an extension) are not printed.
    • d deletes the line, so matched lines are not printed a second time and non-matching lines are dropped.
  • sort | uniq -ic counts each combination of directory name and extension (-i ignores case when comparing lines).

  • sort -b -k2,2 -k1,1rn sorts first by directory (field 2) in ascending order, then by count (field 1) numerically in descending order. -b makes sort(1) ignore leading blanks (spaces/tabs).

  • The final awk stage pretty-prints the output; you may want to put that part in a separate script.
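As a rough sketch of how the pretty-printing stage could be split off, the same awk program can be wrapped in a small shell function. The name format_table is made up for this example and is not part of the original script; it assumes "count folder ext" lines on stdin, which is the shape the pipeline produces after uniq -ic.

```shell
#!/bin/sh
# format_table: hypothetical helper name, introduced only for this sketch.
# Reads "count folder ext" lines on stdin and prints the bordered table.
format_table() {
    awk '
    BEGIN {
        sep = "+-------+------+--------+"
        print sep "\n| count | ext  | folder |\n" sep
    }
    # Field 1 is the count, field 2 the folder, field 3 the extension.
    { printf("| %5d | %-4s | %-6s |\n", $1, $3, $2) }
    END { print sep }'
}

# Example with made-up counts:
printf '%s\n' '91 a xls' '2 b xml' | format_table
```

With that in place, the awk block at the end of the script could be replaced by a call to format_table, and the counting stages could be reused with a different formatter.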



To see what each filter contributes, remove the stages one at a time and compare the output.
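One way to try this without touching real files is to feed a few sample paths through the pipeline and cut it off at successive stages. The sketch below assumes made-up paths in place of the find output; sample_paths and dir_ext are hypothetical helper names used only here.

```shell
#!/bin/sh
# Made-up sample paths standing in for the output of find(1).
sample_paths() {
    printf '%s\n' \
        './a/report.xls' \
        './a/old/budget.xls' \
        './b/some/dir/file.mp3' \
        './b/README'
}

# The sed stage from the script above, wrapped for reuse.
dir_ext() {
    sed 's|^\./\([^/]*\)/|\1/|; s|/.*/|/|; s|/.*.\.| |p; d'
}

echo '--- after sed ---'
sample_paths | dir_ext

echo '--- after sort | uniq -ic ---'
sample_paths | dir_ext | sort | uniq -ic

echo '--- after the second sort ---'
sample_paths | dir_ext | sort | uniq -ic | sort -b -k2,2 -k1,1rn
```

The first section should list a xls twice and b mp3 once; ./b/README has no extension, so the sed stage drops it. The later sections show the same data after counting and after ordering by folder and count.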

You can find good tutorials on sh, awk, sed, etc. here: http://www.grymoire.com/Unix/
