Removing duplicate files with the same size in a shell script

I have a directory that contains several files with the same content but different names. The criterion I chose for removing duplicates is to sort the files by size and then delete those that share a size. For example, when I type

 find . -type f -printf "%p - %s\n" | uniq -D -f1 | sort -nr -k3


I get

   ./abc.txt - 595
   ./acd.txt - 595
   ./dbc.txt - 595
   ./jed.txt - 595
   ./end.txt - 595
   ./wtw.txt - 595
   ./hds.txt - 595
   ./dkd.txt - 523
   ./kjk.txt - 523


I would like to keep only

   ./abc.txt 
   ./dkd.txt




2 answers


find . -type f -printf "%p - %s\n" | uniq -D -f1 | sort -nr -k3


  • uniq requires sorted input, so you would have to put sort before it.

  • The -D option of uniq is out of place here.

  • The -u option of sort can do the job of uniq.




find . -type f -printf "%p - %s\n" | sort -nru -k3
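That command only lists one file per size; it does not delete anything. To actually remove the same-size duplicates, one sketch (assuming GNU find's -printf, and filenames without spaces or newlines) is to keep the first file seen for each size and remove the rest. The scratch directory and file names below are only for illustration:

```shell
# Demo in a scratch directory so nothing real is deleted.
dir=$(mktemp -d)
printf 'same content\n' > "$dir/abc.txt"
printf 'same content\n' > "$dir/jed.txt"
printf 'other\n'        > "$dir/dkd.txt"

# List "size path", sort by size, then let awk print every file whose
# size has already been seen; those are the duplicates to delete.
find "$dir" -type f -printf "%s %p\n" | sort -n |
awk 'seen[$1]++ { print $2 }' |
xargs -r rm --

ls "$dir"
```

The awk filter prints a path only when its size key has appeared before, so exactly one file per size survives the rm.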




You can copy one version of each set of duplicates to another directory:



find . -type f -printf "%p - %s\n" | sort -n -k3 | uniq -d -f1 | cut -d' ' -f1 | xargs -I{} cp {} /path/to/dir
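Note that uniq only detects adjacent duplicate lines, so a sort by size has to come before it. A self-contained sketch of that pipeline (the scratch directories and file contents are invented for the demo; assumes GNU find and coreutils, and paths without spaces):

```shell
# Demo: copy one file per duplicated size into a separate directory.
src=$(mktemp -d)
dest=$(mktemp -d)
printf 'same content\n' > "$src/abc.txt"
printf 'same content\n' > "$src/jed.txt"
printf 'unique\n'       > "$src/one.txt"

# Sort by the size field so uniq -d sees duplicate sizes as adjacent
# lines; -f1 skips the filename field, -d prints one line per group.
find "$src" -type f -printf "%p - %s\n" | sort -n -k3 |
uniq -d -f1 | cut -d' ' -f1 |
xargs -I{} cp {} "$dest"

ls "$dest"
```

Only one member of the duplicated-size group is copied; files whose size is unique are skipped entirely.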


