Removing duplicate files with the same size in a shell script

I have a directory that contains several files with the same content but different names. The criterion I chose for removing duplicates is to sort the files by size and then delete those that share a size. For example, when I type

 find . -type f -printf "%p - %s\n" | uniq -D -f1 | sort -nr -k3


I get

   ./abc.txt - 595
   ./acd.txt - 595
   ./dbc.txt - 595
   ./jed.txt - 595
   ./end.txt - 595
   ./wtw.txt - 595
   ./hds.txt - 595
   ./dkd.txt - 523
   ./kjk.txt - 523


I would like to keep only

   ./abc.txt 
   ./dkd.txt




2 answers


find . -type f -printf "%p - %s\n" | uniq -D -f1 | sort -nr -k3


  • uniq requires sorted input, so you would have to put sort before it.

  • The -D option of uniq is out of place here.

  • The -u option of sort can do the job of uniq.




find . -type f -printf "%p - %s\n" | sort -nru -k3
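That command only lists one file per size; it does not delete anything. To actually remove the same-size duplicates, one sketch (assuming GNU find's -printf, and filenames without spaces or newlines) is to keep the first file seen for each size and remove the rest. The scratch directory and file names below are only for illustration:

```shell
# Demo in a scratch directory so nothing real is deleted.
dir=$(mktemp -d)
printf 'same content\n' > "$dir/abc.txt"
printf 'same content\n' > "$dir/jed.txt"
printf 'other\n'        > "$dir/dkd.txt"

# List "size path", sort by size, then let awk print every file whose
# size has already been seen; those are the duplicates to delete.
find "$dir" -type f -printf "%s %p\n" | sort -n |
awk 'seen[$1]++ { print $2 }' |
xargs -r rm --

ls "$dir"
```

The awk filter prints a path only when its size key has appeared before, so exactly one file per size survives the rm.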




You can copy one version of each set of duplicates to another directory:



find . -type f -printf "%p - %s\n" | sort -n -k3 | uniq -d -f1 | cut -d' ' -f1 | xargs -I{} cp {} /path/to/dir
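Note that uniq only detects adjacent duplicate lines, so a sort by size has to come before it. A self-contained sketch of that pipeline (the scratch directories and file contents are invented for the demo; assumes GNU find and coreutils, and paths without spaces):

```shell
# Demo: copy one file per duplicated size into a separate directory.
src=$(mktemp -d)
dest=$(mktemp -d)
printf 'same content\n' > "$src/abc.txt"
printf 'same content\n' > "$src/jed.txt"
printf 'unique\n'       > "$src/one.txt"

# Sort by the size field so uniq -d sees duplicate sizes as adjacent
# lines; -f1 skips the filename field, -d prints one line per group.
find "$src" -type f -printf "%p - %s\n" | sort -n -k3 |
uniq -d -f1 | cut -d' ' -f1 |
xargs -I{} cp {} "$dest"

ls "$dest"
```

Only one member of the duplicated-size group is copied; files whose size is unique are skipped entirely.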


