How do I compare the overlap of file sizes between duplicate directories?
I need to compare two directories to validate a backup.
Say my directory looks like this:
user@main_server:~/mydir/        user@backup_server:~/mydir/
Filename      Filesize           Filename      Filesize
file1000.txt  4182410737         file1000.txt  4182410737
file1001.txt  8241410737         -                          <-- missing on backup_server!
...                              ...
file9999.txt  2410418737         file9999.txt  1111111111   <-- size != main_server
Is there a quick one-liner that would get me closer to a result like this:
Invalid Backup Files:
file1001.txt
file9999.txt
(for the purpose of instructing the backup script to update these files)
I have tried to find options for the following command that would do this, to no avail:
[main_server] $ rsync -n ~/mydir/ user@backup_server:~/mydir
I cannot use rsync to back up the directories themselves because it takes too long (8-24 hours). Instead, I run multiple scp threads to transfer the files in batches. This regularly completes in under an hour. However, I sometimes find that several files were missed somehow (the connection may have dropped).
Speed is a priority, so comparing file sizes should be sufficient. But I'm open to including a checksum if it doesn't slow the process down the way I find rsync does.
Here's my test process:
# Generate Large Files (1GB)
for i in {1..100}; do head -c 1073741824 </dev/urandom >foo-$i ; done
# SCP them from src to dest
for i in {1..100}; do ( scp ~/mydir/foo-$i user@backup_server:~/mydir/ & ) ; sleep 0.1 ; done
# Confirm destination has everything from source
# This is the point of the question. I've tried:
rsync -Sa ~/mydir/ user@backup_server:~/mydir
# Way too slow
What do you recommend?
By default, rsync uses its quick check method, which only transfers files whose size or last-modified time differs. Since you report that the file sizes are unchanged, this suggests that the timestamps differ. There are two options:
- Use -p with scp to preserve timestamps when transferring the files (for rsync itself, the equivalent flag is -t, which -a already includes).
- Use --size-only with rsync to ignore timestamps and transfer only files that differ in size.
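
If all you need is the list of invalid files rather than a transfer, the same size-only comparison can be sketched by hand. The helper below is hypothetical (the name invalid_backup_files and the "<size> <name>" listing format are my own choices, not part of rsync or scp), and it assumes file names contain no whitespace:

```shell
# Hypothetical helper: print names that are missing from the backup listing
# or whose size differs from the source listing.
# Both arguments are files with one "<size> <name>" pair per line.
# Assumption: file names contain no whitespace.
invalid_backup_files() {
    src_list=$1
    dst_list=$2
    awk '
        # First file (backup listing): remember each size by name.
        FILENAME == ARGV[1] { size[$2] = $1; next }
        # Second file (source listing): report missing or mismatched names.
        # Sizes are compared as strings so large values stay exact.
        !($2 in size) || (size[$2] "") != ($1 "") { print $2 }
    ' "$dst_list" "$src_list"
}

# Possible usage (GNU find assumed for -printf; not run here):
#   find ~/mydir -type f -printf '%s %P\n' > /tmp/src.txt
#   ssh user@backup_server "find ~/mydir -type f -printf '%s %P\n'" > /tmp/dst.txt
#   invalid_backup_files /tmp/src.txt /tmp/dst.txt
```

This only reads directory metadata on each side, so it should be quick even for large files. An rsync dry run (-n) combined with --size-only is another way to get a similar list without copying anything.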