Comparing rowset and file using Perl
I am trying to write a Perl script that will detect the difference between a set of lines and a file, and print the lines of the file that do not match any line in the set.
My INPUT1 will look like: (rowset)
AAAAA
BBBBB
CCCCC
DDDDD
EEEEE
These are user IDs that should be passed to the script.
My INPUT2 will be a User.txt file with multiple IDs, including the ones above.
ABBAAA
ACARVAV
AAAAA
BBBBB
CCCCC
DDDDD
EEEEE
BGATA
ETYUIOL
I want my output to be like
ABBAAA
ACARVAV
BGATA
ETYUIOL
So far I have this:
my @things_to_find = qw(AAAAA BBBBB CCCCC DDDDD EEEEE);
my $comparefile = "User.txt";
open ( my $compare_filehandle, "<", $comparefile ) or die $!;
while ( my $line = <$compare_filehandle> )
{
foreach my $thing ( @things_to_find )
{
print "Match found with: $line" if $line !~ /$thing/;
}
}
But this does not give the desired result. I am very new to Perl so any suggestions from you will be very helpful to me.
Try:
use List::Util qw(none);
my @things_to_find = qw(AAAAA BBBBB CCCCC DDDDD EEEEE);
my $comparefile = "User.txt";
open ( my $compare_filehandle, "<", $comparefile ) or die $!;
while ( my $line = <$compare_filehandle> )
{
print $line if none { $line =~ /\b$_\b/} @things_to_find;
}
See the List::Util documentation.
You can try a simple grep with an exact-match test.
use strict;
use warnings;
use autodie;
my @users = qw(AAAAA BBBBB CCCCC DDDDD EEEEE);
my $file = "User.txt";
open my $fh, "<", $file;
while ( my $line = <$fh> ) {
chomp $line;
print "$line\n" unless grep { $line eq $_ } @users;
}
Note: grep and map are often a better choice than foreach or for when searching a list.
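As a small illustration of why the `unless grep { ... }` test works: in scalar context grep returns the number of elements that passed the test, and in list context it returns the elements themselves. The sketch below uses a hypothetical helper named `is_listed` and a made-up `@candidates` list; neither comes from the answer above.

```perl
use strict;
use warnings;

my @users = qw(AAAAA BBBBB CCCCC DDDDD EEEEE);

# Hypothetical helper: true if $id is equal to one of the entries in @users.
sub is_listed {
    my ($id) = @_;
    my $count = grep { $id eq $_ } @users;   # scalar context: number of matches
    return $count > 0;
}

# Made-up sample IDs, standing in for lines already read and chomped.
my @candidates = qw(ABBAAA AAAAA BGATA);

# List context: grep returns the elements for which the block is true.
my @unlisted = grep { !is_listed($_) } @candidates;
print "@unlisted\n";
```

Note that grep always scans the whole list, so for long lists a hash lookup may be the better fit.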
Your own code prints a line once for every list entry it does not contain, when what you want is the lines that are not equal to any of the entries. You need to change the containment test to an equality test; skip the line as soon as a match is found; and use chomp to remove the trailing newline from lines read from the file.
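Applied to the loop from the question, those three changes could look like the sketch below. The sub name `unlisted_lines` and the `LINE` label are illustrative choices, not part of the original code, and an in-memory filehandle stands in for User.txt so the sketch can be run on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines read from $fh that are not
# equal to any entry in @$wanted.
sub unlisted_lines {
    my ($fh, $wanted) = @_;
    my @result;
    LINE:
    while ( my $line = <$fh> ) {
        chomp $line;                         # remove the trailing newline
        foreach my $thing ( @$wanted ) {
            next LINE if $line eq $thing;    # equality test; skip on first match
        }
        push @result, $line;                 # no entry was equal to this line
    }
    return @result;
}

my @things_to_find = qw(AAAAA BBBBB CCCCC DDDDD EEEEE);

# Assumption: this string has the same content as User.txt in the question.
my $data = join '', map { "$_\n" }
    qw(ABBAAA ACARVAV AAAAA BBBBB CCCCC DDDDD EEEEE BGATA ETYUIOL);
open my $fh, '<', \$data or die $!;

print "$_\n" for unlisted_lines($fh, \@things_to_find);
```

With the sample data from the question this prints ABBAAA, ACARVAV, BGATA and ETYUIOL, one per line, in file order.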
There are two obvious ways to write this. The first is to create a hash, which is essentially an array that is indexed by strings instead of integers. If you fill the hash using the entries from the file and then delete the ones in the list, the keys that remain are the IDs you want. It looks like this:
use strict;
use warnings;
my $comparefile = 'User.txt';
my @users = qw/ AAAAA BBBBB CCCCC DDDDD EEEEE /;
open my $users_fh, '<', $comparefile or die $!;
my %file_users;
while (my $user = <$users_fh> ) {
chomp $user;
$file_users{$user} = 1;
}
delete $file_users{$_} for @users;
print "$_\n" for sort keys %file_users;
Output
ABBAAA
ACARVAV
BGATA
ETYUIOL
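One thing to be aware of: because hash keys are unique and unordered, the program above collapses duplicate IDs and prints them sorted rather than in file order. If file order matters, a different arrangement (a sketch, not the approach above) is to put the listed IDs in the hash and test each line of the file against it. The sub name `not_in_list` and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines of $fh that are not keys of a
# lookup hash built from @$exclude. File order and duplicates are kept.
sub not_in_list {
    my ($fh, $exclude) = @_;
    my %skip = map { ( $_ => 1 ) } @$exclude;   # listed ID => 1
    my @kept;
    while ( my $user = <$fh> ) {
        chomp $user;
        push @kept, $user unless $skip{$user};  # one hash lookup per line
    }
    return @kept;
}

my @users = qw/ AAAAA BBBBB CCCCC DDDDD EEEEE /;

# Assumption: a short made-up sample standing in for User.txt.
my $data = "ETYUIOL\nAAAAA\nABBAAA\nEEEEE\n";
open my $fh, '<', \$data or die $!;

print "$_\n" for not_in_list($fh, \@users);
```

Here ETYUIOL is printed before ABBAAA, because that is the order in which they appear in the sample input.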
Another way is to build a regular expression from the strings and use it to pick out the lines of the file to ignore. It looks like the program below, and for this input the result is identical to that of the previous program. This solution will be faster, but it involves some more advanced ideas such as regular expressions and map, so you might prefer the former.
use strict;
use warnings;
my $comparefile = 'User.txt';
my @users = qw/ AAAAA BBBBB CCCCC DDDDD EEEEE /;
my $re = join '|', map "^\Q$_\E\$", @users;
$re = qr/$re/;
open my $users_fh, '<', $comparefile or die $!;
my @file_users;
while (my $user = <$users_fh> ) {
chomp $user;
push @file_users, $user unless $user =~ $re;
}
print "$_\n" for sort @file_users;