Concatenate similar strings using regular expression in perl

Question

Hey. I am in the process of sorting a file and I want to make it easier to read by concatenating similar lines. The data is already sorted alphabetically by the first word in each line. So far, my program only reads the lines into an array and prints them.

The text file contains:

Network ubuntu Jan  1 13:42:13 : <info>  DHCP: device eth5 state changed bound -> renew Network
Network ubuntu Jan  2 13:42:42 : <info>    prefix 24 (255.255.255.0) Network
Network ubuntu Jan  2 12:11:42 : <info>  DHCP: device eth5 state changed bound -> renew Network
testing ubuntu Jan  1 01:13:42 : DHCPACK of 192.168.233.129 from 192.168.233.254 testing
testing ubuntu Jan  2 13:54:42 : DHCPACK of 192.168.233.129 from 192.168.233.254 testing
testing ubuntu Jan  3 13:02:42 : DHCPACK of 192.168.233.129 from 192.168.233.254 testing

My program:

#!/usr/bin/perl 

$FILE = '/computer/testfile.txt';
open(INFO, $FILE);

while($line = <INFO>){
    push(@array, $line);
}

print @array;

I want to use a regex to merge any lines that are identical except for the date/time stamp. The merged line should contain the number of lines merged, in parentheses, followed by the earliest and most recent date/time stamps. If a line has no similar lines, it should be left unchanged.

Expected end result

Network ubuntu Jan  2 13:42:42 : <info>    prefix 24 (255.255.255.0) Network
Network ubuntu (2) Jan  1 13:42:13-Jan  2 12:11:42: <info>  DHCP: device eth5 state changed bound -> renew Network
testing ubuntu (3) Jan  1 01:13:42-Jan  3 13:02:42 : DHCPACK of 192.168.233.129 from 192.168.233.254 testing

Any help or guidance would be greatly appreciated. Thanks.

+3

arrays merge regex perl

user3423572 · 2015-01-03 00:08

1 answer

choroba · Answer 1 · 2015-01-03T01:28:02+0000

You can use Time::Piece to parse the dates. Note that because the timestamps carry no year, you cannot sort them reliably across a year boundary.
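To illustrate the note about the missing year, here is a minimal sketch. The helper name `parse_stamp` and the December timestamp are illustrative assumptions, not part of the original answer. Since every timestamp is parsed without a year, they all fall into the same default year, so a December entry always sorts after a January one, even if it was actually logged in the previous year.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use Time::Piece;

# Hypothetical helper: parse a syslog-style timestamp that has no year.
sub parse_stamp {
    my ($stamp) = @_;
    return Time::Piece->strptime($stamp, '%b %d %H:%M:%S');
}

# Two timestamps from the question plus an invented December one.
my @stamps = ('Jan  2 12:11:42', 'Dec 31 23:59:59', 'Jan  1 13:42:13');

# Compare by epoch seconds; all values share the same default year.
my @sorted = sort { parse_stamp($a)->epoch <=> parse_stamp($b)->epoch } @stamps;

print "$_\n" for @sorted;
```

If the log can span New Year, the year would have to come from somewhere else, for example from the file's modification time.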

Just read the sorted file line by line. If the information is the same as on the previous line, record the timestamp; otherwise, output the previously accumulated information and start accumulating a new group. Note that this only merges similar lines that are adjacent in the input.

#!/usr/bin/perl
use warnings;
use strict;

use Time::Piece;


sub output {
    my ($pre, $post, @timestamps) = @_;
    if (@timestamps > 1) {
        @timestamps = map $_->[0],                 # Use Schwartzian Transform to sort by timestamp.
                      sort { $a->[1] <=> $b->[1] }
                      map [ $_, 'Time::Piece'->strptime($_, '%b %d %H:%M:%S') ],
                      @timestamps;
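        # $pre already ends with a space and $post starts with one,
        # so no extra space is needed around them.
        print "$pre(", scalar @timestamps, ") ",
              $timestamps[0], '-', $timestamps[-1],
              $post, "\n";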
    } else {
        print "$pre$timestamps[0]$post\n";
    }
}


my @last;
my @timestamps;
while (<>) {
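    # Under /x the literal spaces in the pattern are ignored, so $pre keeps
    # its trailing space and $post keeps its leading space.
    my ($pre, $timestamp, $post)
        = /(.*?) ([ADFJMNOS][aceopu][bcglnprtvy]\s+[0-9]+\s[0-9:]+) (.*)/x;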

    if (@last and $pre eq $last[0] and $post eq $last[1]) {
        push @timestamps, $timestamp;

    } else {
        output(@last, @timestamps) if @timestamps;
        @last = ($pre, $post);
        @timestamps = ($timestamp);
    }
}
output(@last, @timestamps) if @timestamps; # Don't forget to output the last batch (skipped on empty input).
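The loop above only merges lines that sit next to each other. In the sample data from the question, the two "DHCP: device eth5" lines are separated by the "prefix 24" line, so they would not be merged. A possible variation, sketched below, groups lines in a hash keyed on the text around the timestamp, so adjacency no longer matters. The names `group_lines` and `format_group`, the simplified month pattern, and the exact output spacing are assumptions of this sketch, not part of the original answer.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use Time::Piece;

# Text before the timestamp, the timestamp itself, and the text after it.
# Simplified pattern: any capitalised three-letter word counts as a month.
my $LINE_RE = qr/^(.*?) ([A-Z][a-z]{2}\s+\d+\s[\d:]+) (.*)$/;

# Format one group: single lines are returned unchanged, larger groups get
# a count plus the earliest and latest timestamp.
sub format_group {
    my ($g) = @_;
    my @stamps = @{ $g->{stamps} };
    return $g->{line} if @stamps < 2;

    # Schwartzian transform: sort the raw strings by their parsed epoch.
    my @sorted = map  { $_->[0] }
                 sort { $a->[1] <=> $b->[1] }
                 map  { [ $_, Time::Piece->strptime($_, '%b %d %H:%M:%S')->epoch ] }
                 @stamps;

    return sprintf '%s (%d) %s-%s %s',
        $g->{pre}, scalar @sorted, $sorted[0], $sorted[-1], $g->{post};
}

# Group lines that differ only in their timestamp, keeping the order in
# which each group first appears.
sub group_lines {
    my @lines = @_;
    my (%group, @order);
    for my $line (@lines) {
        chomp $line;
        my ($pre, $stamp, $post) = $line =~ $LINE_RE;
        if (!defined $stamp) {    # no timestamp found: pass through as-is
            push @order, { line => $line, stamps => [] };
            next;
        }
        my $key = "$pre\0$post";
        if (!$group{$key}) {
            $group{$key} = { pre => $pre, post => $post,
                             line => $line, stamps => [] };
            push @order, $group{$key};
        }
        push @{ $group{$key}{stamps} }, $stamp;
    }
    return map { format_group($_) } @order;
}

# Sample data copied from the question.
my @sample = (
    'Network ubuntu Jan  1 13:42:13 : <info>  DHCP: device eth5 state changed bound -> renew Network',
    'Network ubuntu Jan  2 13:42:42 : <info>    prefix 24 (255.255.255.0) Network',
    'Network ubuntu Jan  2 12:11:42 : <info>  DHCP: device eth5 state changed bound -> renew Network',
    'testing ubuntu Jan  1 01:13:42 : DHCPACK of 192.168.233.129 from 192.168.233.254 testing',
    'testing ubuntu Jan  2 13:54:42 : DHCPACK of 192.168.233.129 from 192.168.233.254 testing',
    'testing ubuntu Jan  3 13:02:42 : DHCPACK of 192.168.233.129 from 192.168.233.254 testing',
);

print "$_\n" for group_lines(@sample);
```

To process a real file instead of the embedded sample, `group_lines(<>)` could be called and the script run as `perl script.pl /computer/testfile.txt`. The trade-off is that all groups are held in memory until the end of the input, whereas the line-by-line version above prints each group as soon as it is complete.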
