Awk | merge line based on field mapping

I need help with the following:

Input file:

abc message=sent session:111,x,y,z
pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z
pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z
abc message=sent session:589,x,y,z
pqr message=receive session:589,4,5,7

      

Output file:

abc message=sent session:111,x,y,z, pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z, pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z, NOMATCH
abc message=sent session:589,x,y,z, pqr message=receive session:589,4,5,7

      

Notes:

If you see in the source file, for every "sent" message there is a "receive"
for session only = 342 no response
session unknown, cannot be hardcoded
So combine only those sent and received where we have the corresponding session number

+3


source to share


2 answers


Another way:

awk -F "[:,]"  '/=sent/{a[$2]=$0;}/=receive/{print a[$2], $0;delete a[$2];}END{for(i in a)print a[i],"NO MATCH";}' file

      

Results:



abc message=sent session:111,x,y,z pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z pqr message=receive session:123,4,5,7
abc message=sent session:589,x,y,z pqr message=receive session:589,4,5,7
abc message=sent session:342,x,y,z NO MATCH

      

When a record is encountered send

, it is an array in an array with the session id as the index. When a record is encountered receive

, the record send

is retrieved from the array and printed along with the record receive

. In addition, sent records are removed from the array as and when receive

records are received . In END, all other entries in the array are printed as NO MATCH.

+1


source


Here is one way: awk

. Run as:

awk -f script.awk file

      

Contents script.awk

:

{
    x = $0

    gsub(/[^:]*:|,.*/,"")

    a[$0] = (a[$0] ? a[$0] "," FS : "") x
    b[$0]++
}

END {
    for (i in a) {
        print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") | "sort"
    }
}

      

Results:



abc message=sent session:111,x,y,z, pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z, pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z, NOMATCH
abc message=sent session:589,x,y,z, pqr message=receive session:589,4,5,7

      

Alternatively, here's a one-liner:

awk '{ x = $0; gsub(/[^:]*:|,.*/,""); a[$0] = (a[$0] ? a[$0] "," FS : "") x; b[$0]++ } END { for (i in a) print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") | "sort" }' file

      

Note that you can drop the pipe before sort

if you don't need the sorted output. NTN.

+1


source







All Articles