Awk | merge lines based on field mapping
I need help with the following:
Input file:
abc message=sent session:111,x,y,z
pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z
pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z
abc message=sent session:589,x,y,z
pqr message=receive session:589,4,5,7
Output file:
abc message=sent session:111,x,y,z, pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z, pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z, NOMATCH
abc message=sent session:589,x,y,z, pqr message=receive session:589,4,5,7
Notes:
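If you look at the source file, every "sent" message normally has a matching "receive" message for the same session.
The only exception is session 342, which has no response.
Session numbers are not known in advance, so they cannot be hardcoded.
So combine only those sent and receive lines that share the same session number, and mark the rest as NOMATCH.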
Another way:
awk -F "[:,]" '/=sent/{a[$2]=$0;}/=receive/{print a[$2], $0;delete a[$2];}END{for(i in a)print a[i],"NO MATCH";}' file
Results:
abc message=sent session:111,x,y,z pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z pqr message=receive session:123,4,5,7
abc message=sent session:589,x,y,z pqr message=receive session:589,4,5,7
abc message=sent session:342,x,y,z NO MATCH
When a sent record is encountered, it is stored in an array with the session id as the index. When a receive record is encountered, the matching sent record is retrieved from the array and printed along with the receive record. In addition, sent records are deleted from the array as their receive records arrive. In END, all entries still left in the array are printed as NO MATCH. Note that the unmatched lines therefore come last, and in no guaranteed order, because awk does not define the traversal order of for (i in a).
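If the exact format from the question matters (a ", " separator, NOMATCH as one word, and lines in input order), the same idea can be adapted. The following is only a sketch, not part of the answer above: the function name merge_sessions and the arrays order, sent and recv are made up for illustration, and it assumes each session id appears in at most one sent line.

```shell
# merge_sessions: hypothetical helper name, used only for this sketch.
# Reads the named files (or stdin) and prints one line per "sent" record.
merge_sessions() {
    awk -F '[:,]' '
        /=sent/ {
            # remember the order in which session ids first appear
            if (!($2 in sent)) order[++n] = $2
            sent[$2] = $0
            next
        }
        # keep a receive line only if its sent line was already seen
        /=receive/ && ($2 in sent) { recv[$2] = $0 }
        END {
            for (k = 1; k <= n; k++) {
                id = order[k]
                print sent[id] ", " ((id in recv) ? recv[id] : "NOMATCH")
            }
        }
    ' "$@"
}
```

Run it as merge_sessions file. Because the output is produced from the order array, no external sort is needed.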
Here is one way using awk. Run as:
awk -f script.awk file
Contents of script.awk:
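{
    x = $0                                  # keep the original line
    gsub(/[^:]*:|,.*/,"")                   # reduce $0 to the session id
    a[$0] = (a[$0] ? a[$0] "," FS : "") x   # append the line to its session's entry
    b[$0]++                                 # count the lines seen for this session
}
END {
    for (i in a) {
        # two lines for a session means a matched pair; otherwise flag it
        print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") | "sort"
    }
}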
Results:
abc message=sent session:111,x,y,z, pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z, pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z, NOMATCH
abc message=sent session:589,x,y,z, pqr message=receive session:589,4,5,7
Alternatively, here's a one-liner:
awk '{ x = $0; gsub(/[^:]*:|,.*/,""); a[$0] = (a[$0] ? a[$0] "," FS : "") x; b[$0]++ } END { for (i in a) print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") | "sort" }' file
Note that you can drop the pipe to sort if you don't need sorted output. HTH.
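To see what the gsub call in this answer does, the key-extraction step can be tried on its own. This is a small illustration only; session_key is a made-up name, and it assumes, as in the sample input, that the only colon on the line is the one directly before the session id.

```shell
# session_key: hypothetical helper name, used only for this illustration.
# Deletes everything up to and including the colon, and everything from
# the first comma onward, leaving just the session id.
session_key() {
    awk '{ gsub(/[^:]*:|,.*/, ""); print }' "$@"
}

echo 'abc message=sent session:111,x,y,z' | session_key   # prints 111
```

The script then uses that id as the index of both arrays, which is why the sent and receive lines of one session end up in the same entry.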