Using awk to split a line with multiple string delimiters

I have a file pet_owners.txt that looks like this:

petOwner:Jane,petName:Fluffy,petType:cat
petOwner:John,petName:Oreo,petType:dog
...
petOwner:Jake,petName:Lucky,petType:dog

      

I would like to use awk to split each line on the "petOwner", "petName" and "petType" delimiters so that I can retrieve the pet owners and pet types. My desired output:

Jane,cat
John,dog
...
Jake,dog

      

So far I have tried:

awk < pet_owners.txt -F'['petOwner''petName''petType']' '{print $1 $3}'

      

but the result is a bunch of empty lines.

Any ideas on how I can achieve this?



3 answers


$ awk -F'[:,]' -v OFS=',' '{print $2,$6}' file
Jane,cat
John,dog
Jake,dog
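
To see why $2 and $6 are the fields to pick, it can help to print every field next to its index. A minimal sketch, assuming a one-line sample copied from the question (the loop is only for illustration and is not part of the answer above):

```shell
# Sample line copied from the question.
line='petOwner:Jane,petName:Fluffy,petType:cat'

# Split on ':' or ',' and print each field with its index.
fields=$(printf '%s\n' "$line" | awk -F'[:,]' '{ for (i = 1; i <= NF; i++) print i "=" $i }')
printf '%s\n' "$fields"
```

Fields 2 and 6 hold the owner and the pet type, which is what the print statement selects.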

      

As for why your attempt didn't work, it's mainly because [ and ] in a regexp context are bracket expression delimiters, and what goes inside them is a set of characters (which can be individual characters, ranges, lists and/or classes). So when you wrote:

-F'['petOwner''petName''petType']'

that set FS to the set of characters p, e, t and so on, not to the set of strings petOwner, etc. The multiple internal ' characters cancel each other out as you jump in and out of shell quoting for no reason, so it is the same as writing -F'[petOwnerpetNamepetType]', given that there are no metacharacters in there that the shell would expand.
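
The effect of the bracket expression can be checked directly. A minimal sketch, assuming the same sample line as in the question (the variable names are made up for the illustration): since p, e and t are each separators on their own, the first fields are all empty, and $1 $3 prints nothing but the brackets added here to make that visible.

```shell
line='petOwner:Jane,petName:Fluffy,petType:cat'

# What the shell actually hands to awk once the quotes are removed:
fs_arg='['petOwner''petName''petType']'
printf '%s\n' "$fs_arg"

# With that FS every listed character is a separator, so $1 and $3 are empty.
out=$(printf '%s\n' "$line" | awk -F"$fs_arg" '{ print "[" $1 $3 "]" }')
printf '%s\n' "$out"
```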

To set FS to a set of strings (actually regexps, so watch out for metacharacters), you would write:

-F'petOwner|petName|petType'
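
Note that with this FS the ':' and ',' characters are not part of the separator, so they stay attached to the fields. A minimal sketch on the sample line from the question (the '|' is used only to make the field boundaries visible):

```shell
line='petOwner:Jane,petName:Fluffy,petType:cat'

# $1 is the empty string before "petOwner"; $2..$4 keep their punctuation.
out=$(printf '%s\n' "$line" | awk -F'petOwner|petName|petType' '{ print $2 "|" $3 "|" $4 }')
printf '%s\n' "$out"
```

To get clean values, the punctuation has to be made part of the separator regexp as well.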

      



You can also write the delimiters as a regexp with alternation instead of a character set:



$ awk -F'pet(Owner|Name|Type):' '{print $2,$4}' file
Jane, cat
John, dog
Jake, dog
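
The output above still carries the comma from the input plus the default space from OFS, so it does not match the desired Jane,cat exactly. One possible variant, offered as a sketch rather than as part of the answer, lets the separator regexp swallow an optional leading comma and sets OFS explicitly. The temporary file and its three lines are assumptions copied from the question:

```shell
# Recreate a small sample of the question's input in a temporary file.
tmp=$(mktemp)
printf '%s\n' \
  'petOwner:Jane,petName:Fluffy,petType:cat' \
  'petOwner:John,petName:Oreo,petType:dog' \
  'petOwner:Jake,petName:Lucky,petType:dog' > "$tmp"

# ",?" makes the comma in front of each key part of the separator.
result=$(awk -F',?pet(Owner|Name|Type):' -v OFS=',' '{ print $2, $4 }' "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```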

      



You can also define what a field is, rather than what the delimiter is. To do this, use the FPAT variable (a GNU awk extension), for example:

~ $ awk '{ print $2,$6 }' FPAT="[^,:]+" OFS="," pet_owners.txt
Jane,cat
John,dog

      

In this way, you define as a field anything that is not a comma or colon.

Sometimes this makes programs easier to write.

OFS sets the output field separator to a comma.
