Using awk to split a line with multiple string delimiters
I have a file pet_owners.txt that looks like this:
petOwner:Jane,petName:Fluffy,petType:cat
petOwner:John,petName:Oreo,petType:dog
...
petOwner:Jake,petName:Lucky,petType:dog
I would like to use awk to split a file with "petOwner", "petName" and "petType" delimiters so that I can retrieve pet owners and pet types. My desired output:
Jane,cat
John,dog
...
Jake,dog
So far I have tried:
awk < pet_owners.txt -F'['petOwner''petName''petType']' '{print $1 $3}'
but the result is a bunch of empty lines.
Any ideas on how I can achieve this?
$ awk -F'[:,]' -v OFS=',' '{print $2,$6}' file
Jane,cat
John,dog
Jake,dog
As for why your attempt didn't work, it's mainly because [ and ] in a regex context are the "bracket expression" delimiters, and what goes inside them is a set of characters (which can be individual characters, ranges, lists and/or classes). So when you wrote:
-F'['petOwner''petName''petType']'
that set FS to the character set p, e, t and so on, rather than to the set of strings petOwner, etc. The multiple internal ' characters cancel each other out as you jump in and out of the shell quoting for no reason, so it's the same as if you had written -F'[petOwnerpetNamepetType]', given that there are no metacharacters in there that the shell would expand.
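To see this for yourself, here is a quick sketch. The variable name fs_arg is only for illustration, and the sample record is the first line from the question:

```shell
# The shell removes the quotes before awk ever sees the argument,
# so all that is left is one bracket expression.
fs_arg='['petOwner''petName''petType']'
printf '%s\n' "$fs_arg"    # prints [petOwnerpetNamepetType]

# With that as FS, every single p, e, t, ... is a separator. The record
# starts with "pet", so $1 and $3 are both empty and awk prints an empty line.
printf '%s\n' 'petOwner:Jane,petName:Fluffy,petType:cat' |
  awk -F"$fs_arg" '{ print $1 $3 }'
```

That empty line per record is the "bunch of empty lines" from the question.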
To set FS to a set of strings (actually regexps, so watch out for metacharacters), you would write:
-F'petOwner|petName|petType'
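Note that with the key names alone as FS, the colons and commas stay attached to the fields (you get :Jane, rather than Jane). One way around that, as a rough sketch, is to fold the punctuation into the separator. The function name owner_and_type is made up for this example, and it assumes the keys always appear in the order shown in the question:

```shell
# Split on "petOwner:", ",petName:" and ",petType:" so only the values remain.
# $1 is the empty text before "petOwner:", $2 the owner, $3 the name, $4 the type.
owner_and_type() {
  awk -F'petOwner:|,petName:|,petType:' -v OFS=',' '{ print $2, $4 }' "$@"
}

# Sample records from the question, fed on stdin.
printf '%s\n' \
  'petOwner:Jane,petName:Fluffy,petType:cat' \
  'petOwner:John,petName:Oreo,petType:dog' |
  owner_and_type
```

To run it on the real file, call owner_and_type pet_owners.txt.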
You can also define what a field is, rather than what the delimiter is. To do this, use the FPAT variable (a GNU awk extension), for example:
~ $ awk '{ print $2,$6 }' FPAT="[^,:]+" OFS="," pet_owners.txt
Jane,cat
John,dog
This way, anything that is not a comma or colon counts as a field.
Sometimes this makes programs easier to write.
OFS sets the output field separator to a comma.