How to delimit a file by "\t\n" on a Mac

I have a document with lines delimited by "\t\n". Entries are delimited by either "\t" or "\n".

Normally this would be a straightforward job for awk:

BEGIN {
   RS='\t\n';
}
{
   print;
   print "Next entry:";
}

      

However, on a Mac, regular expressions do not seem to be supported in RS (or maybe I'm doing something wrong?). So I tried RS="\t\n"; however, this is interpreted as RS='\t|\n'. There are similar problems with awk from the command line:

awk 1 RS='\t\n' ORS='abc' input > output

      

This replaces the \t's but leaves the \n's alone.

Next try: use tr. This obviously does not work for a sequence of more than one character, since \t and \n are also used individually within the lines.

Next attempt:

sed -e '/\t\n/s//NextEntry:/g' input > output

      

However, this does not work. Using any sequence of ordinary ASCII characters in place of \t\n does work.

I read the manual. It says that \t is not supported in sed. Fair enough:

sed -e '/\x9\xa/s//abc/' input > output

      

Still not working. Idea: use tr to replace \t and \n with characters not used in the input file, use sed to change them to what I want, and then use tr to change the remaining characters back to what they should be.

tr: Illegal byte sequence

      

It turns out that the byte f6 makes tr fail completely.

I went through the suggestions in "Sed not recognizing \t, instead it treats it as 't', why?". They might work for replacements within a line (except for "insert a tab on the command line with Ctrl+V"; the shell just rejected that), but they did not seem to help in my case.

Is it because it's a Mac? Is it because of the text I'm trying to replace? Or is it the combination with \n?

Any other suggestions?

UPDATE:

I found the thread "How to replace newline (\n) with sed?". Apparently I can't even replace \n with the string "abc" using the suggestions in that thread.

EDIT: Hex dump of the head of the original file:

5a 20 4e 4f 09 0a 41 53  20 4f 46 20 30 31 2d 30
34 2d 30 35 20 45 4d 50  4c 4f 59 45 45 0a 47 52  
4f 55 50 09 48 49 52 45  20 44 41 54 45 09 53 41 
4c 41 52 59 09 4a 4f 42  20 54 49 54 4c 45 09 0a  
4a 4f 42 20 4c 45 56 45  4c 0a 53 45 52 49 45 53  
09 41 50 50 54 20 54 59  50 45 09 0a 50 41 59 20  
53 54 41 54 55 53 0a f6

      



1 answer


Unfortunately, BSD awk, which is also used on macOS, does not support multi-character record separators (RS) at all (in line with POSIX); only a single, literal character is supported.
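One way around the single-character limit, offered here as a minimal sketch rather than as part of the original answer, is to keep the default newline record separator and treat any line that ends in a tab as the end of an entry. The function name `split_records` and the file name `sample.txt` are made up for illustration, and the sample data only loosely follows the hex dump in the question.

```shell
# Hypothetical helper: print each "\t\n"-terminated entry, then a marker line.
# It only relies on single-character RS (the default newline), so it should
# also run on awk implementations that lack multi-character RS support.
split_records() {
    awk '
    {
        if (sub(/\t$/, "")) {        # a trailing tab closes the current entry
            print buf $0
            print "Next entry:"
            buf = ""
        } else {
            buf = buf $0 "\n"        # a bare newline stays inside the entry
        }
    }
    END { if (buf != "") printf "%s", buf }   # text after the last separator
    ' "$@"
}

# Assumed sample input: entries end in tab+newline, and single tabs and
# newlines also appear inside entries.
printf 'Z NO\t\nAS OF\nGROUP\tHIRE DATE\t\nJOB LEVEL\nSERIES\t\n' > sample.txt
split_records sample.txt
```

Because the trailing tab is stripped with `sub()` before printing, each entry is emitted without its separator, which mirrors what the `print; print "Next entry:"` script in the question was meant to do.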

BSD sed, which is also used on macOS, only supports the \n escape in regular expressions; any other escapes, including hexadecimal ones (for example \x09), are not supported.
See this answer for a comprehensive comparison between GNU and BSD sed.

Assuming your sed command works in principle, you can use an ANSI C-quoted string ($'\t') to splice a literal tab character into the sed script (assuming bash, ksh, or zsh, which covers the default shells on macOS):



sed -e ':a' -e '$!{N;ba' -e '}' -e '/'$'\t''\n/s//NextEntry:/g'

      

Note that to replace newlines, you must instruct sed to read the entire file into memory first, which is what -e ':a' -e '$!{N;ba' -e '}' does (this is the BSD-compatible form of the common GNU sed idiom :a;$!{N;ba}).
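To make the pieces concrete, here is a hedged sketch that runs the same sed script on made-up sample data. Two things differ from the command above and are assumptions on my part: the tab is obtained with `printf` instead of `$'\t'`, so the snippet also works in plain POSIX shells, and `LC_ALL=C` is set so that a stray byte such as the `f6` in the question's hex dump is treated as a plain byte rather than as an invalid multibyte sequence. The function name `replace_tab_newline` is hypothetical.

```shell
# Hypothetical helper: replace every tab+newline pair on stdin with "NextEntry:".
replace_tab_newline() {
    tab=$(printf '\t')    # literal tab character, spliced into the script below
    # ':a' ... '}' slurps the whole input into the pattern space first;
    # the last expression then matches the literal tab followed by \n.
    LC_ALL=C sed -e ':a' -e '$!{N;ba' -e '}' -e "/${tab}"'\n/s//NextEntry:/g'
}

# Assumed sample input, ending in a lone 0xf6 byte (octal 366).
printf 'Z NO\t\nAS OF\nGROUP\tHIRE DATE\t\nPAY STATUS\n\366\n' | replace_tab_newline
```

One caveat: sed keeps the final newline of the input out of the pattern space, so a tab+newline pair at the very end of the file is not matched by this expression.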
