How to delimit a file by "\t\n" on a Mac
I have a document with records separated by "\t\n". Within a record, entries are separated by either "\t" or "\n".
Normally this should be a straightforward awk script:
BEGIN {
RS='\t\n';
}
{
print;
print "Next entry:";
}
However, on a Mac, regular expressions do not seem to be supported (maybe I'm doing something wrong?). So I tried RS="\t\n"; however, this is interpreted as RS='\t|\n'. Similar problems occur with awk from the command line:
awk 1 RS='\t\n' ORS='abc' input > output
This replaces the \t's but leaves the \n's alone.
Next try: use tr. This obviously does not work for a sequence of more than one character, since \t and \n are also used individually within records.
Next, sed:
sed -e '/\t\n/s//NextEntry:/g' input > output
However, this does not work. Putting any sequence of ordinary ASCII characters in place of \t\n works.
I read the manual. It says that \t is not supported in sed. Fair enough:
sed -e '/\x9\xa/s//abc/' input > output
Still not working. Idea: use tr to replace \t and \n with characters not used in the input file, use sed to change them to what I want, and then use tr to change the remaining characters back to what they should be.
tr: Illegal byte sequence
It turns out that the byte f6 makes tr fail completely.
I went through the suggestions in "Sed not recognizing \t, instead it treats it as 't' why?". They might work for replacing within lines of output (except for "insert a tab on the command line with Ctrl+V": the shell just rejected that input), but they didn't seem to help in my case.
Is it because it's a Mac? Maybe because of the text I'm trying to replace? Maybe it's the combination with \n?
Any other suggestions?
UPDATE:
I found the thread "How to replace newline (\n) with sed?". Apparently I can't even replace \n with the string "abc" using the suggestions in that thread.
EDIT: Hex dump of the head of the original file:
5a 20 4e 4f 09 0a 41 53 20 4f 46 20 30 31 2d 30
34 2d 30 35 20 45 4d 50 4c 4f 59 45 45 0a 47 52
4f 55 50 09 48 49 52 45 20 44 41 54 45 09 53 41
4c 41 52 59 09 4a 4f 42 20 54 49 54 4c 45 09 0a
4a 4f 42 20 4c 45 56 45 4c 0a 53 45 52 49 45 53
09 41 50 50 54 20 54 59 50 45 09 0a 50 41 59 20
53 54 41 54 55 53 0a f6
Unfortunately, BSD awk, as also used on macOS, does not support multi-character record separators (RS) at all (in line with POSIX): only a single, literal character is supported.
BSD sed, as also used on macOS, only supports the \n escape in regular expressions; any other escapes, including hexadecimal ones (for example \x09), are not supported.
See this answer for a comprehensive comparison between GNU and BSD sed.
Assuming your sed command works in principle, you can use an ANSI C-quoted string ($'\t') to splice a literal tab character into the sed script (this assumes bash, ksh, or zsh; bash was the macOS default shell until Catalina, zsh since then):
sed -e ':a' -e '$!{N;ba' -e '}' -e '/'$'\t''\n/s//NextEntry:/g'
Note that to replace newlines, you must instruct sed to read the entire file into memory first, which is what -e ':a' -e '$!{N;ba' -e '}' does (the BSD-compliant form of the common GNU sed idiom :a;$!{N;ba}).
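To try the approach on a small sample, here is a sketch that wraps the same sed script in a function. The name replace_tab_newline is hypothetical, the tab is produced with printf instead of $'\t' so that it also works in plain POSIX sh, and the LC_ALL=C prefix is an assumption: it makes sed treat the input as raw bytes, which may help with bytes such as the f6 in the question.

```shell
# Hypothetical wrapper around the sed approach; reads stdin or the named files.
replace_tab_newline() {
  tab=$(printf '\t')   # literal tab, for shells without $'\t' quoting
  # Slurp the whole input into the pattern space, then replace each tab+newline pair.
  LC_ALL=C sed -e ':a' -e '$!{N;ba' -e '}' -e "s/${tab}\\n/NextEntry:/g" "$@"
}

# Sample input modeled on the hex dump in the question.
printf 'Z NO\t\nAS OF\nGROUP\tHIRE DATE\t\nJOB LEVEL\n' | replace_tab_newline
```

One caveat: sed never sees the newline that ends the last line, so a "\t\n" at the very end of the input is left unreplaced.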