Using sed, awk, etc. to split after midpoint characters

Question

I could use your help with something; I promise I tried very hard to find an answer, but had no luck.

I want to split the text between each occurrence of the "·" (midpoint) character (by syllables, mostly).

echo con·grat·u·late | sed -e 's/·.*$/·/1'

The above command outputs:

con·

This is the first part of what I want, but ultimately I would like to get:

con·
grat·
u·
late

This will involve getting the characters between the 1st and 2nd occurrences of "·".

If anyone can point me in the right direction, I would greatly appreciate it, and I'll keep working on it myself in the meantime.

EDIT My apologies, I did not display the desired result correctly. However, your solution worked great.

Since it is important for me to keep everything as one line, how would I extract the text between the first midpoint and the second in order to output:

grat·

I am doing this in UTF-8, Jonathan.

Once again, sorry for asking for the wrong thing.

+3

bash regex awk sed

TuxForLife Dec 23 '14 at 23:13

4 answers

In GNU sed, you can do this:

echo con·grat·u·late | sed -e 's/·/&\n/g'

& denotes the matched text, in this example ·. Unfortunately, this does not work in BSD sed, which does not interpret \n in the replacement as a newline.

For a more portable solution, I recommend this awk command, which should work on both GNU and BSD systems:

echo con·grat·u·late | awk '{ gsub("·", "&\n") } 1'
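
sed itself can also be made portable. The following is a minimal sketch, assuming a POSIX-compliant sed: a backslash followed by a literal newline in the replacement should produce the line break on both GNU and BSD sed. The variable names word and split are only illustrative.

```shell
# Illustrative input; "word" and "split" are hypothetical names.
word='con·grat·u·late'

# A backslash followed by a real newline inside the replacement is the
# POSIX way to insert a line break, so no \n escape is needed.
split=$(printf '%s\n' "$word" | sed 's/·/&\
/g')

# Print one piece per line, each still ending in its midpoint.
printf '%s\n' "$split"
```

The command has to span two lines for this to work, which is less convenient to type interactively than the awk version.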

+3

janos Dec 23 '14 at 23:21

You can use simple awk to separate these words:

$ echo 'con.grat.u.late' | awk -F. '{print $1}'
con
$ echo 'con.grat.u.late' | awk -F. '{print $2}'
grat
$ echo 'con.grat.u.late' | awk -F. '{print $3}'
u
$ echo 'con.grat.u.late' | awk -F. '{print $4}'
late

$ echo 'con.grat.u.late' | awk -F. '{for(i=1;i<=NF;i++){print $i}}' 
con
grat
u
late

-F. tells awk to use . as the field separator.
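
The same field-splitting idea can be applied to the midpoint character itself, which is what the question's edit asks for (the second piece, with its midpoint, on one line). This is only a sketch, assuming a UTF-8 environment and an awk that accepts · as the -F argument; word and second are hypothetical names.

```shell
# Illustrative input; "word" and "second" are hypothetical names.
word='con·grat·u·late'

# -F'·' makes the midpoint the field separator. The separator is
# consumed by the split, so FS is appended to put it back.
second=$(printf '%s\n' "$word" | awk -F'·' '{ print $2 FS }')

# Print the second piece followed by its midpoint.
printf '%s\n' "$second"
```

Changing $2 to another field number selects a different piece. The last piece has no trailing midpoint in the input, so FS should be left off for that one.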

+2

anishsane Dec 24 '14 at 11:25

Just

echo con·grat·u·late | sed -e 's/·/·\n/g'

which replaces each · with · followed by a newline.

+1

Wintermute Dec 23 '14 at 23:18

repzero · Accepted Answer · 2014-12-24T01:12:07+0000

Since you want to print the characters between dots, you can try sed like this:

echo 'con.grat.u.late'|sed 's/\.*\./&\n/g'|sed  -n 2p|tr -d '.'

To print the group of characters between the 1st and 2nd dots:

echo 'con.grat.u.late'|sed 's/\.*\./&\n/g'|sed  -n 2p|tr -d '.'

results

grat

Note: I am using 2p to print the characters between the 1st and 2nd dots.

To print the group of characters between the 2nd and 3rd dots:

echo 'con.grat.u.late'|sed 's/\.*\./&\n/g'|sed  -n 3p|tr -d '.'

results

u

Note: I am using 3p to print the characters between the 2nd and 3rd dots.

You can do everything with sed as well, but I am using the tr command so that it is easy for you to follow. The tr command removes the dots before printing. If you want to keep the dots, drop |tr -d '.' from the command line.
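
As an illustration of the "everything with sed" remark, here is one possible single-sed variant. It is a sketch rather than the answerer's own method: it uses a capture group to pick out the text between the 1st and 2nd dots, and the names word and between are hypothetical.

```shell
# Illustrative input; "word" and "between" are hypothetical names.
word='con.grat.u.late'

# ^[^.]*\.      skips everything up to and including the 1st dot
# \([^.]*\)     captures the characters up to the 2nd dot
# \..*          discards the 2nd dot and the rest of the line
# -n with the p flag prints only lines where the substitution matched.
between=$(printf '%s\n' "$word" | sed -n 's/^[^.]*\.\([^.]*\)\..*/\1/p')

printf '%s\n' "$between"
```

This avoids the second sed and the tr call, at the cost of a denser regular expression.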

You can also print ranges of character groups:

echo 'con.grat.u.late'|sed 's/\.*\./&\n/g'|sed  -n 1,3p|tr -d '.'

results

con
grat
u
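
The same range can also be selected with cut, which treats the dot as a field delimiter directly. This is an alternative sketch, not part of the original answer; word and first_three are hypothetical names.

```shell
# Illustrative input; "word" and "first_three" are hypothetical names.
word='con.grat.u.late'

# cut keeps fields 1 through 3 (still joined by dots), and tr then
# turns each remaining dot into a newline.
first_three=$(printf '%s\n' "$word" | cut -d. -f1-3 | tr '.' '\n')

printf '%s\n' "$first_three"
```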