Vim: multiline regex replacement over a range

I am trying to reformat a hierarchical (xml) file to a "per line" file using vim.

Here's a simplified example. The real case is "big" (500k lines), with arbitrary numbers of groups and entries.

Input file:

<group key="abc">
  <entry val="1"/>
  <entry val="2"/>
  <entry val="3"/>
</group>
<group key="xyz">
  <entry val="1"/>
  <entry val="2"/>
  <entry val="3"/>
  <entry val="4"/>
  <entry val="5"/>
</group>

      

Desired output:

abc,1
abc,2
abc,3
xyz,1
xyz,2
xyz,3
xyz,4
xyz,5

      

Note that I don't need one magic expression that does all of this (although that would be great). The part I'm struggling with is getting the key associated with each of the entries. I'm sure there is a good idiom for this. Thanks in advance.

One thing I tried, which may be helpful to others, is this:

:g/key="\(.*\)"/.;/<\/group/s/<entry /\1,<entry /g

      

which doesn't work because the group captured by the :g pattern doesn't carry over to the substitution: \1 in the replacement refers to the :s pattern, which has no capture group of its own. The expression essentially searches for pat1, builds a range from there to the next pat2, and then replaces pat3 with pat4 within that range (inclusive):

:g/pat1/.;/pat2/s/pat3/pat4/g
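
One way to work around the lost capture is to look the key up again for each entry line instead of trying to carry it through the range. The following is only a sketch under some assumptions: every entry has a val="..." attribute, every group line has a key="..." attribute, and the keys contain none of the characters /, &, ~ or \ (they would be interpreted specially in the replacement). It uses the built-in functions search(), getline() and matchstr().

```vim
" For every entry line: find the nearest preceding <group line (flags b =
" backwards, n = do not move the cursor, W = do not wrap around), pull the
" key out of it, and rewrite the entry line as key,val.
g/<entry/execute 's/.*val="\([^"]*\)".*/' . matchstr(getline(search('<group', 'bnW')), 'key="\zs[^"]*') . ',\1/'
" The opening and closing group tags are no longer needed.
g/group/d
```

Because the substitute command is assembled with :execute, the key is already plain text by the time :s runs, so only the val capture has to survive as \1.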

      

Solution

The best answer below solved it by searching for each entry and then looking backwards for its key, as opposed to what I was attempting above with a range and multiple substitutions. It ultimately needed some minor modifications, so they are presented here for others. The commands that do the heavy lifting are as follows:

:g/entry/?key?,\?t.
:g/entry/norm ddpkJ
:v/entry/d

      

Breakdown:

Find all entry lines:

:g/entry/

      

From each of them, search backwards for the key line and copy it below the entry.

?key?,\?t.

      

Find all entry lines again and run normal-mode commands on each

:g/entry/norm

      

Swap the two lines (delete the entry line and put it back below the key line), move up one line, and join the two lines.

ddpkJ

      

Once every entry has its key, find any lines that do NOT contain an entry and delete them.

:v/entry/d

      

If you have multiple levels of hierarchy, as I do, you can run the first two commands multiple times. Once everything is on the same line, it is fairly straightforward to clean it up into whatever the final format is. Another major advantage is that this solution can easily be put in a script and repeated with

vim -S script.vim data.file
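
As a sketch, script.vim could simply hold the three commands from above without the leading colons. The comment lines are additions for readability; they sit on their own lines because :g and :normal treat the rest of the line as part of the command.

```vim
" script.vim
" Copy the nearest preceding key line below each entry line.
g/entry/?key?,\?t.
" Swap each entry with its copied key line and join the two.
g/entry/normal ddpkJ
" Delete every line that does not contain an entry.
v/entry/d
```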

      



2 answers


The following worked for me:

:g/entry/?<group?,?<group?t.
:%norm J
:g/<\//d
:%norm df"f"df"i,<C-v><Esc>f"d$

      

Breakdown

For each line containing entry, search backwards for <group and copy that line below the entry

:g/entry/?<group?,?<group?t.

<group key="abc">
  <entry val="1"/>
<group key="abc">
  <entry val="2"/>
<group key="abc">
  <entry val="3"/>
<group key="abc">
</group>
<group key="xyz">
  <entry val="1"/>
<group key="xyz">
  <entry val="2"/>
<group key="xyz">
  <entry val="3"/>
<group key="xyz">
  <entry val="4"/>
<group key="xyz">
  <entry val="5"/>
<group key="xyz">
</group>

      

Join each line with the one below it

:%norm J

<group key="abc"> <entry val="1"/>
<group key="abc"> <entry val="2"/>
<group key="abc"> <entry val="3"/>
<group key="abc"> </group>
<group key="xyz"> <entry val="1"/>
<group key="xyz"> <entry val="2"/>
<group key="xyz"> <entry val="3"/>
<group key="xyz"> <entry val="4"/>
<group key="xyz"> <entry val="5"/>
<group key="xyz"> </group>

      

Remove closing tags

:g/<\//d

<group key="abc"> <entry val="1"/>
<group key="abc"> <entry val="2"/>
<group key="abc"> <entry val="3"/>
<group key="xyz"> <entry val="1"/>
<group key="xyz"> <entry val="2"/>
<group key="xyz"> <entry val="3"/>
<group key="xyz"> <entry val="4"/>
<group key="xyz"> <entry val="5"/>

      

Clean up the remaining text by jumping from quote to quote and deleting everything around the values. Note that <C-v><Esc> is the key sequence you type to insert a literal Escape into the command.

:%norm df"f"df"i,<C-v><Esc>f"d$

abc,1
abc,2
abc,3
xyz,1
xyz,2
xyz,3
xyz,4
xyz,5
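
If this last command is run from a script rather than typed, the literal Escape can be written with :execute and a double-quoted string instead of <C-v><Esc>. This is a sketch of that form; the use of normal! (which ignores user mappings) is an addition, not part of the original command.

```vim
" Inside double quotes, \" is a literal quote and \<Esc> is the Escape key.
execute "%normal! df\"f\"df\"i,\<Esc>f\"d$"
```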

      



Well, this is not a magic one-liner, but it may work:

ggqq/group<cr>f"lyi"<c-v>n0I<c-r>"<esc>ddnddq
100@q
:%s/\s*<entry val="/,/g
:%s/"\/>//g

      

Step by step:

gg           => Go to the top
qq           => Record a macro in register q
/group<cr>   => Search for "group"
f"l          => Go to the key
yi"          => Yank the key
<c-v>        => Start blockwise visual mode
n0           => Jump to the closing </group> line, then move to the start of the line
I<c-r>"<esc> => Insert the yanked key at the start of every line in the block
dd           => Delete the <group> line
ndd          => Jump to the closing </group> line and delete it
q            => Stop recording

100@q        => Play the macro 100 times; use whatever count you need

      



You should now have something like:

abc  <entry val="1"/>
abc  <entry val="2"/>
abc  <entry val="3"/>
xyz  <entry val="1"/>
xyz  <entry val="2"/>
xyz  <entry val="3"/>
xyz  <entry val="4"/>
xyz  <entry val="5"/>

      

Then just clean up what you don't need:

:%s/\s*<entry val="/,/g
:%s/"\/>//g

      
