Get the first n characters of a regular expression match

I want to get the first n characters of the match from this regex:

(\d+\s*)

      

Basically, I want to trim the trailing spaces on the right. So, given the lines:

12345␒␒␒␒␒␒␒␒123␒␒␒␒␒␒␒
123␒␒␒␒␒␒␒␒␒12345␒␒␒␒␒␒

      

I want to end up with:

12345␒␒␒␒␒123␒␒␒␒␒␒␒
123␒␒␒␒␒␒␒12345␒␒␒␒␒

      

There are always two matches per line, and the lines are of constant length.



2 answers


Multiple passes

Based on more information about the problem and its structure, I would suggest the following steps:

  • Split each line in two, right before the second match.
  • Take the part you want from each line.
  • Join the lines again so that both matches are back on their original line.

This means something like this:

  • Replace ^(\d*\s*)(\d*\s*)$ with $1\r\n$2. Leave out the \r if you are not on Windows. You should probably add a marker to the end of the first line. It should be something that does not occur in the rest of the document (for example #). $1 stands for the first captured group (the text matched inside the first pair of parentheses). So replace with $1#\r\n$2.
  • Now take the desired length of each line: match ^(.{n}).*?(#?)$ and replace it with $1$2. This keeps the first n characters and re-inserts the marker if one is found. The .*? has to be lazy and the pattern anchored with $; a greedy .* would swallow the # and leave the second group empty.
  • Remove the markers together with the line breaks that follow them, #\r\n, to join the halves again. Delete them, that is, replace them with nothing.
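As a rough sketch, the three passes could look like this with Python's re module. It assumes that the ␒ in the sample stands for a plain space, that n is 10 (the question does not give a value), and that # never occurs in the data; trim_fields_multipass is only an illustrative name. Python has no $1 in replacements, so the sketch builds them from match groups. It also uses \d+ as in the question's regex so that empty lines are left alone, and a lazy .*? anchored with $ in the second pass so that the # is actually captured.

```python
import re

def trim_fields_multipass(text: str, n: int = 10, marker: str = "#") -> str:
    """Trim both fields of every line to n characters in three passes."""
    # Pass 1: break each line before the second match and tag the first half.
    # [ ] is used instead of \s so that a match can never run across a newline.
    text = re.sub(
        r"(?m)^(\d+[ ]*)(\d+[ ]*)$",
        lambda m: m.group(1) + marker + "\n" + m.group(2),
        text,
    )
    # Pass 2: keep the first n characters of each line, plus the marker if present.
    keep = re.compile(r"(?m)^(.{%d}).*?((?:%s)?)$" % (n, re.escape(marker)))
    text = keep.sub(lambda m: m.group(1) + m.group(2), text)
    # Pass 3: drop every "marker + newline" pair to join the halves again.
    return text.replace(marker + "\n", "")

if __name__ == "__main__":
    sample = "12345" + " " * 8 + "123" + " " * 7
    print(repr(trim_fields_multipass(sample)))
```

In an editor the same passes are three separate search-and-replace runs; the function only chains them.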

Notes



  • You first need to filter for the lines that match (^\d*\s*).
  • If you want a different marker, replace the occurrences of # in the answer above. The marker must not appear in the rest of the file, at least not at the end of a line.
  • This answer uses backreferences, which should be no problem.

Single pass

One pass may be possible here.

^(\d[\d\s]{n-1})[^\d]*(\d[\d\s]{n-1}).*$

      

This matches these lines, and capturing groups one and two hold the desired result. Just replace the match with $1$2. Note that n-1 stands for the literal number, for example 9 when n is 10.
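A minimal sketch of the single-pass idea in Python, applied line by line. It assumes the fields are padded with plain spaces and that each field is at least n characters wide; the function name and n = 10 are only illustrative. Python writes the backreferences as \1\2 instead of $1$2.

```python
import re

def trim_fields_single_pass(text: str, n: int) -> str:
    """Keep the first n characters of both fields on every line."""
    # n - 1 is filled in as a literal number, e.g. {9} for n = 10.
    pattern = re.compile(r"^(\d[\d ]{%d})[^\d]*(\d[\d ]{%d}).*$" % (n - 1, n - 1))
    # Work line by line so that [^\d]* can never run into the next line.
    return "\n".join(pattern.sub(r"\1\2", line) for line in text.split("\n"))

if __name__ == "__main__":
    sample = "12345" + " " * 8 + "123" + " " * 7
    print(repr(trim_fields_single_pass(sample, 10)))
```

Lines that do not match the pattern are passed through unchanged.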



Replace:

(\d[\d\s]{n-1})\s*

      

with



$1

      

This replaces a digit, followed by n-1 digits or whitespace characters, followed by any amount of whitespace, with the first n characters of what was matched (so you should get 2 matches per line).
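To illustrate, here is a small Python sketch of this replacement. It assumes a single line is passed in (in Python, \s also matches newlines, so running it over a whole file at once could join lines) and that the padding is ordinary whitespace; trim_each_match is a made-up name. re.subn is used so the number of replacements can be checked against the expected two matches per line.

```python
import re

def trim_each_match(line: str, n: int) -> tuple[str, int]:
    """Replace every match by its first n characters; return (result, match count)."""
    # A digit, then n - 1 digits or whitespace characters, then any extra whitespace.
    pattern = re.compile(r"(\d[\d\s]{%d})\s*" % (n - 1))
    # \1 is Python's spelling of $1.
    return pattern.subn(r"\1", line)

if __name__ == "__main__":
    result, count = trim_each_match("123" + " " * 9 + "12345" + " " * 6, 10)
    print(repr(result), count)
```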
