Use the same line to test for regular expressions

I am new to regex, but I have learned a thing or two. I ran into a problem that might not be solvable with a regular expression, so I need some advice.

I have the following line:

some text key 12, 32, 311 ,465 and 345. some other text dog 612, 
12, 32, 9 and 10. some text key 1, 2.

      

I'm trying to figure out if it's possible (using a regex only) to extract the numbers 12, 32, 311, 465, 345, 1 and 2 as a collection of individual matches.

Approaching this problem, I tried to find a pattern that matches only the relevant results. So I came up with:

  • get numbers prefixed with "key" and NOT have the prefix "dog".

But I'm not sure if this is possible. I mean, I know that for the first number I can use (?<=key )+[\d]+ and get it as a result, but for the other numbers (i.e. numbers 2..5), can I use the key prefix again?

+3




4 answers


You can do this in 2 steps.

(?<=key\\s)\\d+(?:\\s*(?:,|and)\\s*\\d+)*

      

Capture all numbers. See demo.



https://regex101.com/r/uK9cD8/6

Then split the match, or extract \\d+ out of it. See demo.

https://regex101.com/r/uK9cD8/7
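The two steps above can be sketched in Java roughly as follows. This is only an illustration: the class name TwoStepExtract and the method name extract are made up for this example and are not part of the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoStepExtract {
    // Step 1: a whole run of numbers that directly follows "key" and one whitespace character.
    private static final Pattern RUN =
            Pattern.compile("(?<=key\\s)\\d+(?:\\s*(?:,|and)\\s*\\d+)*");
    // Step 2: each individual number inside a run.
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Hypothetical helper: returns every number found in a run that follows "key".
    static List<String> extract(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher run = RUN.matcher(input);
        while (run.find()) {
            Matcher number = NUMBER.matcher(run.group());
            while (number.find()) {
                numbers.add(number.group());
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        String input = "some text key 12, 32, 311 ,465 and 345. some other text dog 612,\n"
                + "12, 32, 9 and 10. some text key 1, 2.";
        System.out.println(extract(input)); // prints [12, 32, 311, 465, 345, 1, 2]
    }
}
```

The numbers after dog are skipped because the first pattern only starts a run immediately after "key" and a whitespace character.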

+1




In Java, you can use a constrained-width look-behind, which accepts a {n,m} limiting quantifier.

So you can use

(?<=key(?:(?!dog)[^.]){0,100})[0-9]+

      

Or, if key and dog are whole words, use a \b word boundary:



String pattern = "(?<=\\bkey\\b(?:(?!\\bdog\\b)[^.]){0,100})[0-9]+";

      

The only problem arises if the distance between dog or key and the numbers is greater than m. You can increase the limit to 1000, and I think this will work in most cases.
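To make the effect of the upper bound concrete, here is a small sketch that builds the same pattern with a configurable limit and runs it on a number that sits 150 characters after key. The class LookbehindLimit and the helpers withLimit and finds are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

class LookbehindLimit {
    // Builds the look-behind pattern from the answer with a configurable upper bound m.
    static Pattern withLimit(int m) {
        return Pattern.compile("(?<=key(?:(?!dog)[^.]){0," + m + "})[0-9]+");
    }

    // True if the pattern finds at least one number in the text.
    static boolean finds(Pattern pattern, String text) {
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        // "key", then 150 spaces, then a single number.
        StringBuilder sb = new StringBuilder("key");
        for (int i = 0; i < 150; i++) {
            sb.append(' ');
        }
        sb.append('7');
        String farAway = sb.toString();

        System.out.println(finds(withLimit(100), farAway));  // prints false
        System.out.println(finds(withLimit(1000), farAway)); // prints true
    }
}
```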

Example IDEONE demo

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Two inputs: the question's sample, and one where "dog" interrupts a "key" run.
String str = "some text key 12, 32, 311 ,465 and 345. some other text dog 612,\n12, 32, 9 and 10. some text key 1, 2.";
String str2 = "some text key 1, 2, 3 ,4 and 5. some other text dog 6, 7, 8, 9 and 10. some text, key 1, 2 dog 3, 4 key 5, 6";
// Look-behind: "key", then at most 100 characters that are not "." and do not start "dog".
Pattern ptrn = Pattern.compile("(?<=key(?:(?!dog)[^.]){0,100})[0-9]+");
Matcher m = ptrn.matcher(str);
while (m.find()) {
   System.out.println(m.group(0));
}
System.out.println("-----");
m = ptrn.matcher(str2);
while (m.find()) {
   System.out.println(m.group(0));
}

      

+3




I would not recommend using code you cannot understand and customize, but here is my one-pass solution using the method described in this answer of mine. If you want to understand the construction method, read the other answer.

(?:key(?>\s+and\s+|[\s,]+)|(?!^)\G(?>\s+and\s+|[\s,]+))(\d+)

      

Compared to the method described in the other post, I dropped the look-ahead, as we don't need to check the suffix in this case.

The separator here is (?>\s+and\s+|[\s,]+). It currently allows "and" with whitespace on both sides, or any combination of whitespace and commas. I use (?>pattern) to inhibit backtracking, so the order of the alternation is significant. Change it to (?:pattern) if you want to modify it and you don't know what you are doing.
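As an illustration of why the order of the alternation matters inside (?>pattern), the sketch below runs the same overall regex with three separator variants on a short input. The class AtomicOrder and the method extract are hypothetical names for this example, and the swapped-order variants are not from the answer; they are included only for comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AtomicOrder {
    // Plugs a separator sub-pattern into the one-pass regex and collects group 1.
    static List<String> extract(String separator, String input) {
        Pattern p = Pattern.compile(
                "(?:key" + separator + "|(?!^)\\G" + separator + ")(\\d+)");
        Matcher m = p.matcher(input);
        List<String> numbers = new ArrayList<>();
        while (m.find()) {
            numbers.add(m.group(1));
        }
        return numbers;
    }

    public static void main(String[] args) {
        String input = "key 465 and 345";

        // Atomic group, "and" alternative first (as in the answer).
        System.out.println(extract("(?>\\s+and\\s+|[\\s,]+)", input)); // prints [465, 345]

        // Atomic group, order swapped: [\s,]+ consumes the space before "and",
        // and the engine may not go back to try the second alternative.
        System.out.println(extract("(?>[\\s,]+|\\s+and\\s+)", input)); // prints [465]

        // Non-atomic group, order swapped: backtracking reaches the "and" alternative.
        System.out.println(extract("(?:[\\s,]+|\\s+and\\s+)", input)); // prints [465, 345]
    }
}
```

With the non-capturing group the order is forgiving, at the cost of allowing backtracking into the separator.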

Sample code:

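import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "some text key 12, 32, 311 ,465 and 345. some other text dog 612,\n12, 32, 9 and 10. some text key 1, 2. key 1, 2 dog 3, 4 key 5, 6. key is dog 23, 45. key 4";
// Either start right after "key", or continue from the end of the previous match (\G).
Pattern p = Pattern.compile("(?:key(?>\\s+and\\s+|[\\s,]+)|(?!^)\\G(?>\\s+and\\s+|[\\s,]+))(\\d+)");
Matcher m = p.matcher(input);
List<String> numbers = new ArrayList<>();

while (m.find()) {
    numbers.add(m.group(1)); // group 1 holds the number itself
}

System.out.println(numbers);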

      

Demo on ideone

+2




You can use a positive look-behind, which ensures that your sequence is not preceded by any word other than key:

(?<=key)\s(?:\d+[\s,]+)+(?:and )?\d+

      

Note that you don't need a negative look-ahead for dog here, because this regex only matches if your sequence is preceded by key.

See demo https://regex101.com/r/gZ4hS4/3
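Note that this regex matches each run of numbers as one string, so getting individual numbers takes a second \d+ pass over every match. A minimal Java sketch, with the class name KeyRuns and the methods runs and numbers invented for this example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyRuns {
    private static final Pattern RUN =
            Pattern.compile("(?<=key)\\s(?:\\d+[\\s,]+)+(?:and )?\\d+");
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Whole runs matched by the regex, each including its leading whitespace character.
    static List<String> runs(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    // Individual numbers pulled out of every run.
    static List<String> numbers(String input) {
        List<String> result = new ArrayList<>();
        for (String run : runs(input)) {
            Matcher m = NUMBER.matcher(run);
            while (m.find()) {
                result.add(m.group());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "some text key 12, 32, 311 ,465 and 345. some other text dog 612,\n"
                + "12, 32, 9 and 10. some text key 1, 2.";
        System.out.println(runs(input).size()); // prints 2
        System.out.println(numbers(input));     // prints [12, 32, 311, 465, 345, 1, 2]
    }
}
```

As written, the pattern requires at least two numbers after key, so an input such as "key 4" produces no match.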

+1

