.NET Regex Negative Lookahead for uppercase letters

I am trying to write a regular expression for a custom .NET application that extracts zip codes from addresses.

Each address is on a single line:

12345 Example Street, NY 10019 United States

      

I am using the following expression:

\d{3,5}-\d{3,5}|\d{5}(?![A-Z]{2})

      

but it matches both 12345 and the zip code 10019. Since I specified two uppercase letters in the negative lookahead, shouldn't it match only the zip code that is preceded by the two-letter code NY? What am I doing wrong here?

I am using the | operator because the zip codes appear both in the 12345-12345 format and in the 12345 format.

Please check the regex I am testing here



1 answer


You can use a lookbehind here:

\d{3,5}-\d{3,5}|(?<=[A-Z]{2}\s+)\d{5}

      

See regex demo

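The negative lookahead in your pattern, (?![A-Z]{2}), checks the text after the 5 digits, not before them, so it does not exclude 12345 (which is followed by a space). The lookbehind (?<=[A-Z]{2}\s+) requires 2 uppercase letters followed by 1 or more whitespace characters immediately before the 5 digits.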
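
To see the difference on the sample address, here is a minimal C# sketch that runs the question's lookahead pattern and the lookbehind pattern side by side. The class name ZipLookaroundDemo and the helper FindAll are hypothetical names used only for this illustration; the patterns and the address are taken from the post.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ZipLookaroundDemo
{
    // Hypothetical helper (not from the post): returns the text of every match.
    public static string[] FindAll(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        const string address = "12345 Example Street, NY 10019 United States";

        // Question's pattern: the negative lookahead inspects what FOLLOWS the digits.
        // "12345" is followed by " E", which is not two uppercase letters, so it matches.
        string[] withLookahead = FindAll(@"\d{3,5}-\d{3,5}|\d{5}(?![A-Z]{2})", address);

        // Answer's pattern: the lookbehind inspects what PRECEDES the digits.
        // Only "10019" is preceded by two uppercase letters plus whitespace ("NY ").
        string[] withLookbehind = FindAll(@"\d{3,5}-\d{3,5}|(?<=[A-Z]{2}\s+)\d{5}", address);

        Console.WriteLine(string.Join(", ", withLookahead));  // 12345, 10019
        Console.WriteLine(string.Join(", ", withLookbehind)); // 10019
    }
}
```

Note that the \s+ inside the lookbehind makes it variable-length. The .NET regex engine accepts this, but several other engines only allow fixed-width lookbehinds, so the pattern may need adjusting outside .NET.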



To make sure you match exactly the specified number of digits, you can use word boundaries (\b):

\b(?:\d{3,5}-\d{3,5}|(?<=[A-Z]{2}\s+)\d{5})\b

      

See another demo.
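
As a rough illustration of what the word boundaries change, the sketch below compares the pattern with and without \b. The class name ZipBoundaryDemo, the helper FindAll, and the second and third sample addresses are assumptions made up for this example; only the first address and the two patterns come from the post.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ZipBoundaryDemo
{
    // Both patterns are the ones given in the answer.
    public const string Unbounded = @"\d{3,5}-\d{3,5}|(?<=[A-Z]{2}\s+)\d{5}";
    public const string Bounded = @"\b(?:\d{3,5}-\d{3,5}|(?<=[A-Z]{2}\s+)\d{5})\b";

    // Hypothetical helper (not from the post): returns the text of every match.
    public static string[] FindAll(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        string[] samples =
        {
            "12345 Example Street, NY 10019 United States", // from the question
            "1 Main Road, NY 100199 United States",         // made-up: 6 digits after the state code
            "500 Elm Street, CA 12345-6789"                 // made-up: hyphenated form
        };

        foreach (string s in samples)
        {
            string unbounded = string.Join(", ", FindAll(Unbounded, s));
            string bounded = string.Join(", ", FindAll(Bounded, s));
            Console.WriteLine($"{s} => without \\b: [{unbounded}], with \\b: [{bounded}]");
        }
        // For the 6-digit sample, the unbounded pattern still picks up "10019"
        // (the first five digits), while the bounded pattern finds nothing.
        // The other two samples give the same result with either pattern.
    }
}
```

The non-capturing group (?:...) matters here: without it, the leading \b would apply only to the first alternative and the trailing \b only to the second.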
