.NET Regex Negative Lookahead for uppercase letters
I am trying to write a regular expression for a custom .NET application that extracts zip codes from addresses.
Each address is on a single line:
12345 Example Street, NY 10019 United States
I am using the following expression:
\d{3,5}-\d{3,5}|\d{5}(?![A-Z]{2})
but it appears to match both 12345 and the zip code 10019. Since I specified two uppercase letters in the negative lookahead, shouldn't it match only the zip code, which is preceded by the two-letter code NY? What am I doing wrong here?
I am using the alternation operator | because the zip codes appear in both the 12345-12345 and the 12345 formats.
Please check the regex I am testing here
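A negative lookahead checks the text that follows the digits, so (?![A-Z]{2}) only rejects five digits that are immediately followed by two uppercase letters; it says nothing about what precedes them. To require the two-letter code before the zip code, you can use a lookbehind instead: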
\d{3,5}-\d{3,5}|(?<=[A-Z]{2}\s+)\d{5}
See regex demo
The lookbehind (?<=[A-Z]{2}\s+) requires that the 5 digits be immediately preceded by 2 uppercase letters followed by 1 or more whitespace characters.
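As a rough illustration, the sketch below runs both patterns against the sample address from the question. The class name ZipCodeDemo and the helper FindAll are made up for this example; only Regex.Matches from System.Text.RegularExpressions is assumed.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ZipCodeDemo
{
    // Hypothetical helper: returns the text of every match of pattern in input.
    public static string[] FindAll(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        const string address = "12345 Example Street, NY 10019 United States";

        // Original pattern: the lookahead only inspects what FOLLOWS the digits.
        // Neither " Example" nor " United" starts with two uppercase letters,
        // so both five-digit runs are accepted.
        const string lookahead = @"\d{3,5}-\d{3,5}|\d{5}(?![A-Z]{2})";

        // Lookbehind pattern: the five digits must be PRECEDED by two uppercase
        // letters and whitespace. .NET allows the variable-length \s+ inside a lookbehind.
        const string lookbehind = @"\d{3,5}-\d{3,5}|(?<=[A-Z]{2}\s+)\d{5}";

        Console.WriteLine(string.Join(", ", FindAll(lookahead, address)));  // 12345, 10019
        Console.WriteLine(string.Join(", ", FindAll(lookbehind, address))); // 10019
    }
}
```

Note that many other regex engines reject a quantifier such as \s+ inside a lookbehind, so this pattern is not portable as written.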
To make sure you match exactly the specified number of digits, and not part of a longer run of digits, you can add word boundaries \b:
\b(?:\d{3,5}-\d{3,5}|(?<=[A-Z]{2}\s+)\d{5})\b
See another demo.
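To see what the word boundaries change, here is a small sketch with a made-up address whose number after the state code is nine digits long. The inputs, the class name ZipBoundaryDemo, and the helper FindAll are invented for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ZipBoundaryDemo
{
    // Hypothetical helper: returns the text of every match of pattern in input.
    public static string[] FindAll(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        const string unbounded = @"\d{3,5}-\d{3,5}|(?<=[A-Z]{2}\s+)\d{5}";
        const string bounded = @"\b(?:\d{3,5}-\d{3,5}|(?<=[A-Z]{2}\s+)\d{5})\b";

        // Invented input: a nine-digit number after the state code.
        const string tooLong = "1 Main Street, NY 100191234 United States";

        // Without \b, the first five digits of the longer number are taken.
        Console.WriteLine(FindAll(unbounded, tooLong).Length); // 1 (matches "10019")

        // With \b, there is no boundary between "10019" and "1234", so nothing matches.
        Console.WriteLine(FindAll(bounded, tooLong).Length);   // 0

        // Invented input: the hyphenated form still matches through the first alternative.
        const string hyphenated = "1 Main Street, NY 12345-6789 United States";
        Console.WriteLine(string.Join(", ", FindAll(bounded, hyphenated))); // 12345-6789
    }
}
```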