Regular expression excludes non-word characters but leaves spaces

Question

I am trying to write a regex to stop invalid characters from being entered in a zip code field.

From this link I was able to exclude all the "non-word" characters.

Regex regex = new Regex(@"[\W_]+");
string cleanText = regex.Replace(messyText, "").ToUpper();

But this also removes the space characters.

I'm sure this is possible, but I find the regex very confusing!

Can anyone help with an explanation of the regex pattern used?

+3

c# regex regex-negation

user1 June 13, 2017 at 12:03

3 answers

Assuming valid postcodes contain only alphanumeric characters, you can replace anything other than alphanumeric characters and spaces with an empty string:

Regex regex = new Regex(@"[^a-zA-Z0-9\s]");
string cleanText = regex.Replace(messyText, "").ToUpper();

Note that \s also matches tabs, newlines, and other whitespace characters, which you may not want to treat as valid. In that case, just list the space character literally:

[^a-zA-Z0-9 ]
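The difference between the two classes can be seen in a small sketch. The class name, the helper names, and the sample input are made up for illustration; only the two patterns come from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceChoiceDemo
{
    // Keeps letters, digits, and ANY whitespace (spaces, tabs, newlines, ...).
    public static string KeepAnyWhitespace(string messyText)
    {
        return Regex.Replace(messyText, @"[^a-zA-Z0-9\s]", "").ToUpper();
    }

    // Keeps letters, digits, and the literal space character only.
    public static string KeepSpacesOnly(string messyText)
    {
        return Regex.Replace(messyText, @"[^a-zA-Z0-9 ]", "").ToUpper();
    }

    public static void Main()
    {
        // Hypothetical input: a space, punctuation, and a trailing tab.
        string messyText = "sw1a 2aa!\t";

        // The tab survives the \s version, so that result is one character longer.
        Console.WriteLine(KeepAnyWhitespace(messyText).Length); // 9
        Console.WriteLine(KeepSpacesOnly(messyText).Length);    // 8
    }
}
```

Which variant is appropriate depends on whether tabs and line breaks can actually reach the field, for example through pasted text.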

+3

Dmitry Egorov June 13, 2017 at 12:07

You can invert your character class to make it a negative character class like this:

[^\sa-zA-Z0-9]+

This will match any character other than a whitespace or alphanumeric character.
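As a rough illustration of what this negated class matches, the sketch below lists the matches in a made-up input string and then removes them. The class name, the `Clean` helper, and the sample text are assumptions for the example, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // Pattern from the answer: one or more characters that are
    // neither whitespace nor ASCII letters or digits.
    private static readonly Regex Invalid = new Regex(@"[^\sa-zA-Z0-9]+");

    public static string Clean(string messyText)
    {
        return Invalid.Replace(messyText, "").ToUpper();
    }

    public static void Main()
    {
        // Hypothetical sample input.
        string messyText = "ab-12 #!cd";

        // Because of the + quantifier, "#!" is reported as a single match.
        foreach (Match m in Invalid.Matches(messyText))
        {
            Console.WriteLine($"'{m.Value}' at index {m.Index}");
        }
        // prints: '-' at index 2
        //         '#!' at index 6

        Console.WriteLine(Clean(messyText)); // AB12 CD
    }
}
```

When the replacement is an empty string, the cleaned text is the same with or without the `+`; the quantifier only means a run of invalid characters is handled as one match instead of several.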

RegEx Demo (since it is not a .NET regex)

0

anubhava June 13, 2017 at 12:07

Wiktor Stribiżew · Accepted Answer · 2017-06-13T12:08:04+0000

You can use character class subtraction:

[\W_-[\s]]+

It matches one or more characters that are either non-word characters (\W) or underscores, except for whitespace, which the -[\s] part subtracts from the class.
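A minimal sketch comparing the question's pattern with the subtraction version is shown below. The class name, method names, and sample input are hypothetical; the two patterns are the ones from the question and from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ClassSubtractionDemo
{
    // The question's pattern: removes non-word characters and underscores,
    // which also removes whitespace because \W matches it.
    public static string CleanOriginal(string messyText)
    {
        return Regex.Replace(messyText, @"[\W_]+", "").ToUpper();
    }

    // Same base class with whitespace subtracted via -[\s],
    // so spaces are no longer matched and therefore stay in the text.
    public static string CleanKeepingWhitespace(string messyText)
    {
        return Regex.Replace(messyText, @"[\W_-[\s]]+", "").ToUpper();
    }

    public static void Main()
    {
        // Hypothetical sample input.
        string messyText = "ab_1-2 c!";

        Console.WriteLine(CleanOriginal(messyText));          // AB12C
        Console.WriteLine(CleanKeepingWhitespace(messyText)); // AB12 C
    }
}
```

The `[base-[excluded]]` subtraction syntax is not supported by every regex engine, so the pattern may be rejected or behave differently in testers that do not run on .NET. Also note that `\W` is Unicode-aware in .NET by default, so accented letters count as word characters and are kept, unlike with the ASCII-only `a-zA-Z0-9` ranges.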
