Strip non-ASCII characters, but allow currency symbols
I am using the regex below to remove all non-ASCII characters from a string.
String pattern = @"[^\u0000-\u007F]";
Regex rx = new Regex(pattern, RegexOptions.Compiled);
data = rx.Replace(data, " ");
However, I want to allow currency symbols (such as the pound sign) and trademark symbols.
I changed the regex as shown below and it works for me. Can anyone confirm that the regex is valid?
String pattern = @"[^\u0000-\u007F \p{Sc}]";
Basically, I want to allow all currency symbols as well.
Yes, your regex is correct.
Your code replaces every character that matches your regex with a space.
So which characters match your regex?
Everything except:
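- The range you specified: \u0000-\u007F
- Currency symbols: \p{Sc}. See http://regular-expressions.info/unicode.html#prop
If you want to keep other symbols as well, you can add them to the character class in the same way you added \p{Sc}. Note that the trademark sign (U+2122) is in the category So (other symbols), not Sc, so \p{Sc} alone does not keep it; add \u2122 explicitly if you need it.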
Edit:
Be careful when doing this in the future. Strictly, the regex should be [^\u0000-\u007F\p{Sc}] (no space), although it doesn't matter in this case since the space is already in the ASCII range.
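To make the behaviour concrete, here is a minimal sketch of the no-space pattern in use. The class name AsciiFilter and the method name Clean are hypothetical, chosen only for illustration, and the extra \u2122 in the character class is an assumption about which additional symbol should be kept (the trademark sign, which \p{Sc} does not cover).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: the names AsciiFilter and Clean are illustrative only.
public static class AsciiFilter
{
    // Matches any character that is NOT ASCII, NOT a currency symbol (\p{Sc}),
    // and NOT the trademark sign (U+2122, added here as an assumed extra).
    private static readonly Regex Disallowed =
        new Regex(@"[^\u0000-\u007F\p{Sc}\u2122]", RegexOptions.Compiled);

    public static string Clean(string data)
    {
        // Strings are immutable, so Replace returns a new string
        // that the caller has to keep.
        return Disallowed.Replace(data, " ");
    }
}
```

For example, AsciiFilter.Clean("caf\u00E9 \u00A35") replaces the é with a space and leaves the pound sign untouched. Keeping the pattern in one static readonly Regex means it is compiled only once.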