C# regex: extracting a string enclosed in single quotes
I have the following line that I need to parse with RegEx.
abc = 'def' and size = '1 x(3\" x 5\")' and (name='Sam O\'neal')
This is a SQL filter that I would like to split into tokens using the following delimiters:
(, ), >, <, =, whitespace, <=, >=, !=
After the string has been parsed, I would like the output to be:
abc,
=,
def,
and,
size,
=,
'1 x(3\" x 5\")',
and,
(,
name,
=,
Sam O\'neal,
),
I've tried the following code:
string pattern = @"(<=|>=|!=|=|>|<|\)|\(|\s+)";
var tokens = new List<string>(Regex.Split(filter, pattern));
tokens.RemoveAll(x => String.IsNullOrWhiteSpace(x));
I'm not sure how to keep a single-quoted string together as one token. I am new to regex and would appreciate any help.
1 answer
Your pattern needs one more alternation branch, placed before the others: '[^'\\]*(?:\\.[^'\\]*)*'
It will match:
- ' - the opening single quote
- [^'\\]* - zero or more characters other than ' and \
- (?: - start of a non-capturing group:
  - \\. - any escape sequence (a backslash followed by any character)
  - [^'\\]* - zero or more characters other than ' and \
- )* - zero or more repetitions of the group
- ' - the closing single quote
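To see the quoted-string branch in isolation, here is a minimal sketch that runs it against a few inputs. The sample strings and the variable names (quoted, samples) are made up for illustration and are not from the question; the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// The quoted-string branch on its own, written as a verbatim C# string.
var quoted = new Regex(@"'[^'\\]*(?:\\.[^'\\]*)*'");

// Hypothetical sample inputs, not taken from the question.
string[] samples =
{
    @"name='Sam O\'neal'",  // escaped quote inside the literal
    @"x = 'a\\' and y",     // escaped backslash right before the closing quote
    @"no quotes here",      // no quoted literal at all
};

foreach (var s in samples)
{
    Match m = quoted.Match(s);
    Console.WriteLine(m.Success ? m.Value : "(no match)");
}
```

The first two inputs should yield 'Sam O\'neal' and 'a\\' respectively, because the \\. part consumes each backslash together with the character after it, so an escaped quote can never be mistaken for the closing one.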
In C#:
string pattern = @"('[^'\\]*(?:\\.[^'\\]*)*'|<=|>=|!=|=|>|<|\)|\(|\s+)";
C# demo:
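using System;
using System.Linq;
using System.Text.RegularExpressions;

var filter = @"abc = 'def' and size = '1 x(3"" x 5"")' and (name='Sam O\'neal')";
var pattern = @"('[^'\\]*(?:\\.[^'\\]*)*'|<=|>=|!=|=|>|<|\)|\(|\s+)";
// Regex.Split keeps the delimiters because the whole pattern is one capturing group.
var tokens = Regex.Split(filter, pattern).Where(x => !string.IsNullOrWhiteSpace(x));
foreach (var tok in tokens)
    Console.WriteLine(tok);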
Output:
abc
=
'def'
and
size
=
'1 x(3" x 5")'
and
(
name
=
'Sam O\'neal'
)
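The question's expected output lists def without its surrounding quotes. If that is what you want, one possible post-processing step is sketched below. StripQuotes is a hypothetical helper introduced here, not part of the answer above, and the shortened filter string is only for illustration. It removes the outer quotes and leaves escape sequences such as \' untouched.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Shortened, illustrative filter; same pattern as in the answer.
var filter = @"abc = 'def' and (name='Sam O\'neal')";
var pattern = @"('[^'\\]*(?:\\.[^'\\]*)*'|<=|>=|!=|=|>|<|\)|\(|\s+)";

// Hypothetical helper: drop the surrounding single quotes from a quoted
// token; any other token is returned unchanged.
static string StripQuotes(string tok) =>
    tok.Length >= 2 && tok[0] == '\'' && tok[tok.Length - 1] == '\''
        ? tok.Substring(1, tok.Length - 2)
        : tok;

var tokens = Regex.Split(filter, pattern)
    .Where(t => !string.IsNullOrWhiteSpace(t))
    .Select(StripQuotes)
    .ToList();

foreach (var tok in tokens)
    Console.WriteLine(tok);
```

Note that once the quotes are gone, a string literal such as 'and' can no longer be told apart from the keyword and, so it may be safer to keep the quotes until the tokens have been classified.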