Regex.Split White Space
string pattern = @"(if)|(\()|(\))|(\,)";
string str = "IF(SUM(IRS5555.IRs001)==IRS5555.IRS001,10,20)";
string[] substrings = Regex.Split(str, pattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase);
foreach (string match in substrings)
{
    Console.WriteLine("Token is:{0}", match);
}
The output was:
Token is:
Token is:IF
Token is:
Token is:(
Token is:SUM
Token is:(
Token is:IRS5555.IRs001
Token is:)
Token is:==IRS5555.IRS001
Token is:,
Token is:10
Token is:,
Token is:20
Token is:)
Token is:
As you can see, tokens 1 and 3 and the last token are empty strings. I cannot figure out why, since there is no empty string in my input.
I do not want these empty tokens in the result.
+3
2 answers
Try this, which filters the empty entries out of the result (it needs using System.Linq):
string pattern = @"(if)|(\()|(\))|(\,)";
string str = "IF(SUM(IRS5555.IRs001)==IRS5555.IRS001,10,20)";
var substrings = Regex.Split(str, pattern, RegexOptions.IgnoreCase).Where(n => !string.IsNullOrEmpty(n));
foreach (string match in substrings)
{
    Console.WriteLine("Token is:{0}", match);
}
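The filtering step above could be wrapped in a small reusable helper. This is only a sketch: the class name SplitHelper and the method name SplitNonEmpty are made up for illustration and are not part of the answer.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitHelper
{
    // Hypothetical helper: splits on the pattern (captured delimiters are
    // kept in the output by Regex.Split) and drops the empty pieces.
    public static List<string> SplitNonEmpty(string input, string pattern)
    {
        return Regex.Split(input, pattern, RegexOptions.IgnoreCase)
                    .Where(piece => !string.IsNullOrEmpty(piece))
                    .ToList();
    }
}
```

Calling SplitHelper.SplitNonEmpty(str, pattern) with the string and pattern from the question should give the same tokens as before, minus the three empty ones.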
+4
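This is because "IF" and "(" are delimiters. There is nothing to the left of "IF" and nothing between "IF" and "(", so you get those two empty entries. Likewise, the final ")" is a delimiter with nothing after it, which produces the empty entry at the end. Removing "IF" from the pattern gets rid of the first two: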
string pattern = @"(\()|(\))|(\,)";
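To see the delimiter behaviour described above on a smaller input, here is a minimal sketch. The class SplitDemo and its RawSplit method are hypothetical names used only for this illustration; the method simply returns what Regex.Split produces, without any filtering.

```csharp
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: returns the unfiltered result of Regex.Split.
    // Text captured by groups in the pattern is included in the output,
    // and an empty string appears wherever two delimiters are adjacent
    // or a delimiter sits at the start or end of the input.
    public static string[] RawSplit(string input, string pattern)
    {
        return Regex.Split(input, pattern, RegexOptions.IgnoreCase);
    }
}
```

For the input "IF(A)", the pattern @"(if)|(\()|(\))" yields seven pieces (empty, "IF", empty, "(", "A", ")", empty), while @"(\()|(\))" yields five ("IF", "(", "A", ")", empty). The trailing empty piece remains in both cases because the closing parenthesis is still a delimiter at the end of the input.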
UPDATE
You can search for tokens instead of splitting the string:
var matches = Regex.Matches(str, @"\w+|[().,]|==");
This returns the tokens of your text as a collection of matches, which you can convert to an array:
string[] array = matches.Cast<Match>().Select(m => m.Value).ToArray();
[0]: "IF" [1]: "(" [2]: "SUM" [3]: "(" [4]: "IRS5555" [5]: "." [6]: "IRs001" [7]: ")" [8]: "==" [9]: "IRS5555" [10]: "." [11]: "IRS001" [12]: "," [13]: "10" [14]: "," [15]: "20" [16]: ")"
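Put together, the matching approach might look like the sketch below. The class name FormulaTokenizer and the method name Tokenize are assumptions made for the example; the pattern is the one from the answer.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class FormulaTokenizer
{
    // Pattern from the answer: a run of word characters, one of the
    // punctuation characters ( ) . , or the two-character operator ==.
    private static readonly Regex TokenPattern = new Regex(@"\w+|[().,]|==");

    // Hypothetical wrapper: returns every match of the pattern, in order.
    public static List<string> Tokenize(string input)
    {
        return TokenPattern.Matches(input)
                           .Cast<Match>()
                           .Select(m => m.Value)
                           .ToList();
    }
}
```

One thing to keep in mind with this design: any character the pattern does not cover is skipped silently. For example, Tokenize("a + b") returns only "a" and "b", so other operators would have to be added to the pattern before they show up as tokens.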
UPDATE
Another pattern you can try with Regex.Split is @"\b". It splits the text at word boundaries:
[0]: "" [1]: "IF" [2]: "(" [3]: "SUM" [4]: "(" [5]: "IRS5555" [6]: "." [7]: "IRs001" [8]: ")==" [9]: "IRS5555" [10]: "." [11]: "IRS001" [12]: "," [13]: "10" [14]: "," [15]: "20" [16]: ")"
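The word-boundary split can be combined with the empty-entry filter from the other answer. The following is a sketch under that assumption; BoundarySplitter and SplitAtWordBoundaries are illustrative names, not an existing API.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class BoundarySplitter
{
    // Hypothetical helper: splits at every word boundary (\b) and drops
    // the empty pieces that appear when the input starts or ends with a
    // word character.
    public static List<string> SplitAtWordBoundaries(string input)
    {
        return Regex.Split(input, @"\b")
                    .Where(piece => piece.Length > 0)
                    .ToList();
    }
}
```

Because \b only marks transitions between word and non-word characters, consecutive non-word characters stay together in one piece. In the question's string, ")==" comes back as a single entry, and for an input such as "a, b" the middle piece is ", " with the space included, so further splitting is needed if each symbol should be its own token.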
+2