Find a word before a specific phrase in a line
I'm trying to find the word immediately to the left of a specific phrase, "is better",
in all of these cases except input 3, where no word precedes the phrase:
string input = "I think that green bike is better than the red bike"; // input 1
string input = "I think that green bike is better"; // input 2
string input = "is better than the red one"; // input 3
string input = "bike is better"; // input 4
I have tried three approaches, but none of them gives me the desired result: only the word to the left of the phrase "is better" (here, "bike"), without the search phrase itself, for every input except input 3:
1)
var matches = Regex.Matches(input, @"(?:\S+\s)?\S*is better\S*(?:\s\S+)?", RegexOptions.IgnoreCase);
var list = matches.Cast<Match>().Select(match => match.Value).ToList();
foreach (string x in list)
{
    Console.WriteLine("1) " + x);
}
2)
var regex = new Regex(@"(?:is better\s)(?<word>\b\S+\b)");
var matchCollection = regex.Matches(input);
foreach (Match match in matchCollection)
{
    Console.WriteLine("2) " + match.Groups["word"].Value);
}
3)
string pattern = @"(?<before>\w+) is better (?<after>\w+)";
MatchCollection matche = Regex.Matches(input, pattern);
for (int i = 0; i < matche.Count; i++)
{
    Console.WriteLine("3) before: " + matche[i].Groups["before"].ToString());
    Console.WriteLine("3) after: " + matche[i].Groups["after"].ToString());
}
Results for input 1, "I think that green bike is better than the red bike":
1) bike is better than
2) than
3) before: bike
3) after: than
Result 1) contains the words to the left and right of "is better", together with the phrase itself. Result 2) is the word "than" after "is better". Result 3) again gives the words before and after, which is exactly what I can use, but the problem with this approach shows up in the results for input 2.
Results for input 2, "I think that green bike is better":
1) bike is better
Result 1) contains the word "bike" before the phrase "is better", but also the search phrase itself. Result 2) is empty because it looks for the word after "is better", so that is correct as it is. Result 3) is empty as well: even though the word "bike" exists before "is better", the pattern also requires a word after "is better", and here the phrase is the last thing in the line.
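To illustrate why attempt 3) prints nothing for input 2: the pattern demands a space and a word after the phrase. The sketch below relaxes that by making the trailing part optional. This is only an assumption about how the pattern could be loosened, not a verified fix for every input, and the class name `OptionalAfterDemo` is made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalAfterDemo
{
    // Same idea as attempt 3), but the trailing " word" part is optional.
    public const string Pattern = @"(?<before>\w+) is better(?: (?<after>\w+))?";

    public static void Main()
    {
        string input = "I think that green bike is better"; // input 2
        foreach (Match match in Regex.Matches(input, Pattern))
        {
            Console.WriteLine("before: " + match.Groups["before"].Value);
            // A group that did not participate has Success == false.
            Console.WriteLine("after matched: " + match.Groups["after"].Success);
        }
    }
}
```

With input 2 this prints `before: bike` and `after matched: False`. Input 3 still produces no match, because the pattern needs a word and a space before the phrase.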
Results for input 3, "is better than the red one":
1) is better than
2) than
Result 1) contains the word after "is better", which is fine in the sense that no word exists to the left of the phrase, but it again includes the phrase "is better" itself. Result 2) is the word "than" after "is better".
Results for input 4, "bike is better":
1) bike is better
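For reference, the behavior I'm after can be sketched with a positive lookahead, so that the phrase is required but not included in the match. This is only a minimal sketch under my own assumptions: a "word" means `\w+`, the word is separated from the phrase by whitespace, and `WordBefore` is a hypothetical helper name that does not appear in the attempts above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Program
{
    // Hypothetical helper: returns the word immediately left of the phrase,
    // or null when the phrase is missing or has no word before it.
    public static string WordBefore(string input, string phrase)
    {
        // \w+ is the candidate word; the lookahead (?=...) requires whitespace
        // followed by the phrase, without consuming them into the match.
        string pattern = @"\w+(?=\s+" + Regex.Escape(phrase) + @"\b)";
        Match match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        string[] inputs =
        {
            "I think that green bike is better than the red bike", // input 1
            "I think that green bike is better",                   // input 2
            "is better than the red one",                          // input 3
            "bike is better"                                       // input 4
        };

        foreach (string input in inputs)
        {
            Console.WriteLine(WordBefore(input, "is better") ?? "(no word before the phrase)");
        }
    }
}
```

Under these assumptions the sketch prints "bike" for inputs 1, 2 and 4, and the fallback text for input 3.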