How do I find the index of the first occurrence of a certain type of character after some substring?
I have a string in which I want to find where a specific substring occurs:
int startIndex = str.IndexOf(substr);
int endIndex = str.IndexOf(" ", startIndex);
In the example above, I found endIndex by searching for a space that appears after startIndex. That is wrong; it is just an example. My requirement is to stop searching as soon as an alphanumeric or special character appears, that is, any character except a space.
I know this can be done with Regex, but I can't work out how to combine it with IndexOf in my code. How can this be done, or how do I find the required endIndex?
Regex.Match has an overload that takes a start position. You can use it to search from a given starting point in a string.
Here's an example. Note that the regular expression matches a run of word characters. This assumes there was a typo in your post and that you really want to stop at any non-alphanumeric or special character.
string s = "This is an example, and it contains a comma.";
int startIndex = s.IndexOf("example");
Regex r = new Regex(@"[\w]+");
Match m = r.Match(s, startIndex);
int endIndex = m.Success ? m.Index + m.Length : -1;
If you really do want to stop as soon as you come across an alphanumeric or special character, change the regex pattern to [\s]+.
First of all, if you want to find endIndex after the occurrence of a substring, your current code has another flaw:
int startIndex = str.IndexOf(substr);
int endIndex = str.IndexOf(" ", startIndex);
You are looking for endIndex to the right of startIndex. Suppose your str and substr are:
pos: 0123456789012345678901234567890123456789012
str: The quick brown fox jumps over the lazy dog
sub:                 fox jumps
                     ^  !
Here IndexOf(sub) returns 16 (^), and if you look for a space to the right of 16, you end up at index 19 (!), the space between fox and jumps, which lies inside the substring.
To find something after a substring, you must start searching after the substring, not within it.
int startIndex = str.IndexOf(substr);
int endIndex = str.IndexOf(" ", startIndex + substr.Length);
This is the first adjustment you need if you want to keep the code.
Secondly, you need to look not for a space, but for the delimiters you actually need. The .NET String class has not only IndexOf, which takes a single character to search for, but also IndexOfAny, which searches for any of a set of characters and returns the position of the first match. For example:
var chars = new [] { 'r', 'o', 'v' };
int startIndex = str.IndexOf(substr);
int endIndex = str.IndexOfAny(chars, startIndex + substr.Length);
pos: 0123456789012345678901234567890123456789012
str: The quick brown fox jumps over the lazy dog
sub:                 fox jumps
                     ^        ?!
This starts looking at the space after fox jumps (?, index 25, since I added substr.Length as before) and looks for any of 'r', 'o' and 'v'. So it lands on the 'o' in 'over' (!, index 26).
You can set the chars array to whatever delimiters you would like to find.
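For the requirement in the question (stop at the first character that is anything except a space), listing every possible delimiter in an array is impractical. A minimal sketch of one alternative is a plain loop with a predicate. The helper name IndexAfter and its parameters are hypothetical, not part of .NET:

```csharp
using System;

static class DelimiterSearch
{
    // Hypothetical helper (not a .NET API): returns the index of the first
    // character after the first occurrence of substr for which isDelimiter
    // returns true, or -1 if substr or such a character is not found.
    public static int IndexAfter(string str, string substr, Func<char, bool> isDelimiter)
    {
        int startIndex = str.IndexOf(substr, StringComparison.Ordinal);
        if (startIndex < 0) return -1;          // substring not found

        // Start scanning after the substring, not inside it.
        for (int i = startIndex + substr.Length; i < str.Length; i++)
        {
            if (isDelimiter(str[i])) return i;
        }
        return -1;                              // nothing matched after the substring
    }
}

class DelimiterSearchDemo
{
    static void Main()
    {
        string str = "The quick brown fox jumps over the lazy dog";
        // Stop at the first character that is not a space.
        int endIndex = DelimiterSearch.IndexAfter(str, "fox jumps", c => c != ' ');
        Console.WriteLine(endIndex);            // prints 26, the 'o' in "over"
    }
}
```

The predicate can be swapped for something like char.IsLetterOrDigit if only alphanumeric characters should stop the search.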
You can also use the Regex class to search for a character from a specific set. This example does the same as IndexOfAny above:
var regex = new Regex("[rov]");
int startIndex = str.IndexOf(substr);
var match = regex.Match(str, startIndex + substr.Length);
int endIndex = match.Success ? match.Index : -1; // -1 when nothing matches, like IndexOfAny
pos: 0123456789012345678901234567890123456789012
str: The quick brown fox jumps over the lazy dog
sub:                 fox jumps
                     ^        ?!
The regex starts looking at the space right after fox jumps (as before) and looks for a match of the expression [rov] (which means: any one of the characters r, o or v). So the effect is the same.
You can customize the character set in the regex for whatever delimiters you want to find; just be careful to stick to Regex syntax. Or you can replace the example expression with whatever pattern describes your delimiter.
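As an illustration for the question's actual requirement, here is a sketch that assumes "anything except a space" can be written as the negated class [^ ]. The class and method names are made up for the example; Regex.Match(string, int) is the real overload that takes a start position:

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexDelimiterSearch
{
    // "[^ ]" matches any single character except a space.
    private static readonly Regex NonSpace = new Regex("[^ ]");

    // Hypothetical helper: index of the first non-space character after
    // the first occurrence of substr, or -1 if there is none.
    public static int IndexOfFirstNonSpaceAfter(string str, string substr)
    {
        int startIndex = str.IndexOf(substr, StringComparison.Ordinal);
        if (startIndex < 0) return -1;          // substring not found

        // Begin matching right after the substring.
        Match match = NonSpace.Match(str, startIndex + substr.Length);
        return match.Success ? match.Index : -1;
    }
}

class RegexDelimiterSearchDemo
{
    static void Main()
    {
        string str = "The quick brown fox jumps over the lazy dog";
        Console.WriteLine(RegexDelimiterSearch.IndexOfFirstNonSpaceAfter(str, "fox jumps")); // prints 26
    }
}
```

If tabs and line breaks should also be skipped, \S (any non-whitespace character) could be used in place of [^ ].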