Regex.Replace is much slower than conditional using String.Contains

I have a list of 400 lines that end with "_GONOGO" or "_ALLOC". When the application starts, I need to remove "_GONOGO" or "_ALLOC" from each of these lines.

I've tried this: 'string blah = Regex.Replace(line, "(_GONOGO|_ALLOC)", "");'

but it is MUCH slower than a simple conditional statement like this:

if (line.Contains("_GONOGO"))
    // use Substring
else if (line.Contains("_ALLOC"))
    // use Substring with a different index

      

I'm new to regex, so I'm hoping someone has a better solution or can tell me whether I'm doing something horribly wrong. It's not a big deal, but it would be nice to turn this 4-line conditional into one simple regex line.

+2




5 answers


Although it isn't a regex, you could do:

string blah = line.Replace("_GONOGO", "").Replace("_ALLOC", "");

      



RegEx is great for complex expressions, but the overhead can sometimes be excessive for very simple operations like this.

+8




Regex replacements can be faster if you compile the regular expression first, as in:

Regex exp = new Regex(
    @"(_GONOGO|_ALLOC)",
    RegexOptions.Compiled);

// Replace returns a new string, so keep the result.
string blah = exp.Replace(line, String.Empty);
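The benefit of RegexOptions.Compiled only shows up when the same Regex instance is reused, so a minimal sketch might cache it in a static field. The class and method names here (SuffixStripper, Strip) are hypothetical, and the trailing `$` anchor is an assumption added because the question says the markers always appear at the end of a line; it is not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SuffixStripper
{
    // Built once and reused for every line. RegexOptions.Compiled makes
    // construction slower in exchange for faster matching afterwards.
    // The "$" anchor (an assumption) limits the match to the end of the line.
    private static readonly Regex Suffix =
        new Regex(@"(_GONOGO|_ALLOC)$", RegexOptions.Compiled);

    // Hypothetical helper: returns the line without a trailing marker,
    // or the line unchanged when no marker is present.
    public static string Strip(string line)
    {
        return Suffix.Replace(line, String.Empty);
    }
}
```

Calling SuffixStripper.Strip on each of the 400 lines then pays the construction cost a single time.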

      

+4




This is expected; in general, manipulating the string manually will be faster than using a regular expression. Using a regex involves compiling the expression down to a regex tree, and that takes time.

If you use this regex in multiple places, you can use the RegexOptions.Compiled flag to reduce the overhead of each match, as David describes in his answer. Other regex experts may have tips for improving the expression. However, you might consider sticking with String.Replace; it's fast and straightforward.
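To check which approach is actually faster on your own data, a rough timing harness along these lines may help. It is only a sketch: ReplaceTiming and TimeMs are hypothetical names, and a Stopwatch loop like this ignores JIT warm-up and other effects that a proper benchmarking tool would account for.

```csharp
using System;
using System.Diagnostics;

public static class ReplaceTiming
{
    // Hypothetical helper: runs the given stripping function over every
    // line, repeated "rounds" times, and returns the elapsed milliseconds.
    public static long TimeMs(Func<string, string> strip, string[] lines, int rounds)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int r = 0; r < rounds; r++)
        {
            foreach (string line in lines)
            {
                strip(line);
            }
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

For example, passing `l => l.Replace("_GONOGO", "").Replace("_ALLOC", "")` and then a lambda that calls a cached Regex lets you compare the two results on the same input.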

+3




If they all end with one of these patterns, it's probably faster to drop the replace altogether and use:

string result = source.Substring(0, source.LastIndexOf('_'));
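One caveat with this approach: LastIndexOf returns -1 when the string contains no underscore, and Substring then throws an ArgumentOutOfRangeException. A guarded variant, sketched here with the hypothetical names UnderscoreTrimmer and TrimLastSegment, could look like this:

```csharp
using System;

public static class UnderscoreTrimmer
{
    // Hypothetical helper: drops everything from the last underscore onward.
    public static string TrimLastSegment(string source)
    {
        int index = source.LastIndexOf('_');
        // No underscore found: return the input unchanged instead of
        // letting Substring(0, -1) throw.
        return index < 0 ? source : source.Substring(0, index);
    }
}
```

Note that this removes whatever follows the last underscore, so it is only safe if every line really does end with one of the two markers.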

      

+1




Since you have this information about your problem domain, you can make it quite simple:

const int AllocLength = 6;
const int GonogoLength = 7;
string s = ...;
if (s[s.Length - 1] == 'C')
    s = s.Substring(0, s.Length - AllocLength);
else
    s = s.Substring(0, s.Length - GonogoLength);

      

This is theoretically faster than Abraham's solution, but not as flexible. If the strings have any chance of changing, it will suffer from maintainability problems that his does not have.
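A middle ground, sketched below under assumptions, is to keep the suffixes in one list and derive the lengths from the strings themselves, so changing a marker means editing a single place. The names KnownSuffixes and Remove are hypothetical, and this does slightly more work per line than the single character check above.

```csharp
using System;

public static class KnownSuffixes
{
    // The only place the markers are spelled out.
    private static readonly string[] Suffixes = { "_GONOGO", "_ALLOC" };

    // Hypothetical helper: removes the first matching suffix, or returns
    // the input unchanged when none of the known suffixes is present.
    public static string Remove(string s)
    {
        foreach (string suffix in Suffixes)
        {
            // Ordinal comparison avoids culture-sensitive matching.
            if (s.EndsWith(suffix, StringComparison.Ordinal))
            {
                return s.Substring(0, s.Length - suffix.Length);
            }
        }
        return s;
    }
}
```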

+1








