Parsing Optional Groups
I am trying to create a regex string that pulls data from report files. The tricky part is that I need this single regex string to match multiple report content formats. I want the regex to always match even if some additional groups are not found.
Consider the contents of the following report files (note: file #2 is missing the "val2" part):
- File #1: "-val1-test-val2-result-val3-done-"
  - Expected result:
    - Val1 group: test
    - Val2 group: result
    - Val3 group: done
- File #2: "-val1-test-val3-done-"
  - Expected result:
    - Val1 group: test
    - Val2 group: (empty)
    - Val3 group: done
I tried the following regex patterns:
Regex #1 (normal): "-val1-(?<val1>.+?)-val2-(?<val2>.+?)-val3-(?<val3>.+?)-"
Problem: File #1 works fine, but the regex does not match file #2 at all, so I get no group values.
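To reproduce the Regex #1 behavior, here is a minimal sketch in Java (the port target mentioned in this post). It assumes the report contents are exactly the two strings listed above, with no surrounding whitespace. The class name `Regex1Demo` and the helper `extract` are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Regex1Demo {
    // Regex #1 from the post: every section, including -val2-, is mandatory.
    static final Pattern REGEX_1 = Pattern.compile(
        "-val1-(?<val1>.+?)-val2-(?<val2>.+?)-val3-(?<val3>.+?)-");

    // Hypothetical helper: returns "val1|val2|val3",
    // or null when the pattern does not match anywhere in the input.
    static String extract(String input) {
        Matcher m = REGEX_1.matcher(input);
        if (!m.find()) {
            return null;
        }
        return m.group("val1") + "|" + m.group("val2") + "|" + m.group("val3");
    }

    public static void main(String[] args) {
        // File #1 contains all three sections, so the match succeeds.
        System.out.println(extract("-val1-test-val2-result-val3-done-")); // test|result|done
        // File #2 has no -val2- section, so there is no match at all.
        System.out.println(extract("-val1-test-val3-done-")); // null
    }
}
```

Because the literal "-val2-" is required, the whole match fails on file #2 rather than leaving one group empty.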
Regex #2 (non-greedy): "-val1-(?<val1>.+?)(-val2-(?<val2>.+?))?-val3-(?<val3>.+?)-"
Regex #3 (boolean OR): "-val1-(?<val1>.+?)(-val2-(?<val2>.+?)|(.*?))-val3-(?<val3>.+?)-"
Regex #4 (conditional): "-val1-(?<val1>.+?)(?(-val2-(?<val2>.+?))|(.+?))-val3-(?<val3>.+?)-"
Regex #5 (conditional): "-val1-(?<val1>.+?)(?(-val2-(?<val2>.+?))(-val2-(?<val2>.+?)))-val3-(?<val3>.+?)-"
Regex #6 (conditional): "-val1-(?<val1>.+?)(?(-val2-(?<val2>.+?))(-val2-(?<val2>.+?))|(.+?))-val3-(?<val3>.+?)-"
Problem: File #2 works as expected, but the val2 group for file #1 is always empty.
Conclusion: Even when the optional part is present, the regex seems to prefer the empty match over the actual value. Is there a way to force the optional groups to capture their value when they are present and return (empty) when they are not?
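One possible reading of this behavior, sketched in Java with `java.util.regex` only (not verified on .NET here): in Regex #3 the lazy `val1` group is satisfied by a single character, and the empty-capable alternative `(.*?)` then absorbs everything up to "-val3-", so the `val2` branch is never needed. The `CANDIDATE` pattern below is an assumption, not something from this post. It wraps the "-val2-" section in an optional non-capturing group and offers no empty alternative. The class name and the helper `extract` are hypothetical, and the inputs are assumed to be exactly the two strings listed above.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OptionalGroupDemo {
    // Regex #3 from the post: the (.*?) alternative can swallow the
    // whole "-val2-..." section, so the val2 group never participates.
    static final Pattern REGEX_3 = Pattern.compile(
        "-val1-(?<val1>.+?)(-val2-(?<val2>.+?)|(.*?))-val3-(?<val3>.+?)-");

    // Hypothetical candidate (an assumption, not from the post): the -val2-
    // section sits in an optional non-capturing group with no empty fallback.
    static final Pattern CANDIDATE = Pattern.compile(
        "-val1-(?<val1>.+?)(?:-val2-(?<val2>.+?))?-val3-(?<val3>.+?)-");

    // Returns {val1, val2, val3}, or null when there is no match.
    // Java reports a group that did not participate as null; map it to "".
    static String[] extract(Pattern p, String input) {
        Matcher m = p.matcher(input);
        if (!m.find()) {
            return null;
        }
        String val2 = m.group("val2");
        return new String[] { m.group("val1"), val2 == null ? "" : val2, m.group("val3") };
    }

    public static void main(String[] args) {
        String file1 = "-val1-test-val2-result-val3-done-";
        String file2 = "-val1-test-val3-done-";
        System.out.println(Arrays.toString(extract(REGEX_3, file1)));   // [t, , done]
        System.out.println(Arrays.toString(extract(CANDIDATE, file1))); // [test, result, done]
        System.out.println(Arrays.toString(extract(CANDIDATE, file2))); // [test, , done]
    }
}
```

In this sketch the candidate works because the regex engine extends the lazy `val1` one character at a time and, at each step, tries the optional group before trying to skip it. A value that itself contains "-val2-" or "-val3-" would still be split at the first such marker.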
Note: I am using the latest .NET Framework, and the code will be ported to Java (Android). I want to avoid running multiple matching operations, for performance and bandwidth reasons.
Can anyone help me with this?