Regex that identifies sections in COBOL
I am setting up a Brackets outline plugin that uses a regex to build the outline of the currently open file.
Using regex101.com, I created the following regular expression (it uses lookarounds to require that the line starts with seven spaces and ends with "SECTION."):
(?<=^ )([A-Za-z\-0-9]*)(?= SECTION\.[ ]*$)
According to regex101.com this is fine; however, JSHint / JSLint reports it as invalid. When I test it, it doesn't work, so I suspect JSHint / JSLint is correct.
Below is an example of some COBOL code from which I want to get 2000-GET-EXPECTED-BY-DATE and 2020-GET-DUE-DATE.
...
2000-GET-EXPECTED-BY-DATE SECTION.
MOVE '2' TO W10-OPTION.
...
ELSE
MOVE 'Y' TO W10-NO-ERRORS
END-IF.
2017-EXIT.
EXIT.
/
2020-GET-DUE-DATE SECTION.
2020.
MOVE 'N' TO W10-USER-INPUT-DUE-DATE-SW.
MOVE '1' TO W10-OPTION.
...
So my questions are:
- Is the regex valid?
- If it is not, what is wrong with it?
- How do I write a regex to find the name of each section?
This works for me to find lines with "SECTION":
^[ ]{7}(.*)[ ]SECTION\.$
DEMO: http://regex101.com/r/zC1xY6/2
If you only want the section names: ^[ ]{7}\d+\-(.*)[ ]SECTION\.$
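A minimal JavaScript sketch of how the two patterns above could be applied to a whole file. The helper name `sectionNames` and the sample text are hypothetical, and the seven-space indent in the sample is an assumption, since the indentation is not visible in the snippet quoted in the question.

```javascript
// Sample source; the seven-space indent before each section name is an assumption.
const source = [
  "       2000-GET-EXPECTED-BY-DATE SECTION.",
  "           MOVE '2' TO W10-OPTION.",
  "       2017-EXIT.",
  "           EXIT.",
  "       2020-GET-DUE-DATE SECTION.",
  "       2020.",
].join("\n");

// Hypothetical helper: collect capture group 1 of every match.
// The pattern must carry the "g" flag, otherwise exec() would never advance.
function sectionNames(text, pattern) {
  if (!pattern.global) {
    throw new TypeError("pattern needs the g flag");
  }
  const names = [];
  let match;
  pattern.lastIndex = 0;
  while ((match = pattern.exec(text)) !== null) {
    names.push(match[1]);
  }
  return names;
}

// "m" lets ^ and $ match at each line boundary; "g" finds every match.
const fullNames = sectionNames(source, /^[ ]{7}(.*)[ ]SECTION\.$/gm);
const shortNames = sectionNames(source, /^[ ]{7}\d+\-(.*)[ ]SECTION\.$/gm);

console.log(fullNames);  // names including the numeric prefix
console.log(shortNames); // names with the numeric prefix stripped
```

Note that the second pattern drops the leading digits and hyphen, so it returns GET-DUE-DATE rather than 2020-GET-DUE-DATE.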
OK, it turns out that what I was using works, with two caveats:
- The global and multiline modifiers must be added when it is used via regex101.com.
- It runs very slowly, so regex101.com times out on a large program.
However, I found (via the regex101.com experiment) that if I change it to
^ (.*)(?= SECTION\.[ ]*$)
then it works with no timeout problem. It looks like it becomes very slow if I use ^[ ]{7} as a prefix or ([A-Za-z0-9-]*) as the capture group that matches the name.
The main issue is the performance of (.*) compared to ([A-Za-z0-9-]*); the latter is much slower.
I can use the lookbehind version on regex101.com: (?<=^ )(.*)(?= SECTION\.[ ]*$)
However, it throws an error with JSLint / JSHint, so I will not use it.
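For context, lookbehind assertions only entered JavaScript with ECMAScript 2018, so older engines and linters reject the syntax, which is presumably what JSLint / JSHint is reporting here. A hedged sketch of checking for support at run time follows; `supportsLookbehind` and `buildSectionPattern` are hypothetical names, and the seven-space prefix is written as ` {7}` on the assumption that this is what the pattern is meant to require.

```javascript
// Returns true when the running engine accepts lookbehind syntax.
// The pattern is built from a string so that an older parser does not
// fail on the whole script when it meets a lookbehind regex literal.
function supportsLookbehind() {
  try {
    new RegExp("(?<=a)b");
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper: pick a pattern the engine can compile.
// In both variants the section name ends up in capture group 1.
function buildSectionPattern() {
  return supportsLookbehind()
    ? new RegExp("(?<=^ {7})(.*)(?= SECTION\\.[ ]*$)", "gm")
    : new RegExp("^ {7}(.*)(?= SECTION\\.[ ]*$)", "gm");
}
```

Since the capture-group variant works in every engine, the lookbehind is not really needed; reading group 1 instead of the whole match gives the same name.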
I tested the first one, ^ (.*)(?= SECTION\.[ ]*$), in a fork of the Brackets outline plugin and it works! :-)
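As an alternative to one multiline scan over the whole file, an outline could also be built by testing each line on its own, which keeps the pattern fully anchored. This is only a sketch and not the plugin's actual code: `outlineSections` and `SECTION_LINE` are hypothetical names, and it assumes section names start after exactly seven spaces and contain only letters, digits and hyphens.

```javascript
// Seven spaces, a name of letters/digits/hyphens, then " SECTION." and
// optional trailing spaces. No m flag is needed: each line is tested alone.
const SECTION_LINE = /^ {7}([A-Za-z0-9-]+) SECTION\.[ ]*$/;

// Returns one entry per section with its name and 1-based line number.
function outlineSections(text) {
  const entries = [];
  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const match = SECTION_LINE.exec(lines[i]);
    if (match) {
      entries.push({ name: match[1], line: i + 1 });
    }
  }
  return entries;
}

// Usage with an assumed, indented version of the sample from the question.
const sample =
  "       2000-GET-EXPECTED-BY-DATE SECTION.\n" +
  "           MOVE '2' TO W10-OPTION.\n" +
  "       2017-EXIT.\n" +
  "           EXIT.\n" +
  "      /\n" +
  "       2020-GET-DUE-DATE SECTION.\n" +
  "       2020.\n";
const outline = outlineSections(sample);
console.log(outline);
```

The line numbers make it straightforward to jump to a section from the outline; whether this is faster than the multiline pattern in a given editor would need to be measured.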