Python regex to match each bracketed element
I am trying to create a regex for the following line:
[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]
I want each bracketed group, including its contents, as a separate match, e.g.:
[1,null,"7. Mai 2017"]
[2,"test","8. Mai 2018"]
[3,"test","9. Mai 2019"]
My initial naive approach was something like this:
(\[[^d],.+\])+
However, the .+ part is too general and ends up matching the entire string. Any hints?
I'm not sure what data format you are trying to parse or where it comes from, but it looks JSON-like. For this particular line, adding square brackets at the beginning and end makes it loadable as JSON:
In [1]: data = '[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]'
In [2]: import json
In [3]: json.loads("[" + data + "]")
Out[3]:
[[1, None, u'7. Mai 2017'],
[2, u'test', u'8. Mai 2018'],
[3, u'test', u'9. Mai 2019']]
Note that null becomes Python None.
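The session above is from Python 2 (hence the u'' prefixes). A minimal Python 3 sketch of the same idea follows, assuming the input really is valid JSON once wrapped in brackets. The names rows and texts are only illustrative. The last step serializes each row back to compact text, in case the bracketed strings themselves are what you need.

```python
import json

data = '[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]'

# Wrap the fragment in brackets so it becomes one valid JSON array of arrays.
rows = json.loads("[" + data + "]")

# Each row is now a Python list; JSON null has become None.
for row in rows:
    print(row)

# Assumption: the original used compact separators, so dump without spaces
# to get each bracketed group back as text.
texts = [json.dumps(row, separators=(",", ":")) for row in rows]
print(texts)
```

Parsing first and serializing afterwards avoids any trouble with brackets or commas inside quoted strings, which a plain regex cannot tell apart from structural ones.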
The following code produces the output you requested, using the pattern \[[^]]*]:
import re
regex = r'\[[^]]*]'
line = '[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]'
row = re.findall(regex, line)
print(row)
Output:
['[1,null,"7. Mai 2017"]', '[2,"test","8. Mai 2018"]', '[3,"test","9. Mai 2019"]']
Consider replacing null with None, since None is the Python equivalent.
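To see why the .+ in the question swallows the whole line while a negated character class does not, here is a small comparison sketch. The variable names are illustrative, and the lazy pattern \[.*?\] is an alternative I am adding for contrast, not something from the question.

```python
import re

line = '[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]'

# Greedy: .+ runs to the LAST ']' in the line, so there is a single match.
greedy = re.findall(r'\[.+\]', line)

# Negated class: each match stops at the first ']' after its '['.
negated = re.findall(r'\[[^]]*]', line)

# Lazy quantifier: .*? expands only as far as needed to reach a ']'.
lazy = re.findall(r'\[.*?\]', line)

print(len(greedy), len(negated), len(lazy))
```

Both the negated-class and the lazy pattern stop at the first closing bracket, so either one breaks if a quoted string ever contains a ] character. In that case a real parser is the safer choice.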
You can consider the wonderful pyparsing module to do this:
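import pyparsing

exp = '[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]'
# nestedExpr matches balanced [...] groups; originalTextFor returns the raw matched text
for match in pyparsing.originalTextFor(pyparsing.nestedExpr('[', ']')).searchString(exp):
    print(match[0])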
[1,null,"7. Mai 2017"]
[2,"test","8. Mai 2018"]
[3,"test","9. Mai 2019"]
(That is, if the data is not actually JSON. If it is, use a JSON module.)