Python regex: why doesn't Python accept my pattern?
I want to write a Python regex that takes a string such as:
"u'John's Place',"
and returns:
John's Place
It must find the character "u" and the apostrophe that follows it, then the apostrophe that precedes the comma, and return whatever lies between those two apostrophes.
So I wrote the following code:
import re
title = "u'John's Place',"
print(re.sub(r"u'([^\"']*)',", r"\"\1\"", title))
However, I still get the whole line back:
"u'John's Place',"
with nothing filtered out.
Do you know how this can be solved?
Python doesn't accept your pattern because of the middle ' in "John's". That apostrophe is not followed by a comma, as your pattern requires. The match cannot continue on to the closing ', because with [^\"']* you only accept characters that are neither " nor '.
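The failure can be reproduced directly. This is a minimal sketch, assuming the input is the string from the question with the apostrophe in John's; the variable names are only illustrative.

```python
import re

title = "u'John's Place',"
pattern = r"u'([^\"']*)',"

# After u', the class [^\"']* consumes "John" and stops at the apostrophe
# in "John's". The pattern then needs "'," but finds "'s", so nothing matches.
match = re.search(pattern, title)
print(match)  # None

# With no match, re.sub returns the input unchanged.
result = re.sub(pattern, r"\1", title)
print(result == title)  # True
```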
If you want to parse JSON in Python, use the json module rather than applying regexes to the u'...' representation of unicode strings.
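A rough sketch of that approach, using the standard-library json module. The JSON text below is invented for illustration; real input would come from a file or an HTTP response.

```python
import json

# Hypothetical JSON text (not from the question).
raw = '{"name": "John\'s Place", "city": "Springfield"}'

# json.loads turns JSON text into Python objects; no regex is involved,
# so the apostrophe inside the value causes no trouble.
data = json.loads(raw)
name = data["name"]
print(name)  # John's Place
```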
I don't use Python much, but this regex should solve your problem.
^u'(.*)',$
Match u and a single quote at the beginning, then capture everything after that up to a final single quote followed by a comma at the end.
print(re.sub(r"^u'(.*)',$", r'"\1"', title))
Remove ^ and $ if your string contains more than the part to be replaced (in other words, if there is surrounding context).
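To illustrate both cases, here is a small sketch. The second input string is hypothetical, and the lazy quantifier .*? is an adaptation that the answer does not mention: without anchors, a greedy .* would run on to the last "'," in the string.

```python
import re

title = "u'John's Place',"

# Anchored: .* is greedy, but $ pins the final "'," to the end of the string.
anchored = re.sub(r"^u'(.*)',$", r"\1", title)
print(anchored)  # John's Place

# Hypothetical string with surrounding context, so the anchors are dropped.
# The lazy .*? stops at the first "'," instead of the last one.
line = "name: u'John's Place', city: u'Springfield',"
unanchored = re.sub(r"u'(.*?)',", r'"\1"', line)
print(unanchored)  # name: "John's Place" city: "Springfield"
```

Note that the trailing comma is part of the match, so it is removed along with the u'...' wrapper.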
After doing more research, I found this package: https://simplejson.readthedocs.io/en/latest/
It lets you read the JSON without getting a u'...' prefix on every string.
import simplejson as json
import requests
url = "<url-address>"  # placeholder: substitute the real URL
response_json = requests.get(url)
current_json = json.loads(response_json.content)
The strings in current_json will not have a "u" prefix.
This only partially answers my question, because the keys and values are printed delimited by single quotation marks (') rather than the double quotation marks (") that JSON requires.
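The single quotes come from how Python prints its own dict and string objects, not from the parser. One way to get double-quoted JSON text back is to serialize the parsed data again. The sketch below uses the standard-library json module and a hypothetical dict standing in for current_json.

```python
import json

# Hypothetical parsed data standing in for current_json.
current_json = {"name": "John's Place"}

# Printing the dict shows Python's repr, which is not JSON.
print(current_json)

# json.dumps serializes back to JSON text, which uses double quotes.
as_json = json.dumps(current_json)
print(as_json)  # {"name": "John's Place"}
```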