Python Regex: why isn't Python accepting my pattern?

I want to write a Python regex that takes a string like:

u'John's Place',

and returns:

John's Place

It must find the character "u" followed by an apostrophe, then the apostrophe that precedes the comma, and return whatever lies between those two apostrophes.

So I wrote the following code:

import re

title = "u'John's Place',"
print(re.sub(r"u'([^\"']*)',", r"\"\1\"", title))

      

However, I still got the whole line back:

u'John's Place',

with nothing filtered out.

Do you know how this can be solved?

+3




3 answers


Your pattern does not match because of the middle ' in "John's". That apostrophe is not followed by a comma, as your pattern requires, and the match cannot carry on to the closing ', because [^\"']* only accepts characters that are neither " nor '.
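The failure can be reproduced in a few lines. This is only a minimal sketch, assuming the input really does contain the inner apostrophe of "John's"; the variable names are illustrative and not taken from the question.

```python
import re

pattern = r"u'([^\"']*)',"
title = "u'John's Place',"

# [^\"']* consumes "John" and stops at the inner apostrophe. That apostrophe
# satisfies the literal ' in the pattern, but the next character is "s",
# not ",", so the match fails.
strict = re.search(pattern, title)
print(strict)  # None

# With no match, re.sub returns the input string unchanged.
unchanged = re.sub(pattern, r'"\1"', title)
print(unchanged == title)  # True

# Without an inner apostrophe the same pattern does match.
plain = re.search(pattern, "u'Johns Place',")
print(plain.group(1))  # Johns Place
```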



If you want to parse JSON in Python, use the json module rather than regexes applied to unicode-escaped strings.
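As a rough illustration of that suggestion, the standard json module parses JSON text directly, so there is nothing to strip afterwards. The payload below is made up for the example.

```python
import json

# Hypothetical JSON text, e.g. the body of an HTTP response.
raw = '{"name": "John\'s Place", "city": "Springfield"}'

data = json.loads(raw)

# The value is an ordinary string; a u'...' prefix only ever shows up
# in the repr of a Python 2 unicode string, not in the value itself.
print(data["name"])  # John's Place
```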

+7




I don't use Python much, but this regex should solve your problem.

^u'(.*)',$

      

Match u and a single quote at the beginning, then capture everything after that up to a final single quote and comma at the end.



print(re.sub(r"^u'(.*)',$", r"\"\1\"", title))

      

Remove ^ and $ if your string contains more than the part being replaced (in other words, if there is surrounding context).
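A small sketch of both variants, assuming a title with an inner apostrophe and a made-up line with surrounding context. The replacement is written as r'"\1"' so that the result is wrapped in plain double quotes, which is my reading of the intended output. The non-greedy .*? in the unanchored version is my own adjustment, so that each value stops at its own closing quote and comma.

```python
import re

title = "u'John's Place',"

# Anchored: the whole string must have the form u'...',
anchored = re.sub(r"^u'(.*)',$", r'"\1"', title)
print(anchored)  # "John's Place"

# Unanchored: the values sit inside a longer (hypothetical) line.
line = "name: u'John's Place', city: u'Springfield',"

# .*? expands only until the first "'," so each value is replaced separately.
unanchored = re.sub(r"u'(.*?)',", r'"\1",', line)
print(unanchored)  # name: "John's Place", city: "Springfield",
```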

+2




After doing more research I found this package: https://simplejson.readthedocs.io/en/latest/

It lets you read the JSON without a u'...' wrapped around each string.

import simplejson as json
import requests

response_json = requests.get(<url-address>)
current_json = json.loads(response_json.content)

      

current_json will not have a "u" at the beginning of each string.

This only partially answers my question, because the keys and values are still delimited by single quotation marks (') rather than the double quotation marks (") that JSON requires.
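The single quotes come from printing the Python object, which shows its repr rather than JSON. A minimal sketch of serializing it back to JSON text with double quotes, using the standard json module (simplejson also provides a dumps function); the dictionary here is made up for the example.

```python
import json

# Hypothetical parsed data standing in for current_json.
current_json = {"name": "John's Place"}

# Printing the dict shows Python's repr, with Python's own quoting rules.
print(current_json)

# json.dumps produces JSON text, which always uses double quotes.
as_json = json.dumps(current_json)
print(as_json)  # {"name": "John's Place"}
```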

0








