Python Regex: why isn't Python accepting my pattern?

Question

I want to write a Python regex that takes a string like:

u'John's Place',

and returns:

John's Place

It should find the character "u" followed by an apostrophe, then the apostrophe that precedes the comma, and return whatever lies between those two apostrophes.

So I wrote the following code:

import re

title = "u'John's Place',"
print(re.sub(r"u'([^\"']*)',", r"\"\1\"", title))

However, I still got the whole line back:

u'John's Place',

with nothing filtered out.

Do you know how this can be solved?

python regex

CrazySynthax Jul 26 17 at 13:50

3 answers

Eric Duminil · Answer 1 · 2017-07-26T13:54:44+0000

Python doesn't match your pattern because of the middle ' in "John's". That apostrophe is not followed by a comma, as your pattern requires. The match cannot continue on to look for ', because [^\"']* only accepts characters that are neither " nor '.
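To see the failure concretely, here is a minimal sketch (Python 3 syntax; the two sample strings are assumed from the question) that runs the original pattern against the title with and without the inner apostrophe:

```python
import re

# Pattern from the question: u', then a run of non-quote characters, then ',
pattern = re.compile(r"u'([^\"']*)',")

with_apostrophe = "u'John's Place',"
without_apostrophe = "u'Johns Place',"

# The character class stops at the apostrophe in "John's", and the next
# characters are "'s" rather than "',", so there is no match at all.
no_match = pattern.search(with_apostrophe)
print(no_match)

# Without the inner apostrophe the same pattern matches and captures the name.
captured = pattern.search(without_apostrophe).group(1)
print(captured)
```

Since re.sub leaves a string unchanged when nothing matches, this is why the whole line came back untouched.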

If you want to parse JSON from Python, use the json module rather than regexes applied to unicode-escaped strings.
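As a rough illustration of that advice, assuming the data arrives as JSON text (the payload and the key name "title" below are made up for the example), the standard json module returns the value directly, with no quoting to strip:

```python
import json

# Hypothetical JSON payload; the key name "title" is an assumption.
raw = '{"title": "John\'s Place"}'

data = json.loads(raw)

# The parsed value is an ordinary string. The u'...' seen in Python 2 output
# is only how repr() displays unicode strings, not part of the data itself.
title = data["title"]
print(title)
```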

Erik Perčić · Answer 2 · 2017-07-26T13:56:40+0000

I don't use Python much, but this regex should solve your problem.

^u'(.*)',$

Match u and a single quote at the beginning, then capture everything after that up to a final single quote and comma at the end.

print(re.sub(r"^u'(.*)',$", r'"\1"', title))

Remove ^ and $ if your string contains more than the part to be replaced (in other words, if there is any surrounding context).
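A small sketch of both cases, assuming the title from the question and a made-up longer line for the "context" case. Once the anchors are removed, the greedy .* runs to the last ', on the line, so the final call uses a non-greedy .*? with a lookahead instead; that adjustment is an assumption added here, not part of the answer above:

```python
import re

title = "u'John's Place',"

# Anchored: .* may contain apostrophes, because only the final ', must match.
anchored = re.sub(r"^u'(.*)',$", r'"\1"', title)
print(anchored)

# Hypothetical line holding two values.
line = "u'John's Place', u'Mary's Cafe',"

# Unanchored and greedy: .* swallows everything up to the last ', on the line.
greedy = re.sub(r"u'(.*)',", r'"\1",', line)
print(greedy)

# Non-greedy: stop at the first ', that is followed by whitespace or the end.
lazy = re.sub(r"u'(.*?)',(?=\s|$)", r'"\1",', line)
print(lazy)
```

The lookahead is what lets the inner apostrophe in "John's" pass: it is followed by "s", not by a comma.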

CrazySynthax · Answer 3 · 2017-07-26T15:34:23+0000

After doing more research I found this package https://simplejson.readthedocs.io/en/latest/

It lets you read the JSON without getting a u'...' prefix on each string.

import simplejson as json
import requests

response_json = requests.get(<url-address>)
current_json = json.loads(response_json.content)

current_json will not have a "u" at the beginning of each string.

This only partially answers my question, because the keys and values are still delimited by single quotes (') rather than the double quotes (") that the JSON format requires.
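The single quotes most likely come from printing the Python object itself, which shows its repr. A minimal sketch, using the standard json module and a hypothetical dictionary standing in for current_json: serializing with json.dumps produces double-quoted JSON text.

```python
import json

# Hypothetical parsed data, standing in for current_json above.
current_json = {"title": "Johns Place"}

# Printing the dict shows Python's repr, which uses single quotes here.
as_repr = repr(current_json)
print(as_repr)

# json.dumps serializes back to JSON text, which always uses double quotes.
as_json = json.dumps(current_json)
print(as_json)
```

simplejson exposes a dumps function with the same basic usage, so the same call should work with the import shown above.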
