Using a regex plus a function in Python to encode and replace
I'm trying to replace something in a string with Python and I'm having problems. Here's what I would like to do.
For a given comment in my post:
"here are some great sites that i will do cool things with! https://stackoverflow.com/it a pig & http://google.com"
I would like to use Python to create a string like this:
"here are some great sites that i will do cool things with! <a href="http://stackoverflow.com">http%3A//stackoverflow.com</a> & <a href="http://google.com">http%3A//google.com</a>"
Here's what I have so far ...
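import re
from urllib.parse import quote  # Python 3; in Python 2 this was urllib.quote

def getExpandedURL(url):
    # Percent-encode the URL for display and wrap it in an anchor tag
    encoded_url = quote(url)
    return "<a href=\"" + url + "\">" + encoded_url + "</a>"

text = '<text from above>'
# Match "http" followed by everything up to the next space
url_pattern = re.compile(r'(http.+?[^ ]+)', re.I | re.S | re.M)
url_iterator = url_pattern.finditer(text)
for matched_url in url_iterator:
    # The expanded link is built here but not yet put back into text
    getExpandedURL(matched_url.group(1))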
But this is where I am stuck. I have seen related questions before, such as "Regular expressions, but for writing in a match", but surely there must be a better way than iterating over each match and doing a positional replacement for each one. The difficulty here is that this is not a direct substitution: I need to do something specific with each match before replacing it.
I think you want url_pattern.sub(getExpandedURL, text).
re.sub(pattern, repl, string, count=0)
Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in the string with repl. repl can be either a string or a callable; if it is a callable, it is passed the match object and must return the replacement string to be used.
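Because the callable receives a match object rather than the matched text, the replacement function has to pull the URL out of the match itself. Here is a minimal sketch of that idea, assuming Python 3 (where quote lives in urllib.parse). The names expand_url and link_urls and the simplified URL pattern are illustrative assumptions, not taken from the question.

```python
import re
from urllib.parse import quote  # Python 3; Python 2 exposed this as urllib.quote

# Assumed, simplified pattern: "http://" or "https://" followed by
# anything that is not whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r'https?://[^\s<>"]+', re.I)

def expand_url(match):
    """Replacement callback: receives a match object, returns a string."""
    url = match.group(0)  # the full text matched by the pattern
    return '<a href="{0}">{1}</a>'.format(url, quote(url))

def link_urls(text):
    """Replace every URL in text with an anchor tag in a single pass."""
    return URL_PATTERN.sub(expand_url, text)

print(link_urls("go to http://google.com now"))
# go to <a href="http://google.com">http%3A//google.com</a> now
```

sub() calls expand_url once per match and splices each returned string into the result, so there is no need to loop over finditer() and patch positions by hand.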