Using a regex plus a function in Python to encode and replace

I'm trying to replace something in a string with Python and I'm having trouble. Here's what I would like to do.

For a given comment in my post:

"here are some great sites that i will do cool things with! https://stackoverflow.com/it a pig & http://google.com"

      

I would like to use python to create strings like this:

"here are some great sites that i will do cool things with! <a href="http://stackoverflow.com">http%3A//stackoverflow.com</a> &amp; <a href="http://google.com">http%3A//google.com</a>"

      

Here's what I have so far ...

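import re
import urllib.parse  # Python 3; in Python 2 quote() lived at urllib.quote

def getExpandedURL(url):
    # Percent-encode the URL for the link text; keep the raw URL in href.
    encoded_url = urllib.parse.quote(url)
    return '<a href="' + url + '">' + encoded_url + '</a>'

text = '<text from above>'
# "http" followed by a run of characters up to the next space.
url_pattern = re.compile(r'(http.+?[^ ]+)', re.I | re.S | re.M)
url_iterator = url_pattern.finditer(text)
for matched_url in url_iterator:
    # The result is discarded here; nothing is replaced in text yet.
    getExpandedURL(matched_url.group(1))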

      

But this is where I am stuck. I have seen questions like "Regular expressions, but for writing in the match" before, but surely there must be a better way than iterating over each match and replacing them one by one. The difficulty here is that this is not a straight replacement: I need to do something specific with each match before replacing it.

1 answer


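I think you want url_pattern.sub(getExpandedURL, text). Note that getExpandedURL is then called with a match object rather than a string, so it has to read the URL from the match (for example with .group(1)).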



re.sub(pattern, repl, string, count=0)

Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in the string with repl. repl can be either a string or a callable; if it is a callable, it is passed a match object and must return the replacement string to be used.
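
To make the callable form concrete, here is a minimal sketch for Python 3. The names expand_url, linkify and URL_PATTERN are made up for this illustration, and the pattern is a simplified assumption (anything starting with http:// or https:// up to the next whitespace), not the pattern from the question. It also does not turn the bare & into &amp;, which would need a separate escaping step.

```python
import re
from urllib.parse import quote

# Assumed, simplified pattern: http:// or https:// followed by non-whitespace.
URL_PATTERN = re.compile(r'https?://\S+', re.I)

def expand_url(match):
    # re.sub calls this once per match and passes the match object,
    # so the matched text has to be pulled out with group().
    url = match.group(0)
    # Raw URL goes into href, percent-encoded URL becomes the link text.
    return '<a href="{}">{}</a>'.format(url, quote(url))

def linkify(text):
    # Each match is replaced by whatever expand_url returns for it.
    return URL_PATTERN.sub(expand_url, text)

print(linkify("try http://google.com today"))
```

With the default arguments, quote leaves "/" alone but encodes ":", so the link text for http://google.com comes out as http%3A//google.com, which matches the format asked for in the question.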
