How to use BeautifulSoup to get an HTML meta refresh redirect?

I am looking at a web page with the following header. How can I get the content of the google.com page it redirects to, using bs4?

<head>
<meta http-equiv="refresh" content="5;url=http://google.com"/>  
</head>

      

Thanks!



1 answer


Use find with the tag name meta, and pass attrs to require that the known fixed attribute http-equiv has the value refresh. find returns the first matching element (or None if there is none). Take the value of its 'content' attribute, then parse it for the URL.

Thus, you get:



>>> from bs4 import BeautifulSoup
>>> fragment = """<head><meta http-equiv="refresh" content="5;url=http://google.com"/></head>"""
>>> soup = BeautifulSoup(fragment, 'html.parser')
>>> element = soup.find('meta', attrs={'http-equiv': 'refresh'})
>>> element
<meta content="5;url=http://google.com" http-equiv="refresh"/>

>>> refresh_content = element['content']
>>> refresh_content
'5;url=http://google.com'

>>> url = refresh_content.partition('=')[2]
>>> url
'http://google.com'
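
The partition('=') step works for the exact value shown, but real pages vary in spacing, in the capitalisation of url, and in quoting. As a rough sketch only, the content value could be parsed a little more defensively with the standard library. The helper name parse_refresh_content and its base_url parameter are invented here for illustration and are not part of BeautifulSoup:

```python
import re
from urllib.parse import urljoin

# "<delay>" optionally followed by ";" or "," and "url=<target>".
_REFRESH_RE = re.compile(
    r"^\s*(\d+(?:\.\d*)?)\s*(?:[;,]\s*(?:url\s*=\s*)?(.*?))?\s*$",
    re.IGNORECASE,
)


def parse_refresh_content(content, base_url=None):
    """Return (delay_seconds, url_or_None) for a meta refresh content value.

    Hypothetical helper: `content` is the string taken from element['content'].
    """
    match = _REFRESH_RE.match(content)
    if match is None:
        raise ValueError("not a recognisable refresh value: %r" % content)

    delay = float(match.group(1))
    url = match.group(2)
    if not url:
        # A bare delay such as "30" just reloads the current page.
        return delay, None

    # Some pages wrap the target in single or double quotes.
    url = url.strip().strip("'\"")
    if base_url is not None:
        # Resolve relative targets against the page the tag came from.
        url = urljoin(base_url, url)
    return delay, url


print(parse_refresh_content("5;url=http://google.com"))
# (5.0, 'http://google.com')
```

The returned URL can then be fetched with any HTTP client and the response passed to BeautifulSoup in the usual way to get the content of the target page.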

      
