How to use BeautifulSoup to follow an HTML meta refresh redirect?
I am looking at a web page with the following head. How can I get the content of the google.com page using bs4?
<head>
<meta http-equiv="refresh" content="5;url=http://google.com"/>
</head>
Thanks!
user3388884
1 answer
Use find with the tag name meta, and pass attrs to require that the http-equiv attribute has the value refresh. find returns the first matching element (or None if there is none); take the value of its content attribute, then parse it for the URL.
Thus, you get:
>>> from bs4 import BeautifulSoup
>>> fragment = """<head><meta http-equiv="refresh" content="5;url=http://google.com"/></head>"""
>>> soup = BeautifulSoup(fragment, 'html.parser')
>>> element = soup.find('meta', attrs={'http-equiv': 'refresh'})
>>> element
<meta content="5;url=http://google.com" http-equiv="refresh"/>
>>> refresh_content = element['content']
>>> refresh_content
u'5;url=http://google.com'
>>> url = refresh_content.partition('=')[2]
>>> url
u'http://google.com'
Antti haapala