Web page source not available with urllib2.urlopen()
I am trying to get video links from 'https://www.youtube.com/trendsdashboard#loc0=ind'. When I inspect the elements in the browser, I can see the HTML for each video. But the source returned by
urllib2.urlopen("https://www.youtube.com/trendsdashboard#loc0=ind").read()
does not contain the HTML for the videos. Is there another way to do this?
<a href="/watch?v=dCdvyFkctOo" alt="Flipkart Wish Chain">
<img src="//i.ytimg.com/vi/dCdvyFkctOo/hqdefault.jpg" alt="Flipkart Wish Chain">
</a>
This markup appears when I inspect the elements in the browser, but not in the source obtained with urllib2.
Works for me:
import urllib2
url = 'https://www.youtube.com/trendsdashboard#loc0=ind'
html = urllib2.urlopen(url).read()
IMO I would use requests instead of urllib2; it's a little easier to use:
import requests
url = 'https://www.youtube.com/trendsdashboard#loc0=ind'
response = requests.get(url)
html = response.content
Edit
This will give you a list of all the <a> tags that have an href attribute, as per your edit. I am using the BeautifulSoup HTML parsing library:
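from bs4 import BeautifulSoup
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(html, 'html.parser')
# href=True keeps only the <a> tags that have an href attribute
links = soup.find_all('a', href=True)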
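If BeautifulSoup is not installed, roughly the same extraction can be sketched with the standard library alone. This is a Python 3 sketch and not part of the original answer; the class name WatchLinkParser and the SAMPLE_HTML string (based on the snippet in the question, with two extra tags added for contrast) are illustrative assumptions. Like the BeautifulSoup version, it can only find links that are actually present in the HTML string it is given.

```python
from html.parser import HTMLParser


class WatchLinkParser(HTMLParser):
    """Collect href values of <a> tags that point at /watch URLs."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.startswith("/watch"):
            self.links.append(href)


# Hypothetical input: the question's snippet plus two tags that should be skipped
SAMPLE_HTML = """
<a href="/watch?v=dCdvyFkctOo" alt="Flipkart Wish Chain">
<img src="//i.ytimg.com/vi/dCdvyFkctOo/hqdefault.jpg" alt="Flipkart Wish Chain">
</a>
<a href="/about">About</a>
<a name="no-href">anchor without href</a>
"""

parser = WatchLinkParser()
parser.feed(SAMPLE_HTML)
watch_links = parser.links
print(watch_links)
```

Filtering on the /watch prefix is a design choice for this page; drop the startswith check to collect every href.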
To view the source, you need to call the read method.
If you just call urlopen, you get back a response object like this:
In [12]: urllib2.urlopen('https://www.youtube.com/trendsdashboard#loc0=ind')
Out[12]: <addinfourl at 3054207052L whose fp = <socket._fileobject object at 0xb60a6f2c>>
To see the source, call read:
urllib2.urlopen('https://www.youtube.com/trendsdashboard#loc0=ind').read()
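The difference between the response object and its contents can be tried without touching the network. The sketch below is Python 3 (where urllib2.urlopen became urllib.request.urlopen) and, as an assumption made purely for illustration, reads a data: URL built from a made-up SAMPLE_HTML string instead of the YouTube page.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical stand-in for a real page, so the example runs offline
SAMPLE_HTML = '<a href="/watch?v=dCdvyFkctOo">Flipkart Wish Chain</a>'
data_url = "data:text/html," + quote(SAMPLE_HTML)

with urlopen(data_url) as response:
    # urlopen returns a file-like response object, not the page text
    response_type = type(response).__name__
    # read() returns the body as bytes
    body = response.read()

# Decode the bytes to get a str that can be searched or parsed
html = body.decode("utf-8")
print(response_type)
print(html)
```

In Python 3, read() returns bytes, so decode the result before treating it as text.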
Whenever you compare the source your Python code receives with what the browser shows, don't use Inspect Element. Right-click the web page and choose View Source instead; that shows the actual source. Inspect Element displays the combined result of all the network requests the page made, plus whatever the page's JavaScript has generated.
Open the developer console before loading the web page, stay on the Network tab, and make sure Preserve log is enabled in Chrome (or Persist in Firebug for Firefox). You will then see every network request the page makes.
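To make the difference concrete, here is a small Python 3 sketch that compares two hand-written strings: one standing in for what View Source (or urllib2) returns, the other for what Inspect Element shows after scripts have run. Both strings, and the loadTrendingVideos name inside them, are invented for illustration; they are not the real YouTube markup.

```python
import re

# Hypothetical static source: an empty container plus a script that fills it later
STATIC_SOURCE = """
<div id="videos"></div>
<script>loadTrendingVideos("ind");</script>
"""

# Hypothetical DOM after the script has run, as Inspect Element would show it
RENDERED_DOM = """
<div id="videos">
<a href="/watch?v=dCdvyFkctOo" alt="Flipkart Wish Chain">Flipkart Wish Chain</a>
</div>
"""

WATCH_LINK = re.compile(r'href="(/watch\?v=[^"]+)"')


def find_watch_links(html):
    """Return every /watch?v=... href found in the given HTML string."""
    return WATCH_LINK.findall(html)


static_links = find_watch_links(STATIC_SOURCE)
rendered_links = find_watch_links(RENDERED_DOM)
print(len(static_links), len(rendered_links))
```

If the same check on the real page source comes back empty, the links are probably loaded by a later request, and the Network tab is the place to look for it.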