Screen Scraping Tips: Interactive Graph

I recently read some tutorials on how to use BeautifulSoup with Python and learned how to scrape text and URLs from web pages. Now I am trying to scrape the data from the following link:

http://www.study.cam.ac.uk/undergraduate/apply/statistics/

There is a graph generator built into the bottom of the page, and I would like to scrape all the data from it without spending a lot of time copying the values from every possible graph by hand. I tried using my meager newbie skills, but it is not obvious to me where in the HTML the graph data appears; furthermore, the HTML seems to change dynamically depending on where my mouse is on the screen.

The question is: is it possible to scrape this data with these tools, and if so, how?

1 answer


Using the browser developer tools, you can see that when the Show Graph button is clicked, a POST request is made to http://www.study.cam.ac.uk/undergraduate/apply/statistics/data.php . The result is a JSON object containing all the data needed to plot the graph.

Simulate this request in Python, for example with the requests module:



import requests

URL = "http://www.study.cam.ac.uk/undergraduate/apply/statistics/data.php"
HEADERS = {'X-Requested-With': 'XMLHttpRequest'}

data = {
    'when': 'year',
    'year': 2014,
    'applications': 'on',
    'offers': 'on',
    'acceptances': 'on',
    'groupby': 'college',
    'for-5-years-what': 'university'
}

# Send the same form fields the page submits and decode the JSON reply
response = requests.post(URL, data=data, headers=HEADERS)
print(response.json())

There is no need for BeautifulSoup here, at least as far as I understand your question.
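Since the question asks for the data behind every possible graph, the same request could be repeated in a loop. The following is only a sketch under assumptions: the helper names build_payload and fetch_all are hypothetical, and apart from the year 2014 and the 'college' grouping shown above, which years and groupby values the endpoint accepts is not confirmed here, so check the form on the page for the real options.

```python
URL = "http://www.study.cam.ac.uk/undergraduate/apply/statistics/data.php"
HEADERS = {"X-Requested-With": "XMLHttpRequest"}


def build_payload(year, groupby="college"):
    """Return the form fields for one graph, mirroring the request above."""
    return {
        "when": "year",
        "year": year,
        "applications": "on",
        "offers": "on",
        "acceptances": "on",
        "groupby": groupby,
        "for-5-years-what": "university",
    }


def fetch_all(years, groupings, post=None):
    """Fetch one JSON result per (year, groupby) pair.

    `post` defaults to requests.post; passing another callable with the
    same signature makes it possible to try the loop without network access.
    """
    if post is None:
        import requests  # imported lazily so the helpers work offline
        post = requests.post

    results = {}
    for year in years:
        for groupby in groupings:
            response = post(
                URL,
                data=build_payload(year, groupby),
                headers=HEADERS,
                timeout=30,
            )
            response.raise_for_status()  # stop on HTTP errors
            results[(year, groupby)] = response.json()
    return results
```

Calling fetch_all([2014], ["college"]) should reproduce the single request above and return a dict keyed by (year, groupby); any other years or groupings passed in are guesses until verified against the page.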
