Screen scraping technique using PHP

How do I screen scrape a specific website? I need to log in to the site and then scrape the internal information. How can I do that?

Please help me.

Duplicate: How do I implement a web scraper in PHP?



6 answers


Take a look at PHP's cURL functions: they let you fetch a page from another site. Depending on the site you are logging in to, you can use cookies or HTTP authentication to log in and then request the page you want.
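As a rough sketch of the cookie-based approach, something like the following could work. The function name `fetchAfterLogin`, the URLs, and the form field names are all placeholders I made up; the real ones depend on the login form of the site in question.

```php
<?php
// Hypothetical helper: submit a login form, then fetch a protected page
// with the session cookies the site handed back.
function fetchAfterLogin(string $loginUrl, array $loginFields, string $pageUrl): string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow the redirect many sites send after login
        CURLOPT_COOKIEFILE     => '',   // empty string turns on the in-memory cookie engine
    ]);

    // Step 1: POST the credentials. Field names must match the site's form.
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($loginFields));
    if (curl_exec($ch) === false) {
        throw new RuntimeException('Login request failed: ' . curl_error($ch));
    }

    // Step 2: GET the page you actually want, reusing the same handle
    // so the cookies from step 1 are sent along.
    curl_setopt($ch, CURLOPT_URL, $pageUrl);
    curl_setopt($ch, CURLOPT_HTTPGET, true);
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('Page request failed: ' . curl_error($ch));
    }

    return $html;
}

// Example call (placeholder values):
// $html = fetchAfterLogin(
//     'https://example.com/login',
//     ['username' => 'me', 'password' => 'secret'],
//     'https://example.com/members/report'
// );
```

For a site that uses HTTP authentication instead of a login form, setting `CURLOPT_USERPWD` to `'user:password'` on a single GET request would replace step 1.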



Once you have the page, you are probably best off using regular expressions to scrape out the data you want.
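For instance, a minimal sketch with `preg_match_all`, assuming the data sits in list items with a known class. The HTML below is made up to stand in for a fetched page, and a pattern like this only holds up while the markup stays this regular.

```php
<?php
// Made-up HTML standing in for the page you fetched.
$html = '<ul><li class="item"> Alpha </li><li class="item">Beta</li></ul>';

// Capture the text inside every <li class="item">...</li>.
// The "s" modifier lets "." match newlines; ".*?" keeps each match short.
preg_match_all('#<li class="item">(.*?)</li>#s', $html, $matches);

// $matches[1] holds the contents of the first capture group.
$items = array_map('trim', $matches[1]);

print_r($items);
```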



Zend_Http_Client and Zend_Dom_Query

      





You should look at cURL.



You can also take a look at BeautifulSoup, a Python library that is supposed to be very good at parsing badly formed HTML. It is aimed at tasks like screen scraping.

I don't know how easy it would be to call from PHP, though.



You can also check http://php.net/dom
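A small sketch of what that could look like with `DOMDocument` and `DOMXPath`. The HTML and the `content` id are invented for the example; adjust the XPath query to the page you are scraping.

```php
<?php
// Made-up HTML standing in for the page you fetched.
$html = '<html><body><div id="content">'
      . '<a href="/a">First</a><a href="/b"> Second </a>'
      . '</div></body></html>';

// Collect parser warnings internally instead of printing them;
// real-world pages are rarely valid markup.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Find every link inside the element with id="content".
$xpath = new DOMXPath($doc);
$links = [];
foreach ($xpath->query('//div[@id="content"]//a') as $a) {
    $links[$a->getAttribute('href')] = trim($a->textContent);
}

print_r($links);
```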



Use cURL, and once you are logged in, use the PHP QueryPath library (Querypath.org). You can access DOM elements with CSS selectors as in jQuery, there is method chaining ...

It is better than using only PHP's native functions.

It is also available as a Drupal extension, but I assume you can use it in any PHP project.







