How can I make an HTTP GET request from Perl?

I am trying to write my first Perl program. If you think Perl is a bad language for this task, tell me which language will do it better.

The program checks the connection between this computer and a remote Apache server. First, the program requests a directory listing from the Apache server, then it parses the listing and downloads all files one by one. If a problem occurs with a file (the connection is dropped before the specified Content-Length is reached), this should be logged and the next file should be fetched. There is no need to save the files or even check their integrity; I only need to log the time it takes to complete and every time the connection is dropped.

To get a list of links from the index of a directory created by Apache, I plan on using a regexp similar to

/href=\"([^\"]+)\"/

      

The regexp has not been debugged yet.

What is the "reference" way of making an HTTP request with Perl? I googled and found examples using many different libraries, some of them commercial. I need something that can detect failures (timeout or TCP reset) and handle them.

Another question: how do I save everything my regex matches in a global search as a list of strings with minimal coding effort?

+2




4 answers


As for the problem as a whole, I would use WWW::Mechanize. Mech is a subclass of LWP::UserAgent that adds stateful behavior and HTML parsing. With Mech you can just do $mech->get($url_of_index_page) and then use $mech->find_all_links(criteria) to select the links to follow.
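A minimal sketch of that approach, under some assumptions: the index URL comes from the command line, the 30-second timeout is arbitrary, and the helper names `file_links` and `is_truncated` are made up for illustration. It also assumes that a dropped connection shows up as a body shorter than the Content-Length header, which is worth verifying against your LWP version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Absolute URLs of the links on the current page. Links that start with
# "?" (Apache's column sorting) or "/" (parent directory) are skipped.
sub file_links {
    my ($mech) = @_;
    return map { $_->url_abs->as_string }
        $mech->find_all_links( tag => 'a', url_regex => qr{^[^?/]} );
}

# True when the response announced more bytes than were actually received.
sub is_truncated {
    my ($res) = @_;
    my $expected = $res->content_length;
    return defined $expected && length( $res->content ) < $expected;
}

# Only touch the network when an index URL is given on the command line.
if ( my $index_url = shift @ARGV ) {
    # autocheck => 0: failed requests do not die, so they can be logged.
    my $mech = WWW::Mechanize->new( autocheck => 0, timeout => 30 );
    $mech->get($index_url);
    die 'Index request failed: ', $mech->response->status_line, "\n"
        unless $mech->success;

    for my $url ( file_links($mech) ) {
        my $started = time;
        my $res     = $mech->get($url);
        printf "%s\t%s\t%ds%s\n", $url, $res->status_line, time - $started,
            is_truncated($res) ? "\tTRUNCATED" : '';
    }
}
```

Note that subdirectory links such as "subdir/" pass this filter too, so a real listing may need one more check before downloading.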



+10




You have many questions in one. The answer to the question in the title of your post is to use LWP::Simple.
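Roughly, a GET with LWP::Simple could look like the sketch below. The helper name `timed_get` and the 30-second timeout are illustrative assumptions; `$ua` is the LWP::UserAgent object that LWP::Simple exports on request.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(get $ua);

# $ua is the user agent LWP::Simple uses internally; set a timeout on it.
$ua->timeout(30);

# Fetch a URL and return (body, elapsed seconds).
# The body is undef when the request failed for any reason.
sub timed_get {
    my ($url) = @_;
    my $started = time;
    my $body    = get($url);
    return ( $body, time - $started );
}

# Only touch the network when a URL is given on the command line.
if ( my $url = shift @ARGV ) {
    my ( $body, $seconds ) = timed_get($url);
    if ( defined $body ) {
        printf "OK     %s (%d bytes, %ds)\n", $url, length $body, $seconds;
    }
    else {
        printf "FAILED %s after %ds\n", $url, $seconds;
    }
}
```

`get` only signals success or failure, so it cannot tell a timeout from a 404. If you need the status line or the headers, LWP::UserAgent returns a full response object instead.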



For most of your other questions, perlfaq9 has relevant pointers to more information.
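As for collecting every regex match as a list of strings: a match with the /g modifier in list context returns all captured groups. A small sketch, where the sample HTML string is made up:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample input standing in for a downloaded index page.
my $html = '<a href="a.txt">a</a> <a href="b.txt">b</a>';

# With /g in list context, the match returns every capture as a list.
my @links = $html =~ /href="([^"]+)"/g;

print "$_\n" for @links;
```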

+9




Regarding the "parsing markup with a regex" part of your question: DON'T!

http://htmlparsing.icenine.ca explains some of the reasons why you shouldn't. Even though what you are trying to parse seems simple, use a proper parser.

The page linked above no longer exists ...

http://www.cwhitener.com/htmlparsing
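One possible parser-based version, sketched with HTML::LinkExtor from the HTML::Parser distribution (one option among several). The helper name `extract_hrefs` and the sample markup are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;

# Return the href values of all <a> tags in an HTML string.
sub extract_hrefs {
    my ($html) = @_;
    my $parser = HTML::LinkExtor->new;    # no callback: links are collected
    $parser->parse($html);
    $parser->eof;

    my @hrefs;
    for my $link ( $parser->links ) {
        # Each entry is [ $tag, $attr_name => $url, ... ]
        my ( $tag, %attr ) = @$link;
        push @hrefs, $attr{href} if $tag eq 'a' && defined $attr{href};
    }
    return @hrefs;
}

# Made-up sample: note the single-quoted second href and the <img> tag.
my $sample = q{<a href="a.txt">a</a> <img src="i.png"> <a href='b.txt'>b</a>};
print "$_\n" for extract_hrefs($sample);
```

The single-quoted href in the sample is the kind of input the regex from the question would miss, while the parser handles both quoting styles.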

+4




As a more general answer, Perl is a fine language for making HTTP requests, as are many other languages. If you're familiar with Perl, feel free to use it; there are many great libraries available to do what you need.

+3

