Accessing href value with HTML::TreeBuilder::XPath
I am using LWP::UserAgent, HTML::Selector::XPath and HTML::TreeBuilder::XPath to get the value of the href attribute of the first YouTube video in a search result set.
My code so far:
use LWP::UserAgent;
use HTML::TreeBuilder::XPath;
use HTML::Selector::XPath;
my $ua = LWP::UserAgent->new;
#my $response =..
my $html = "http://www.youtube.com/results?search_query=run+flo+rida";
my $tree = HTML::TreeBuilder::XPath->new;
my $xpath = HTML::Selector::XPath::selector_to_xpath("(//*[@id = 'search-results']/li)[1]/div[2]/h3/a/@href/");
my @nodes = $tree->findnodes($xpath);
print" $nodes[0]";
I'm not sure whether my print statement is wrong or some other syntax is. At the moment it is printing
HTML::TreeBuilder::XPath=HASH(0x1a78250)
when I want it to print
/watch?v=JP68g3SYObU
Thanks for any help!
There are several problems here.
- You should always use strict and use warnings at the top of every Perl program. They catch a lot of errors that are easy to miss, and it is only polite to enable them before you ask for help with your code. In this case, they would have told you that your double-quoted XPath string contains the array variable names @id and @href, which you almost certainly did not mean to interpolate into the string.
- You are using HTML::Selector::XPath, which translates a CSS selector into an XPath expression. But you are already supplying an XPath expression, so the translation won't work and the module isn't needed.
- There is no need to use LWP at all, as HTML::TreeBuilder has a constructor, new_from_url, that will fetch the HTML page for you.
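The interpolation problem from the first point can be seen in isolation, without fetching anything. This is only a minimal sketch: the variable names $single and $double are made up for illustration, and the XPath is a shortened stand-in for the one in the question.

```perl
use strict;
use warnings;

# Single quotes: Perl does no interpolation, so @id and @href
# reach the XPath engine exactly as written.
my $single = '//*[@id="search-results"]//a/@href';

# Double quotes: each @ must be escaped, otherwise Perl looks for
# arrays named @id and @href (a compile-time error under strict).
my $double = "//*[\@id=\"search-results\"]//a/\@href";

# Both spellings build the same string.
print $single eq $double ? "same\n" : "different\n";
```

Single quotes are usually the simpler choice for XPath expressions, since they rarely need any variable interpolation.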
This program seems to do what you want. I also added the URI module to build an absolute URL from the relative value of the href attribute.
use strict;
use warnings;
use HTML::TreeBuilder::XPath;
use URI;
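my $url = "http://www.youtube.com/results?search_query=run+flo+rida";
# Fetch and parse the page in one step
my $tree = HTML::TreeBuilder::XPath->new_from_url($url);
# Single quotes stop Perl from interpolating @id, @class and @href
my $anchor = $tree->findnodes('//ol[@id="search-results"]//h3[@class="yt-lockup2-title"]/a/@href');
# Resolve the first matching href against the page URL
my $href = URI->new_abs($anchor->[0]->getValue, $url);
print "$href\n";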
Output
http://www.youtube.com/watch?v=JP68g3SYObU
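Because the live page markup can change at any time, it may help to try the same XPath against a fixed snippet first. The following is a sketch under assumptions: the HTML is a made-up fragment that only imitates the structure the XPath expects, the second video ID is a placeholder, and first_href is a hypothetical helper name.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;
use URI;

# Made-up markup that mimics the structure the XPath above relies on.
my $html = <<'HTML';
<ol id="search-results">
  <li><h3 class="yt-lockup2-title"><a href="/watch?v=JP68g3SYObU">First</a></h3></li>
  <li><h3 class="yt-lockup2-title"><a href="/watch?v=PLACEHOLDER">Second</a></h3></li>
</ol>
HTML

# Hypothetical helper: return the first matching href as an absolute
# URL string, or undef when nothing matches.
sub first_href {
    my ($content, $base) = @_;
    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse_content($content);

    # List context: findnodes returns the matching attribute nodes.
    my ($attr) = $tree->findnodes(
        '//ol[@id="search-results"]//h3[@class="yt-lockup2-title"]/a/@href');
    my $abs = defined $attr ? URI->new_abs($attr->getValue, $base)->as_string : undef;

    $tree->delete;    # release the parse tree
    return $abs;
}

my $base = 'http://www.youtube.com/results?search_query=run+flo+rida';
my $href = first_href($html, $base);
print defined $href ? "$href\n" : "no match\n";
```

Checking for an empty result before calling getValue avoids a "Can't call method on an undefined value" error if the page structure no longer matches the expression.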