Perl: parse an XML file fetched over HTTP when extra lines precede it

I am trying to write a script that collects information from an XML file on a remote server. The server requires authentication. It uses basic authentication and I was able to authenticate, but I cannot parse the data because of all the lines that come before the XML document. Is there a way to avoid getting these lines so that the XML parses correctly?

code

#! /usr/bin/perl

use LWP::UserAgent;
use HTTP::Request::Common;

use XML::Simple;

$ua = LWP::UserAgent->new;

$req = HTTP::Request->new(GET => 'https://192.168.1.10/getxml?/home/');
$ua->ssl_opts(SSL_verify_mode => SSL_VERIFY_NONE); #Used to ignore certificate
$req->authorization_basic('admin', 'test');
$test = $ua->request($req)->as_string;

print $test;
# create object
my $xml = new XML::Simple;

# read XML file
my $data = $xml->XMLin("$test");

# access XML data
print $data->{status}[0]{productID};

      

output

HTTP/1.1 200 OK
Connection: close
Date: Wed, 24 Sep 2014 01:12:20 GMT
Server: 
Content-Length: 252
Content-Type: text/xml; charset=UTF-8
Client-Date: Wed, 24 Sep 2014 01:11:59 GMT
Client-Peer: 192.168.1.10:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: XXXXXXXXXXXX
Client-SSL-Cert-Subject: XXXXXXXXXXXXX
Client-SSL-Cipher: XXXXXXXXXXXX
Client-SSL-Socket-Class: IO::Socket::SSL

<?xml version="1.0"?>
<Status>
  <SystemUnit item="1">
    <ProductId item="1">TEST SYSTEM</ProductId>
  </SystemUnit>
</Status>
:1: parser error : Start tag expected, '<' not found
HTTP/1.1 200 OK

      



2 answers


The call $test = $ua->request($req)->as_string; returns a string representation of the entire response (status line and headers plus content).

Change this to $test = $ua->request($req)->content; and only the content is returned, without the headers, so XML::Simple receives just the XML document.
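The difference between the two methods can be seen without a server. The following is a minimal sketch, assuming HTTP::Response (part of the libwww-perl family already used in the question) is installed; the response object is built by hand as a hypothetical stand-in for what $ua->request($req) would return, and the variable names are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical stand-in for the server's reply, built locally (no network).
my $body = <<'XML';
<?xml version="1.0"?>
<Status>
  <SystemUnit item="1">
    <ProductId item="1">TEST SYSTEM</ProductId>
  </SystemUnit>
</Status>
XML

my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/xml; charset=UTF-8' ],
    $body,
);
$res->protocol('HTTP/1.1');

my $full     = $res->as_string;   # status line + headers + blank line + body
my $xml_text = $res->content;     # body only, suitable for an XML parser

# Show the first line of each to make the difference visible.
print "as_string begins: ", (split /\n/, $full)[0],     "\n";
print "content begins:   ", (split /\n/, $xml_text)[0], "\n";
```

In a real script it is probably also worth checking $res->is_success before parsing, and decoded_content can be used instead of content when the body may be compressed or needs charset decoding.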



I would match the first < and take the rest of the data from there. This skips the leading lines that don't interest you. The code would look like this:

# Capture everything from the first '<' to the end of the string
$test =~ m/(<.*)/s or die "No XML found in response\n";
my $xmlData = $1;
my $data = $xml->XMLin($xmlData);
# ProductId has an attribute, so XML::Simple stores its text under {content}
print $data->{SystemUnit}{ProductId}{content}."\n";

      



Here the pattern matches the first < and everything after it; the s modifier makes . match newlines as well, so the whole string is treated as a single line. $1 is the data captured by the match, which I assigned to a variable in case you want to print it or inspect it in the debugger. I also changed the final print so that it outputs "TEST SYSTEM", the content of the ProductId tag.
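The capture step on its own can be tried with core Perl only. This is a small sketch, assuming the response string looks like the output in the question and that no header value contains a < character (otherwise the capture would start too early); the sample string and the name $xml_data are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample of what as_string returns: headers, blank line, XML.
my $test = join "\n",
    'HTTP/1.1 200 OK',
    'Connection: close',
    'Content-Type: text/xml; charset=UTF-8',
    '',
    '<?xml version="1.0"?>',
    '<Status>',
    '  <SystemUnit item="1">',
    '    <ProductId item="1">TEST SYSTEM</ProductId>',
    '  </SystemUnit>',
    '</Status>',
    '';

# /s lets '.' match newlines, so the capture runs from the first '<'
# to the very end of the string.
my $xml_data;
if ($test =~ m/(<.*)/s) {
    $xml_data = $1;
} else {
    die "No '<' found in response\n";
}

print $xml_data;
```

Checking that the match succeeded before reading $1 avoids silently reusing a capture left over from an earlier match.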
