Parsing XML using PHP - including ampersands and other characters
I am trying to parse an XML file and one of the fields looks like this:
<link>http://foo.com/this-platform/scripts/click.php?var_a=a&var_b=b&varc=http%3A%2F%2Fwww.foo.com%2Fthis-section-here%2Fperf%2F229408%3Fvalue%3D0222%26some_variable%3Dmeee</link>
This seems to break the parser. I think it might be related to the & characters in the link?
My code is pretty simple:
<?php
$xml = simplexml_load_file("files/this.xml");
echo $xml->getName() . "<br />";
foreach ($xml->children() as $child) {
    echo $child->getName() . ": " . $child . "<br />";
}
?>
Any ideas how I can solve this?
Your XML feed is not valid XML: a literal & must be escaped as &amp;
This means you cannot use an XML parser on it as-is :-(
A possible "solution" (it feels wrong, but should work) would be to replace every "&" that is not part of an entity with "&amp;", to get a valid XML string before handing it to the XML parser.
In your case, given this:
$str = <<<STR
<xml>
<link>http://foo.com/this-platform/scripts/click.php?var_a=a&var_b=b&varc=http%3A%2F%2Fwww.foo.com%2Fthis-section-here%2Fperf%2F229408%3Fvalue%3D0222%26some_variable%3Dmeee</link>
</xml>
STR;
You can use a simple call to str_replace, like:
$str = str_replace('&', '&amp;', $str);
And then parse the string (now valid XML) that is in $str:
$xml = simplexml_load_string($str);
var_dump($xml);
In this case, it should work ...
But keep in mind that you must take care of existing entities: if the feed already contains an entity such as '&gt;', you must not turn it into '&amp;gt;'!
This means that such a simple call to str_replace is not the right solution: it will probably break stuff in many XML feeds!
It's up to you to figure out the correct way to do this replacement - perhaps with some kind of regex ...
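One way such a regex could look, as a rough sketch rather than a vetted solution: a negative lookahead that leaves alone any & that already starts a named or numeric entity and escapes the rest. The helper name escape_bare_ampersands and the shortened sample URL are made up for illustration. It does not special-case CDATA sections, and named entities that XML does not predefine (such as &nbsp;) will still be rejected by the parser.

```php
<?php
// Hypothetical helper: escape only the ampersands that do not already start an entity.
function escape_bare_ampersands(string $xml): string
{
    // Negative lookahead: skip "&name;", "&#123;" and "&#x1F;" style references.
    $pattern = '/&(?!(?:[A-Za-z_][A-Za-z0-9_.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/';
    return preg_replace($pattern, '&amp;', $xml);
}

// Sample feed mixing bare ampersands with entities that are already escaped.
$str = <<<STR
<xml>
<link>http://foo.com/click.php?var_a=a&var_b=b&amp;c=1&gt;</link>
</xml>
STR;

$fixed = escape_bare_ampersands($str);
$xml = simplexml_load_string($fixed);
echo $xml->link, "\n";
```

The bare & before var_b gets escaped, while the existing &amp; and &gt; are left untouched, so the string loads without the double-escaping problem described above.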
I think this will help you http://www.php.net/manual/en/simplexml.examples-errors.php#96218
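That manual page is about handling XML errors with libxml. The linked comment is not reproduced here; the following is only a minimal sketch of the general approach the page describes, using a shortened, made-up sample string. It collects the parser errors instead of letting them surface as PHP warnings, which at least shows where the feed is broken.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Sample input with an unescaped ampersand (invalid XML).
$bad = '<xml><link>http://foo.com/click.php?var_a=a&var_b=b</link></xml>';
$xml = simplexml_load_string($bad);

$errors = [];
if ($xml === false) {
    $errors = libxml_get_errors();
    foreach ($errors as $error) {
        // Each entry is a LibXMLError carrying line, column and message.
        echo "Line {$error->line}, column {$error->column}: ", trim($error->message), "\n";
    }
    libxml_clear_errors();
}
```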