PHP SimpleXML changes line break characters in CDATA elements

I am using PHP version 5.3.9. I am having a problem with SimpleXML because it changes the line endings in CDATA sections when parsing XML files.

For example:

$string = "<value><![CDATA[hello\r\nworld]]></value>";

$xml = simplexml_load_string($string);
var_dump((string)$xml);

var_dump("hello\r\nworld");

      

Outputs:

string(11) "hello
world"
string(12) "hello
world"

      

Even without looking at the hex values, you can see from the lengths that SimpleXML changed the line ending in the parsed version from the Windows-style \r\n to \n. This is a problem because I would like to store serialize()d objects in my XML file, but serialize() records the exact length of every string it serializes, including newlines. When I try to unserialize() the strings after reading them back from the XML, the recorded length is no longer correct because of the changed line ending, and the data cannot be unserialized. I could work around this by sanitizing every input string to replace "\r\n" with "\n", but that doesn't seem like something I should have to do.
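
To make the failure concrete, here is a minimal sketch of the round trip and of the sanitizing workaround mentioned above. The helper names normalize_line_endings() and cdata_round_trip() are hypothetical, made up for this illustration. The sketch assumes the parser only rewrites \r\n and lone \r, and that the serialized payload never contains the sequence ]]>, which would end the CDATA section early.

```php
<?php
// Hypothetical helper (not part of SimpleXML): collapse \r\n and lone \r to \n
// up front, so the text no longer changes when it passes through the parser.
function normalize_line_endings($text)
{
    return str_replace(array("\r\n", "\r"), "\n", $text);
}

// Hypothetical helper: wrap a payload in a CDATA section, parse it, read it back.
// Assumes the payload does not contain "]]>".
function cdata_round_trip($payload)
{
    $xml = simplexml_load_string('<value><![CDATA[' . $payload . ']]></value>');
    return (string) $xml;
}

$original = "hello\r\nworld";

// 1) Unmodified: serialize() records a length of 12, but only 11 bytes come back,
//    so unserialize() fails (the @ hides the notice it raises).
$broken = @unserialize(cdata_round_trip(serialize($original)));
var_dump($broken); // bool(false)

// 2) Workaround: normalize first, so the recorded length (11) matches what is read back.
$fixed = unserialize(cdata_round_trip(serialize(normalize_line_endings($original))));
var_dump($fixed === "hello\nworld"); // bool(true)
```

The workaround keeps the data readable, but the original \r\n line endings are lost, which is why it feels like the wrong fix to me.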

I was under the impression that XML parsers should not process the contents of CDATA sections in any way. Am I misunderstanding how CDATA sections are specified, am I somehow using SimpleXML incorrectly, or is this a bug in SimpleXML?

+3




1 answer


I couldn't reproduce this.

Just note that you used double quotes.

In my version, with single quotes:

$string = '<value><![CDATA[hello\r\nworld]]></value>';

$xml = simplexml_load_string($string);
var_dump($xml->__toString());
var_dump((string)$xml);

$xml = new SimpleXMLElement($string);
var_dump($xml->__toString());

var_dump('hello\r\nworld');

      



Outputs:

string(14) "hello\r\nworld"
string(14) "hello\r\nworld"
string(14) "hello\r\nworld"
string(14) "hello\r\nworld"
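
For context, the length of 14 here (against 12 in the question) comes from PHP's quoting rules rather than from SimpleXML. The short sketch below only illustrates that difference; the variable names are made up for the example.

```php
<?php
// Double quotes: \r and \n are escape sequences, one byte each.
$doubleQuoted = "hello\r\nworld";
var_dump(strlen($doubleQuoted)); // int(12)

// Single quotes: backslash, r, backslash, n stay as four literal characters.
$singleQuoted = 'hello\r\nworld';
var_dump(strlen($singleQuoted)); // int(14)

// The single-quoted string contains no carriage return at all.
var_dump(strpos($singleQuoted, "\r")); // bool(false)
```

So the single-quoted version never contains a real line ending for the parser to change.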

      

But what is your expectation? What is the output supposed to be?

-2

