Parsing XML and Converting to PHP?
I have a custom XML schema for rendering a page: elements are placed on the page by evaluating the XML elements in the page source. This is currently implemented with the preg regex functions, most notably the excellent preg_replace_callback function, for example:
...
$s = preg_replace_callback("!<field>(.*?)</field>!", 'replace_field', $s);
...
function replace_field($groups) {
    // $fields is the lookup table of field values, assumed to be defined globally
    global $fields;
    return isset($fields[$groups[1]]) ? $fields[$groups[1]] : "";
}
Now this works pretty well ... as long as the XML elements are not nested. Once they are nested it gets a lot more complicated, for example if you have:
<field name="outer">
<field name="inner">
...
</field>
</field>
You will want to make sure you replace the innermost field first. Judicious use of greedy / non-greedy regex patterns can go some way towards handling these more complicated scenarios, but the clear message is that I am reaching the limits of what regex can reasonably do and really need to be parsing the XML.
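To make the nesting problem concrete, here is a minimal sketch. The input string is hypothetical and uses the attribute-free <field> form from the pattern above. The non-greedy group pairs the outer opening tag with the inner closing tag, so the callback would receive a fragment that is not a complete element.

```php
<?php
// Hypothetical nested input; not taken from the real schema.
$s = '<field>a<field>b</field>c</field>';

// Same pattern as in the question.
preg_match_all('!<field>(.*?)</field>!', $s, $matches);

// $matches[1] holds the captured group of every match.
// Here there is a single match, and its capture is 'a<field>b'.
var_dump($matches[1]);
```

The trailing 'c</field>' is left behind unmatched, which is why the innermost-first ordering becomes something you have to manage by hand.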
What I need is an XML transform package that:
allows me to conditionally evaluate or include the contained document tree based on a callback function (ideally similar to preg_replace_callback);
can handle nested elements of one or more types; and
handles attributes in a nice way (as an associative array, for example).
What can help me along the way?
PHP's XSLTProcessor class (ext/xsl - PHP 5 includes the XSL extension by default, and it can be enabled by adding the argument --with-xsl[=DIR] to your configure line) is pretty sophisticated and allows, among other things, the use of PHP functions inside your XSL document via XSLTProcessor::registerPHPFunctions().
The following example is shamelessly borrowed from the PHP manual page:
$xml = '<allusers>
<user>
<uid>bob</uid>
</user>
<user>
<uid>joe</uid>
</user>
</allusers>';
$xsl = '<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:php="http://php.net/xsl">
<xsl:output method="html" encoding="utf-8" indent="yes"/>
<xsl:template match="allusers">
<html><body>
<h2>Users</h2>
<table>
<xsl:for-each select="user">
<tr><td>
<xsl:value-of
select="php:function(\'ucfirst\',string(uid))"/>
</td></tr>
</xsl:for-each>
</table>
</body></html>
</xsl:template>
</xsl:stylesheet>';
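// loadXML() is an instance method; calling it statically fails on current PHP versions
$xmldoc = new DOMDocument();
$xmldoc->loadXML($xml);
$xsldoc = new DOMDocument();
$xsldoc->loadXML($xsl);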
$proc = new XSLTProcessor();
$proc->registerPHPFunctions();
$proc->importStyleSheet($xsldoc);
echo $proc->transformToXML($xmldoc);
You can use XSL for this - match the internal templates first.
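One way to read this advice, as a rough sketch: a single template for field that calls xsl:apply-templates on its children will recurse into nested fields to any depth, so the inner element is rendered as part of rendering the outer one. The sample document, the name attribute and the output format below are made up for illustration, and the code assumes ext/xsl is enabled.

```php
<?php
// Hypothetical input with one field nested inside another.
$xml = '<page><field name="outer">[<field name="inner"/>]</field></page>';

$xsl = '<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- One template handles every field, however deeply nested -->
  <xsl:template match="field">
    <xsl:value-of select="@name"/>
    <xsl:text>(</xsl:text>
    <xsl:apply-templates/>
    <xsl:text>)</xsl:text>
  </xsl:template>
</xsl:stylesheet>';

$xmldoc = new DOMDocument();
$xmldoc->loadXML($xml);
$xsldoc = new DOMDocument();
$xsldoc->loadXML($xsl);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsldoc);

// Each field is rendered as name(...), with nested fields rendered inside the parentheses.
$result = $proc->transformToXML($xmldoc);
echo $result;
```

For this input the text output should be outer([inner()]): the inner field has already been rendered by the time the outer template emits its closing parenthesis.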
Here's a good starting point for learning what you can do with XSL:
You can run the XSL transform on the server or on the client (using JS, ActiveX or others).
If you still dislike the XSL idea, take a look at the XML parsing built into PHP: search for the PHP SAX parser, a callback-based API for building your own custom parser, currently backed by libxml2.
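For a feel of the callback style, here is a small sketch using PHP's xml_parser_create() family. The sample document and the $events log are invented for the example. Note that the start handler receives the element's attributes as an associative array, and that nested elements simply show up as nested open/close calls.

```php
<?php
// Hypothetical input; the name attribute is only for illustration.
$xml = '<page><field name="outer"><field name="inner">x</field></field></page>';

$events = [];
$parser = xml_parser_create();

// By default element names are upper-cased; turn that off.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Called on every opening tag; $attrs is an associative array.
    function ($p, $name, $attrs) use (&$events) {
        $events[] = 'open ' . $name . (isset($attrs['name']) ? '[' . $attrs['name'] . ']' : '');
    },
    // Called on every closing tag.
    function ($p, $name) use (&$events) {
        $events[] = 'close ' . $name;
    }
);

// Third argument true: this is the final (and only) chunk of data.
$ok = xml_parse($parser, $xml, true);

print_r($events);
```

A real renderer would keep a stack in these handlers so that the closing callback of an outer field can see what its inner fields produced.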
Definitely not regular expressions. XML documents can be reformatted in ways that do not affect their content (in other words, changes that are invisible to XML processing libraries) but that do matter to regular expressions. This kind of code quickly becomes a maintenance nightmare.
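A small illustration of that point, with made-up inputs: the two documents below are equivalent as far as XML is concerned (whitespace before the closing > of a tag is allowed), yet the regex from the question only matches the first one, while a DOM lookup sees the same text in both. The helper name field_text is hypothetical.

```php
<?php
// Two hypothetical documents with identical XML content.
$a = '<page><field>title</field></page>';
$b = "<page><field\n>title</field ></page>";

$pattern = '!<field>(.*?)</field>!';
$regexA = preg_match($pattern, $a); // 1: matches
$regexB = preg_match($pattern, $b); // 0: the extra whitespace breaks the pattern

// Hypothetical helper: text of the first <field> element, read through DOM.
function field_text(string $xml): string {
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    return $doc->getElementsByTagName('field')->item(0)->textContent;
}

$domA = field_text($a); // 'title'
$domB = field_text($b); // 'title'
```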
Go for a parser instead (SAX, StAX, DOM, JDOM, dom4j, XOM, etc.).
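As a sketch of what the DOM route could look like in PHP for the nested field case: walk the tree recursively, render children first, then hand each field's attributes (as an associative array) and its already-rendered content to a callback. The function name render_node, the callback signature and the sample input are all assumptions for illustration, not part of any library.

```php
<?php
// Hypothetical helper: renders a node to a string, evaluating <field> elements
// innermost-first through $callback(array $attrs, string $inner).
function render_node(DOMNode $node, callable $callback): string {
    if ($node instanceof DOMText) {
        return $node->nodeValue;          // plain text (and CDATA) passes through
    }
    if (!$node instanceof DOMElement) {
        return '';                        // comments etc. are dropped
    }
    // Children are rendered before the element itself, so inner fields come first.
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= render_node($child, $callback);
    }
    if ($node->tagName !== 'field') {
        return $inner;                    // other elements: keep content, drop the tag
    }
    $attrs = [];
    foreach ($node->attributes as $attr) {
        $attrs[$attr->name] = $attr->value;
    }
    return $callback($attrs, $inner);
}

$doc = new DOMDocument();
$doc->loadXML('<page>Hello <field name="outer">[<field name="inner">x</field>]</field>!</page>');

$html = render_node($doc->documentElement, function (array $attrs, string $inner): string {
    // Returning '' here instead would drop the whole subtree.
    return strtoupper($attrs['name']) . '(' . $inner . ')';
});

echo $html; // Hello OUTER([INNER(x)])!
```

The callback plays the same role as the one passed to preg_replace_callback, except that nesting is handled by the recursion rather than by the pattern.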