How to create ENTITY declarations in a DOCTYPE using Perl / XML::LibXML
I am trying to create the following DOCTYPE, which contains entity declarations in its internal subset:
<!DOCTYPE LinkSet PUBLIC "-//NLM//DTD LinkOut 1.0//EN" "https://www.ncbi.nlm.nih.gov/projects/linkout/doc/LinkOut.dtd"
[ <!ENTITY icon.url "https://example.com/icon.png">
<!ENTITY base.url "https://example.com/content/" > ]>
I can successfully create a DOCTYPE without entity references:
#!/usr/bin/perl -w
use strict;
use XML::LibXML;
my $doc = XML::LibXML::Document->new('1.0','UTF-8');
my $dtd = $doc->createInternalSubset( "LinkSet", "-//NLM//DTD LinkOut 1.0//EN", "https://www.ncbi.nlm.nih.gov/projects/linkout/doc/LinkOut.dtd" );
my $ls = $doc->createElement( "LinkSet" );
$doc->setDocumentElement($ls);
print $doc->toString;
exit;
Results in:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE LinkSet PUBLIC "-//NLM//DTD LinkOut 1.0//EN" "https://www.ncbi.nlm.nih.gov/projects/linkout/doc/LinkOut.dtd">
<LinkSet/>
The XML::LibXML documentation shows how to add an entity reference to a document, but not how to declare an entity in the DOCTYPE.
A similar (but PHP-based) question suggests writing the ENTITY declarations as a string and parsing that. Is this the best approach in Perl too?
The documentation for XML::LibXML::Document says this:
[The Document class] inherits all functionality from XML::LibXML::Node as specified in the DOM Specification. This provides access to nodes other than the document-level root element, for example a DTD. Support for these nodes is limited at this time.
It also turns out that the source of these restrictions is libxml2, not the Perl module. This makes sense, because DTD syntax is completely different from XML syntax (and even from XML processing instructions), although the two look superficially similar.
The only way appears to be to parse a basic document that already contains the required DTD, and then work with the resulting document object.
Like this:
use strict;
use warnings 'all';
use XML::LibXML;
my $doc = XML::LibXML->load_xml(string => <<__END_XML__);
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE LinkSet PUBLIC "-//NLM//DTD LinkOut 1.0//EN" "https://www.ncbi.nlm.nih.gov/projects/linkout/doc/LinkOut.dtd"
[
<!ENTITY icon.url "https://example.com/icon.png">
<!ENTITY base.url "https://example.com/content/">
]>
<LinkSet/>
__END_XML__
print $doc;
Output
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE LinkSet PUBLIC "-//NLM//DTD LinkOut 1.0//EN" "https://www.ncbi.nlm.nih.gov/projects/linkout/doc/LinkOut.dtd" [
<!ENTITY icon.url "https://example.com/icon.png">
<!ENTITY base.url "https://example.com/content/">
]>
<LinkSet/>
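If the entity names and values are not known in advance, the same idea can be extended by generating the internal subset from a hash before parsing. The following is only a minimal sketch: the %entities hash, the variable names, and the appended Link element are illustrative assumptions, and the load_ext_dtd => 0 option is my own addition so that the parser does not try to fetch the external DTD. It also assumes the entity values contain no ", % or & characters, which would need escaping.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical entity table; names and URLs are the ones from the question.
my %entities = (
    'icon.url' => 'https://example.com/icon.png',
    'base.url' => 'https://example.com/content/',
);

# One <!ENTITY ...> declaration per entry, in a stable (sorted) order.
# Assumption: values contain no '"', '%' or '&', so no escaping is done.
my $decls = join '',
    map { qq{<!ENTITY $_ "$entities{$_}">\n} } sort keys %entities;

# Skeleton document: DOCTYPE with internal subset plus an empty root element.
my $xml = <<"END_XML";
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE LinkSet PUBLIC "-//NLM//DTD LinkOut 1.0//EN" "https://www.ncbi.nlm.nih.gov/projects/linkout/doc/LinkOut.dtd" [
${decls}]>
<LinkSet/>
END_XML

# load_ext_dtd => 0 skips loading the external DTD; the internal subset
# (our entity declarations) is still parsed and kept in the document.
my $doc = XML::LibXML->load_xml(string => $xml, load_ext_dtd => 0);

# From here on the usual DOM methods apply, e.g. adding child elements.
my $root = $doc->documentElement;
$root->appendChild( $doc->createElement('Link') );

print $doc->toString;
```

The DOCTYPE is fixed at parse time in this approach, so any entity that the document will reference later has to be in the hash before load_xml is called.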