Parsing multiple XML snippets with StAX

Question

I was hoping the following would be parseable with StAX:

<something a="b"/>
<something a="b"/>

But it chokes when it reaches the second element, since there is no common root element. (I'm not too sure why a streaming parser cares about this particular issue ... anyway ...)

I can fake a root element, for example with Guava:

    InputSupplier<Reader> join = CharStreams.join(
            newReaderSupplier("<root>"),
            newReaderSupplier(new File("...")),
            newReaderSupplier("</root>"));

    XMLInputFactory xif = XMLInputFactory.newInstance();
    XMLStreamReader xsr = xif.createXMLStreamReader(join.getInput());
    xsr.nextTag();  // Skip the fake root
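
The same wrapping can be sketched with JDK classes only, which may help if Guava is not available. This is a minimal illustration, not a definitive implementation: the class name `FragmentReader` and the helpers `wrapInRoot` and `readAttributes` are hypothetical, and it assumes the input is UTF-8, has no XML declaration or byte order mark, and uses the `a` attribute from the example above.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, for illustration only.
class FragmentReader {

    // Surrounds a stream of sibling elements with a synthetic <root> element.
    // Assumption: the fragment bytes are UTF-8 and carry no XML declaration.
    static InputStream wrapInRoot(InputStream fragments) {
        List<InputStream> parts = Arrays.asList(
                new ByteArrayInputStream("<root>".getBytes(StandardCharsets.UTF_8)),
                fragments,
                new ByteArrayInputStream("</root>".getBytes(StandardCharsets.UTF_8)));
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    // Collects the "a" attribute of every element directly below the fake root.
    // An element without that attribute contributes a null entry.
    static List<String> readAttributes(InputStream fragments) {
        List<String> values = new ArrayList<>();
        try {
            XMLInputFactory xif = XMLInputFactory.newInstance();
            XMLStreamReader xsr = xif.createXMLStreamReader(wrapInRoot(fragments));
            try {
                int depth = 0; // 0 = outside the fake root, 1 = inside it
                while (xsr.hasNext()) {
                    int event = xsr.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        if (depth == 1) {
                            values.add(xsr.getAttributeValue(null, "a"));
                        }
                        depth++;
                    } else if (event == XMLStreamConstants.END_ELEMENT) {
                        depth--;
                    }
                }
            } finally {
                xsr.close();
            }
        } catch (XMLStreamException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException("Input is not a sequence of well-formed elements", e);
        }
        return values;
    }
}
```

Calling `FragmentReader.readAttributes(new FileInputStream("..."))` would then return one entry per top-level element. If the file starts with an XML declaration, the wrapper pushes it away from the start of the stream and the parser should reject it, so that case would need the declaration stripped first.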

So my question is: is there a way to avoid this hack? Some kind of "fragment" mode that I can turn on in the parser?

+3

java xml xml-parsing stax

Iain 27 Mar At 11:20 pm

3 answers

Nope. The StAX API does not support fragments. An XMLStreamReader is suitable for a single XML document only. However, your "hack" is not so bad ...

+2

chris Mar 30 12 at 8:14

According to the XML specification, an XML document must have a single root element, or else it is not well-formed. So your so-called hack is not a hack at all; it is the best way to fix the document.

+1

vtd-xml-author Apr 18 16 at 23:02

Iain · Accepted Answer · 2012-04-04T23:51:07+0000

Woodstox's StAX implementation seems to support this: http://woodstox.codehaus.org/3.2.9/javadoc/com/ctc/wstx/api/WstxInputProperties.html#P_INPUT_PARSING_MODE

Anyway, we already use Woodstox in some places, but I didn't think to Google for Woodstox-specific options!
