
Parsing multiple XML snippets with StAX

I was hoping the following would be parseable in StAX:

<something a="b"/>
<something a="b"/>

      

But the parser chokes when it reaches the second element, since there is no common root element. (I'm not too sure why the parser cares about this particular issue ... anyway ...)

I can fake the root element, for example with Guava:

    // Concatenate "<root>", the file contents, and "</root>" into one Reader
    InputSupplier<Reader> join = CharStreams.join(
            newReaderSupplier("<root>"),
            newReaderSupplier(new File("...")),
            newReaderSupplier("</root>"));

    XMLInputFactory xif = XMLInputFactory.newInstance();
    XMLStreamReader xsr = xif.createXMLStreamReader(join.getInput());
    xsr.nextTag();  // Skip the fake root's start tag

      

So my question is: is there a way to avoid this hack? Some kind of "fragment" mode that I can turn on in the parser?

+3




3 answers


Woodstox's StAX implementation seems to support this through its P_INPUT_PARSING_MODE input property: http://woodstox.codehaus.org/3.2.9/javadoc/com/ctc/wstx/api/WstxInputProperties.html#P_INPUT_PARSING_MODE



Anyway, we already use Woodstox in some places, but it didn't occur to me to Google for Woodstox-specific options!

+1




Nope. The StAX API does not support fragments. An XMLStreamReader is suitable for a single XML document. However, your "hack" is not so bad ...
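For readers who would rather not pull in Guava, the fake-root approach from the question can probably be done with the JDK alone. The following is only a sketch: the class `FakeRootDemo` and its helper methods are hypothetical names, and it assumes the fragment stream is UTF-8 and does not begin with an XML declaration or a byte-order mark (either would end up after `<root>` and break parsing).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class FakeRootDemo {

    // Hypothetical helper: surrounds a rootless stream with <root> ... </root>.
    // Assumes UTF-8 content with no XML declaration or BOM at the start.
    static InputStream wrapInRoot(InputStream fragments) {
        List<InputStream> parts = Arrays.asList(
                new ByteArrayInputStream("<root>".getBytes(StandardCharsets.UTF_8)),
                fragments,
                new ByteArrayInputStream("</root>".getBytes(StandardCharsets.UTF_8)));
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    // Collects the "a" attribute of every <something> element in the fragments.
    static List<String> readAttributes(InputStream fragments) throws XMLStreamException {
        XMLStreamReader xsr = XMLInputFactory.newInstance()
                .createXMLStreamReader(wrapInRoot(fragments));
        List<String> values = new ArrayList<>();
        try {
            xsr.nextTag(); // skip the fake root's start tag
            while (xsr.hasNext()) {
                if (xsr.next() == XMLStreamConstants.START_ELEMENT
                        && "something".equals(xsr.getLocalName())) {
                    values.add(xsr.getAttributeValue(null, "a"));
                }
            }
        } finally {
            xsr.close();
        }
        return values;
    }

    public static void main(String[] args) throws XMLStreamException {
        String input = "<something a=\"b\"/>\n<something a=\"c\"/>\n";
        InputStream in = new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8));
        System.out.println(readAttributes(in)); // prints [b, c]
    }
}
```

For a real file, the `ByteArrayInputStream` in `main` would be swapped for a `FileInputStream`. `SequenceInputStream` reads its parts lazily, so the file is still streamed rather than loaded into memory.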



+2




According to the XML specification, an XML document must have a single root element, or else it is not well-formed. So your so-called hack is not a hack at all; it is the best way to fix the document.

+1








