Dom4j vs JAXB for reading and updating large and complex XML files
I have an XML file with a stable tree structure and over 5000 elements.
Part of it is below:
<Companies>
  <Offices>
    <RevenueInfo>
      <TransactionId>14042015014606877</TransactionId>
      <Company>
        <Identification>
          <GlobalId>25142400905</GlobalId>
          <BranchId>373287734</BranchId>
          <GeoId>874</GeoId>
          <LastUpdated>2015-04-14T01:46:06.940</LastUpdated>
          <RecordType>7785</RecordType>
        </Identification>
        <Info>
          <DataEntry>
            <EntryId>12345</EntryId>
          </DataEntry>
          <DataEntry>
            <EntryId>34567</EntryId>
          </DataEntry>
          <DataEntry>
            <EntryId>89076</EntryId>
          </DataEntry>
          <DataEntry>
            <EntryId>13211</EntryId>
          </DataEntry>
        </Info>
        ...more elements
      </Company>
    </RevenueInfo>
  </Offices>
</Companies>
I need to be able to update any value in the document based on user input and create a new XML file with the updated information. The user will pass in the BranchId, the name of the element being updated, and its ordinal position if the element occurs more than once (for example, for EntryId 12345, the user will pass 373287734 EntryId=1 010101).
I've looked at JAXB. Creating model classes for this kind of XML seems like a lot of work, but it looks like it would make writing to a file and locating the element to update much easier.
Dom4j also seems well regarded, but I'm not sure what the parsing would look like.
My question is, is JAXB the best approach in this case, or can you suggest a better way to parse this type of XML?
In my experience JAXB only works well when the schema is simple and stable. In other cases, you are better off using a generic tree model. The main generic models in the Java world are DOM, JDOM2, DOM4J, XOM, AXIOM. JDOM2 and XOM are my preferences; DOM4J strikes me as overly complicated and somewhat old-fashioned. But that depends on what you are looking for.
But then the application you are describing looks like a perfect candidate for end-to-end XML or XRX - XForms, XSLT, XQuery, XProc. You don't need Java at all.
If performance and memory are not hard constraints, I would recommend trying XPath together with DOM4J (or JDOM, or even plain DOM). To select a company, you can use an XPath expression like this:
"//Company[Identification/BranchId = '373287734']"
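Then, using the returned company element as the context node, you can get the element to update with a second, relative XPath expression. Note the leading dot and the parentheses: a bare //EntryId would search from the document root again, and //EntryId[position() = 1] would match every EntryId that is the first EntryId child of its parent, which in this document is all of them.
"(.//EntryId)[1]"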
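For illustration, here is a minimal sketch of this two-step lookup using only the JDK's built-in DOM and javax.xml.xpath APIs (plain DOM rather than DOM4J, to avoid an extra dependency). The class name XmlUpdater and the method updateValue are hypothetical names chosen for this example, and it works on strings to stay self-contained. The second expression is written as (.//name)[n] so that it is evaluated relative to the company element and picks the n-th match in document order.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of any library.
public class XmlUpdater {

    /**
     * Replaces the text of the index-th (1-based) element called elementName
     * inside the Company whose BranchId equals branchId, and returns the
     * updated document as a string.
     */
    public static String updateValue(String xml, String branchId, String elementName,
                                     int index, String newValue) throws Exception {
        // The inputs are concatenated into XPath expressions below, so restrict
        // them to digits and simple element names (an assumption of this sketch).
        if (!branchId.matches("\\d+") || !elementName.matches("[A-Za-z_][A-Za-z0-9_.-]*")) {
            throw new IllegalArgumentException("Unsupported branch id or element name");
        }

        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Step 1: select the company by its BranchId.
        Node company = (Node) xpath.evaluate(
                "//Company[Identification/BranchId = '" + branchId + "']",
                doc, XPathConstants.NODE);
        if (company == null) {
            throw new IllegalArgumentException("No company with BranchId " + branchId);
        }

        // Step 2: select the n-th matching element relative to that company.
        Node target = (Node) xpath.evaluate(
                "(.//" + elementName + ")[" + index + "]",
                company, XPathConstants.NODE);
        if (target == null) {
            throw new IllegalArgumentException(
                    "No " + elementName + " at position " + index);
        }
        target.setTextContent(newValue);

        // Serialize the modified tree with an identity transform.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

With the sample input from the question, the call would be something like XmlUpdater.updateValue(xml, "373287734", "EntryId", 1, "010101"). To work with files instead of strings, DocumentBuilder.parse(File) and new StreamResult(File) can be substituted for the string-based reader and writer.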