Finding an XML parser

I have been tasked with finding an open source DOM XML parser. The parser must support at least XPath 1.0. Schema support is desirable, but not a deal breaker.

The files we will be parsing are small, so speed and memory consumption are not a big issue.

Any OO language will do (C++, C#, Java, etc.).

To clarify, the plan is to integrate the XML parser into the application much more tightly than can be done with an external parser. We are creating a responsive XML-based object model (change the XML, and the object model changes). For this we need to integrate the parser at a fairly low level. This leads to a level of elegance that needs to be experienced to be understood (thanks, Mr. Yoder). Some of that elegance disappears if we can't navigate this object model with XPath.

We have created a prototype that uses the operating system parser. It worked very well but suffers from complexity and performance issues. But hey, that was the prototype. Now I want to do the real thing, and I can write the parser from scratch. (I did this part and it was easy.) The XPath engine, however, is a whole different story. I'm pretty sure I won't be able to write one over a weekend.

0




5 answers


The always excellent Jaxen can be helpful here. It is a Java XPath implementation used by both JDOM and dom4j.

Because the common functionality for walking those two DOM implementations was factored out, you now have an XPath engine that can query any tree model. You only need to write what they call a navigator, which is relatively easy to do.

From the FAQ:



How do I support a different object model?

The only thing required is an implementation of the org.jaxen.Navigator interface. Not all of the interface is required, and a default implementation, in the form of org.jaxen.DefaultNavigator, is also provided.

Since many of the XPath axes can be defined in terms of each other (for example, the ancestor axis is merely the parent axis applied recursively), only a few low-level axis iterators are required to get started. Of course, you may implement them directly instead of relying on jaxen's composition ability.

I found it relatively quick to get started with.
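The FAQ's point about defining one axis in terms of another can be illustrated without Jaxen at all. The following is only a sketch under stated assumptions: `Node`, `parentOf`, `ancestorAxis` and `ancestorNames` are hypothetical names invented for this example and are not part of Jaxen's API. It shows the ancestor axis falling out of a single low-level "parent" primitive, which is the kind of composition the FAQ describes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class AxisSketch {
    /** Hypothetical minimal tree node; stands in for whatever object model you have. */
    static final class Node {
        final String name;
        final Node parent; // null for the root

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /** The one low-level primitive the object model has to supply. */
    static Node parentOf(Node node) {
        return node.parent;
    }

    /** Ancestor axis built purely by applying the parent primitive repeatedly. */
    static Iterator<Node> ancestorAxis(final Node context) {
        return new Iterator<Node>() {
            private Node upcoming = parentOf(context);

            @Override
            public boolean hasNext() {
                return upcoming != null;
            }

            @Override
            public Node next() {
                if (upcoming == null) {
                    throw new NoSuchElementException();
                }
                Node current = upcoming;
                upcoming = parentOf(current); // step one level further up
                return current;
            }
        };
    }

    /** Convenience helper: names of all ancestors, nearest first. */
    static List<String> ancestorNames(Node context) {
        List<String> names = new ArrayList<>();
        for (Iterator<Node> it = ancestorAxis(context); it.hasNext();) {
            names.add(it.next().name);
        }
        return names;
    }

    public static void main(String[] args) {
        Node library = new Node("library", null);
        Node shelf = new Node("shelf", library);
        Node book = new Node("book", shelf);
        System.out.println(ancestorNames(book)); // prints [shelf, library]
    }
}
```

A real navigator would expose more primitives (children, attributes, node tests), but the same idea applies: supply a few basic iterators and let the engine derive the rest.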

+1




To answer this question, I think you need to provide a little more context. Having said that, I have found that the new object model for XML in .NET 3.5 (XElement, etc.), which supports LINQ to XML, makes navigating XML much easier, and I really mean an order of magnitude easier and better than using the DOM.



0




If you allow C#, then won't the standard C# libraries be available? Are they not enough?

The same goes for Java? And it all started with C++. I don't understand what is missing.

Googling for "XPath XML parser" turns up lots of hits: CPAN, JDOM, J2SE, Cocoa, MSXML, etc.

Are you just starting your search here, or are the standard answers not enough?
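For Java, for instance, the JDK alone covers DOM parsing and XPath 1.0 through `javax.xml.parsers` and `javax.xml.xpath` (and `javax.xml.validation` handles W3C XML Schema). Here is a minimal sketch of what that looks like; the class name `JdkXPathSketch`, the helper `select`, and the sample document are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class JdkXPathSketch {
    /**
     * Parses the XML text into a DOM tree and returns the text content of
     * every node matched by the given XPath 1.0 expression.
     */
    static List<String> select(String xml, String expression) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            // Parsing and XPath errors are checked exceptions; rethrown unchecked to keep the sketch short.
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document, small like the files described in the question.
        String xml = "<library>"
                + "<book year=\"2001\"><title>First</title></book>"
                + "<book year=\"2008\"><title>Second</title></book>"
                + "</library>";
        System.out.println(select(xml, "/library/book[@year > 2005]/title")); // prints [Second]
    }
}
```

This uses the parser as an external component, so it may not give you the low-level integration you describe, but it is a cheap baseline to compare other options against.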

EDIT:

Your clarifications tell me you don't want to use a parser as-is; you want source code to start your own custom XPath module on top of your own XML parser. Is that right? And you don't care about the language because all you need is the design, not the code?

0




If you only need the design logic and not the code, you could explore Ruby's REXML library. It's OO, pretty good, and has full XPath support.

MRI has an implementation in C and Ruby. JRuby has a Java implementation.

0




Probably a long shot, but jQuery seems to support XPath syntax for referencing the DOM, and I think its source code is available.

0

