Parser combinators for trees instead of strings

Suppose I need a parser to handle tree input (for example scala.xml.Elem

) instead of strings. I would like to use the parser combinators from this article . If I linearize the input tree, I can write a parser like this easily.

type Parser[A] = seq: Seq[Elem] => List[(A,Seq[Elem])] 

      

I can add parsers return

, failure

, item

etc. and finally write my parsers on top of them.

Now I am wondering if I can make a parser without linearizing the input tree. Is it possible?

+3


source to share


1 answer


Great question. It is absolutely possible to do this, and I have been looking for a tool that does this for a while.

I think the key is that your tree structure will be reflected in your primitive combinators. For example, a primitive parser is item

bound to a container type []

and provides the ability to sequentially traverse a list by first / rest. return

and failure

are independent of the container type []

, so they don't need to be changed to support tree parsing.



You will need to replace it with one or more combinators that will allow you to traverse the tree. I would suggest that you want one combinator to allow you to navigate between siblings (i.e. children of the same parent node, at the same depth) and a second combinator so you can dive deeper into the tree.

What I'm not sure about is whether you need duplicate combinators to match patterns of sequence, interleaving, lookahead, etc. Having to implement each of these two options can get pretty frustrating.

+2


source







All Articles