How do I wrap the entire list of matches from HXT with a datatype constructor?

I am currently studying HXT

using it for GPX analysis . An example is here . I have the following:

import Data.Time
import Text.XML.HXT.Core

data Gpx    = Gpx [Trk]           deriving (Show)
data Trk    = Trk [TrkSeg]        deriving (Show)
data TrkSeg = TrkSeg [TrkPt]      deriving (Show)
data TrkPt  = TrkPt Double Double deriving (Show)

parseGpx =
  getChildren >>> isElem >>> hasName "gpx" >>>
  getChildren >>> isElem >>> hasName "trk" >>>
  parseGpxTrk >>> arr Gpx

parseGpxTrk = undefined
parseGpxTrkSegs = undefined

      

You can see that it is incomplete, but it should still be type-checking. Unfortunately, I already have the error:

Couldn't match type ‘Trk’ with ‘[Trk]’
Expected type: Trk -> Gpx
  Actual type: [Trk] -> Gpx
In the first argument of ‘arr’, namely ‘Gpx’
In the second argument of ‘(>>>)’, namely ‘arr Gpx’

      

This error says that I am trying to pass every matched item from the arrow parseGpxTrk

through the constructor arr Gpx

, but I really want to pass the entire list of matches using the constructor arr Gpx

.

So how do I get HXT

(or arrows in general?) To pass matches as a list through my constructor arr Gpx

instead of passing every entry in the list through the constructor arr Gpx

?

+3


source to share


1 answer


Here is a solution that seems very good to me

{-# LANGUAGE Arrows #-}

import Data.Maybe
import Text.Read
import Text.XML.HXT.Core
import Control.Applicative

data Gpx    = Gpx [Trk]           deriving (Show)
data Trk    = Trk [TrkSeg]        deriving (Show)
data TrkSeg = TrkSeg [TrkPt]      deriving (Show)
data TrkPt  = TrkPt Double Double deriving (Show)

      

The hardest option is probably parseTrkPt

because for it to work properly, you need to handle the parsing String

before Double

, which might fail. I made the decision to return it Maybe TrkPt

and then process it further down the line:

elemsNamed :: ArrowXml cat => String -> cat XmlTree XmlTree
elemsNamed name = isElem >>> hasName name

parseTrkPt :: ArrowXml cat => cat XmlTree (Maybe TrkPt)
parseTrkPt = elemsNamed "trkpt" >>>
    proc trkpt -> do
        lat <- getAttrValue "lat" -< trkpt
        lon <- getAttrValue "lon" -< trkpt
        returnA -< TrkPt <$> readMaybe lat <*> readMaybe lon

      

I also used the syntax proc

because I think it comes out much cleaner. TrkPt <$> readMaybe lat <*> readMaybe lon

is of type Maybe TrkPt

and will return Nothing

if any of the readMaybe

returns Nothing

. Now we can combine all the successful results:

parseTrkSeg :: (ArrowXml cat, ArrowList cat) => cat XmlTree TrkSeg
parseTrkSeg =
    elemsNamed "trkseg" >>>
    (getChildren >>> parseTrkPt >>. catMaybes) >. TrkSeg

      

The parentheses are important here, it took me a while to figure this part out. Depending on where you place the parens you will get different results, for example [TrkSeg [TrkPt a b], TrkSeg [TrkPt c d]]

instead of [TrkSeg [TrkPt a b, TrkPt c d]]

. Similar patterns just follow alongside parsers:



parseTrk :: ArrowXml cat => cat XmlTree Trk
parseTrk =
    elemsNamed "trk" >>>
    (getChildren >>> parseTrkSeg) >. Trk

parseGpx :: ArrowXml cat => cat XmlTree Gpx
parseGpx =
    elemsNamed "gpx" >>>
    (getChildren >>> parseTrk) >. Gpx

      

Then you can start it quite simply, although you still have to expand the root element:

main :: IO ()
main = do
    gpxs <- runX $ readDocument [withRemoveWS yes] "ana.gpx"
                >>> getChildren
                >>> parseGpx
    -- Pretty print the document
    forM_ gpxs $ \(Gpx trks) -> do
        putStrLn "GPX:"
        forM_ trks $ \(Trk segs) -> do
            putStrLn "\tTRK:"
            forM_ segs $ \(TrkSeg pts) -> do
                putStrLn "\t\tSEG:"
                forM_ pts $ \pt -> do
                    putStr "\t\t\t"
                    print pt

      

The trick is to use methods on a class ArrowList

, especially >.

one that has a type a b c -> ([c] -> d) -> a b d

. It concatenates the elements from ArrowList

, passes it to a function that converts it to a new type, then infers a new one ArrowList

to that new type d

.

If you want, you can even abstract this a bit for the last three parsers:

nestedListParser :: ArrowXml cat => String -> cat XmlTree a -> ([a] -> b) -> cat XmlTree b
nestedListParser name subparser constructor
    =   elemsNamed name
    >>> (getChildren >>> subparser)
    >.  constructor

parseTrkSeg :: (ArrowXml cat, ArrowList cat) => cat XmlTree TrkSeg
parseTrkSeg = nestedListParser "trkseg" (parseTrkPt >>. catMaybes) TrkSeg

parseTrk :: ArrowXml cat => cat XmlTree Trk
parseTrk = nestedListParser "trk" parseTrkSeg Trk

parseGpx :: ArrowXml cat => cat XmlTree Gpx
parseGpx = nestedListParser "gpx" parseTrk Gpx

      

This can come in handy if you want to complete the rest of the grammar of the GPX file.

+1


source







All Articles