Deserializing many network messages without using an ad-hoc parser implementation

Question

I have a question related to deserialization. I can imagine a solution using Data.Data, Data.Typeable, or using GHC.Generics, but I'm curious if this can be done without generics, SYB, or meta-programming.

Description of the problem:

Given a [String] that is known to contain the fields of a locally defined algebraic data type, I would like to deserialize it to construct the target data type. I could write a parser for this, but I'm looking for a generic solution that will deserialize to an arbitrary number of data types defined in the program, without writing a parser for each type. Knowing the number and types of a constructor's arguments, it is as simple as calling read on each string to get the values needed to build the type. However, I don't want to use generics, reflection, SYB, or meta-programming (unless it is otherwise impossible).

Let's say I have about 50 types like this (all simple algebraic types consisting of base primitives: no nested or recursive types, just different combinations and orderings of primitives):

data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double}

data NetworkMsg2 = NetworkMsg2 { field1 :: Double, field2 :: Int, field3 :: Double }

I can determine which data type a [String] received over the network belongs to by using the tag id that I parse before each [String].

Possible suggested solution:

Since data constructors are first-class values in Haskell and actually have function types, the NetworkMsg constructor can be thought of as a function, for example:

NetworkMsg :: Int -> Int -> Double -> NetworkMsg

Can I convert this function to a function on tuples using uncurryN and then copy the [String] into a tuple of the same shape that the function now takes as its argument?

networkMsg' :: (Int, Int, Double) -> NetworkMsg

I don't think it will work because I will need knowledge of value constructors and type information, which would require Data.Typeable, reflection or some other metaprogramming method.
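For a single fixed arity, the tuple idea can be sketched without reflection, because the field types are inferred from the constructor's signature. This is only an illustration: `uncurry3`, `readTriple` and `parseNetworkMsg` are hypothetical helpers (not library functions), and a separate helper would still be needed for each arity.

```haskell
import Text.Read (readMaybe)

data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double }
    deriving (Show, Eq)

-- Hypothetical helper: the three-argument analogue of Prelude's uncurry.
uncurry3 :: (a -> b -> c -> r) -> (a, b, c) -> r
uncurry3 f (a, b, c) = f a b c

-- Hypothetical helper: read exactly three strings into a typed triple.
-- Any other list length, or any unreadable field, yields Nothing.
readTriple :: (Read a, Read b, Read c) => [String] -> Maybe (a, b, c)
readTriple [x, y, z] = (,,) <$> readMaybe x <*> readMaybe y <*> readMaybe z
readTriple _ = Nothing

-- The constructor as a function on tuples.
networkMsg' :: (Int, Int, Double) -> NetworkMsg
networkMsg' = uncurry3 NetworkMsg

-- The element types of the triple are inferred from networkMsg'.
parseNetworkMsg :: [String] -> Maybe NetworkMsg
parseNetworkMsg = fmap networkMsg' . readTriple
```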

Basically, I'm looking for automatic deserialization of many types without writing instance declarations for each type or analyzing the type's shape at runtime. If this is not feasible, I will do it in an alternative way.

haskell deserialization

Robert bermani May 30 '15 at 18:35

1 answer

shang · Accepted Answer · 2015-05-31T08:22:18+0000

You are correct that constructors are essentially just functions, so you can write generic instances for any number of types by simply writing instances for those functions. You will still have to write a separate instance for each different number of arguments.

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Text.Read
import Control.Applicative

class FieldParser p r where
    parseFields :: p -> [String] -> Maybe r

instance Read a => FieldParser (a -> r) r where
    parseFields con [a] = con <$> readMaybe a
    parseFields _ _ = Nothing

instance (Read a, Read b) => FieldParser (a -> b -> r) r where
    parseFields con [a, b] = con <$> readMaybe a <*> readMaybe b
    parseFields _ _ = Nothing

instance (Read a, Read b, Read c) => FieldParser (a -> b -> c -> r) r where
    parseFields con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
    parseFields _ _ = Nothing

{- etc. for as many arguments as you need -}

You can now use this type class to parse a message with any constructor, as long as the type checker can determine the resulting message type from the context (with these multi-parameter type class instances, it cannot infer the result type from the given constructor alone).

data Test1 = Test1 {fieldA :: Int} deriving Show
data Test2 = Test2 {fieldB :: Int, fieldC :: Float} deriving Show

-- The pattern type signatures below require ScopedTypeVariables.
test :: String -> [String] -> IO ()
test tag fields = case tag of
    "Test1" -> case parseFields Test1 fields of
        Just (a :: Test1) -> putStrLn $ "Successfully parsed " ++ show a
        Nothing -> putStrLn "Parse error"
    "Test2" -> case parseFields Test2 fields of
        Just (a :: Test2) -> putStrLn $ "Successfully parsed " ++ show a
        Nothing -> putStrLn "Parse error"
    _ -> putStrLn "Unknown tag"
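As a small illustration of the point about context, the result type can also be fixed by an ordinary type signature rather than a pattern annotation. The sketch below repeats the class and two of its instances so that it stands alone; `parseTest2` is a hypothetical wrapper name, and `Eq` is derived only to make results easy to compare.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import Text.Read (readMaybe)

class FieldParser p r where
    parseFields :: p -> [String] -> Maybe r

instance Read a => FieldParser (a -> r) r where
    parseFields con [a] = con <$> readMaybe a
    parseFields _ _ = Nothing

instance (Read a, Read b) => FieldParser (a -> b -> r) r where
    parseFields con [a, b] = con <$> readMaybe a <*> readMaybe b
    parseFields _ _ = Nothing

data Test2 = Test2 {fieldB :: Int, fieldC :: Float} deriving (Show, Eq)

-- Hypothetical wrapper: its signature fixes the result type r to Test2,
-- which is what lets the type checker pick the two-argument instance.
parseTest2 :: [String] -> Maybe Test2
parseTest2 = parseFields Test2
```

Without the signature (or some other annotation), the result type stays ambiguous and instance resolution fails at compile time.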

I would like to know exactly how you use the message types in the application, though, since having each message as its own separate type makes it very difficult to write any kind of general message handler.

Is there some reason why you don't simply have a single data type for messages? For example,

data NetworkMsg
    = NetworkMsg1 {fieldA :: Int}
    | NetworkMsg2 {fieldB :: Int, fieldC :: Float}

Now the instances can be built in much the same way, but you get much better type inference, since the type of the result is always known.

-- Class declaration reconstructed from the instances below; it was not
-- shown in the snippet. The result type is now fixed to NetworkMsg.
class MessageParser p where
    parseMsg :: p -> [String] -> Maybe NetworkMsg

instance Read a => MessageParser (a -> NetworkMsg) where
    parseMsg con [a] = con <$> readMaybe a
    parseMsg _ _ = Nothing

instance (Read a, Read b) => MessageParser (a -> b -> NetworkMsg) where
    parseMsg con [a, b] = con <$> readMaybe a <*> readMaybe b
    parseMsg _ _ = Nothing

instance (Read a, Read b, Read c) => MessageParser (a -> b -> c -> NetworkMsg) where
    parseMsg con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
    parseMsg _ _ = Nothing

parseMessage :: String -> [String] -> Maybe NetworkMsg
parseMessage tag fields = case tag of
    "NetworkMsg1" -> parseMsg NetworkMsg1 fields
    "NetworkMsg2" -> parseMsg NetworkMsg2 fields
    _ -> Nothing
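As a variation on the instances above (not the code from this answer), the one-instance-per-arity pattern can possibly be replaced by two instances that recurse on the constructor's function type: one for a finished NetworkMsg and one that reads a single field and continues. This is a minimal sketch; `MsgParser` is a hypothetical class name, and it relies on the result type being the single NetworkMsg type.

```haskell
import Text.Read (readMaybe)

data NetworkMsg
    = NetworkMsg1 {fieldA :: Int}
    | NetworkMsg2 {fieldB :: Int, fieldC :: Float}
    deriving (Show, Eq)

-- Hypothetical class name; it plays the role of MessageParser above.
class MsgParser f where
    parseMsg :: f -> [String] -> Maybe NetworkMsg

-- Base case: the constructor is fully applied, so no fields may remain.
instance MsgParser NetworkMsg where
    parseMsg msg [] = Just msg
    parseMsg _ _ = Nothing

-- Recursive case: read one field, apply it, and continue with the rest.
instance (Read a, MsgParser f) => MsgParser (a -> f) where
    parseMsg con (x : xs) = readMaybe x >>= \a -> parseMsg (con a) xs
    parseMsg _ [] = Nothing

parseMessage :: String -> [String] -> Maybe NetworkMsg
parseMessage tag fields = case tag of
    "NetworkMsg1" -> parseMsg NetworkMsg1 fields
    "NetworkMsg2" -> parseMsg NetworkMsg2 fields
    _ -> Nothing
```

The trade-off is that a field-count mismatch is only detected while walking the list, but the two instances cover constructors of any arity.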

I'm also not sure why you specifically want to do generic programming without using any of the tools meant for it. GHC.Generics, SYB or Template Haskell are usually the best solution for this kind of problem.
