Refactoring do notation into applicative style

I am working on a simple expression solver in Haskell. I'm trying to refactor some of my code from do notation to applicative style, mainly because I want to learn how applicatives work, but I do not know how to go about it.

factor :: Parser Expr
factor = do
        char '('
        x <- buildExpr
        char ')'
        return x
    <|> number
    <|> variables
    <?> "simple expression"

      

What would be a way to do this in applicative style? I tried the following, but it does not type check:

factor = pure buildExpr <$> (char '(' *> buildExpr *> char ')')

      

where buildExpr is of type Parser Expr.

+3




2 answers


The short answer is:

factor = (char '(' *> buildExpr <* char ')') <|> number <|> variables
     <?> "simple expression"
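
To try the short answer out, here is a minimal self-contained sketch. The question does not show Expr, number, variables, or the real buildExpr, so the definitions below are hypothetical stand-ins; only factor is taken from the answer.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST: the question does not show the real Expr type.
data Expr = Lit Integer | Var String
  deriving (Eq, Show)

-- Assumed stand-ins for the asker's number and variables parsers.
number :: Parser Expr
number = Lit . read <$> many1 digit

variables :: Parser Expr
variables = Var <$> many1 letter

-- Simplified assumption: a real buildExpr would also handle operators.
buildExpr :: Parser Expr
buildExpr = factor

-- The applicative version from the short answer.
factor :: Parser Expr
factor = (char '(' *> buildExpr <* char ')') <|> number <|> variables
     <?> "simple expression"
```

With these stand-ins, parse factor "" "((42))" should give Right (Lit 42), and parse factor "" "(x)" should give Right (Var "x").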

      


Long answer:

<$> has this type:

(<$>) :: (Functor f) => (a -> b) -> f a -> f b

      

In other words, it takes a function and a value of a type that is an instance of Functor (and returns something we don't care about right now). Unfortunately, you are not giving it a function as its first argument; you are giving it pure buildExpr, which is a parser that, when run, consumes no input and yields buildExpr. If you really wanted to do this, you could, with <*>:



factor = pure buildExpr <*> (char '(' *> buildExpr *> char ')')

      

This would run pure buildExpr, extract the function from it, and then apply that function to the result of (char '(' *> buildExpr *> char ')'). But, unfortunately, we cannot do this: buildExpr is a parser, not a function.

If you think about it enough, the thought should cross your mind: why do we mention buildExpr twice if we only want to parse one expression? It turns out that it is enough to mention it only once. In fact, this probably does almost what you want:

factor = char '(' *> buildExpr *> char ')'

      

The only problem is that it yields the Char ')' rather than the result of buildExpr. Darn! But by looking through the documentation and matching up the types, you should eventually be able to work out that if you replace the second *> with <*, it all works the way you want:

factor = char '(' *> buildExpr <* char ')'

      

A good mnemonic is that the arrow points at the value you want to keep. We don't care about the parentheses here, so the arrows point away from them; but we do want to keep the result of buildExpr, so the arrows point inward towards it.
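
The difference between the two versions can be checked on a small example. This is only a sketch: digits is a hypothetical stand-in for buildExpr, and the names keepLast, keepMiddle, viaPure, and viaFmap are made up for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for buildExpr: one or more digits.
digits :: Parser String
digits = many1 digit

-- Both arrows point right, so only the result of the last parser survives.
keepLast :: Parser Char
keepLast = char '(' *> digits *> char ')'

-- The arrows point at digits, so its result is the one that is kept.
keepMiddle :: Parser String
keepMiddle = char '(' *> digits <* char ')'

-- pure f <*> p and f <$> p build the same parser when f is a function.
viaPure, viaFmap :: Parser Int
viaPure = pure read <*> keepMiddle
viaFmap = read <$> keepMiddle
```

On the input "(12)", keepLast yields ')', keepMiddle yields "12", and both viaPure and viaFmap yield 12.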

+6




All of these operators are left associative; the < and/or > point to the things that contribute values; it is $ when the thing to the left is a pure value and * when the thing to the left is an applicative computation.

My rule of thumb for using these operators is as follows. First, list the components of your grammatical production and classify them as "signal" or "noise" depending on whether they contribute semantically important information. Here we have

char '('      -- noise
buildExpr     -- signal
char ')'      -- noise

      

Then figure out what the "semantic function" is: the function that takes the values of the signal components and gives the value of the entire production. Here we have

id     -- pure semantic function, then a bunch of component parsers
       char '('      -- noise
       buildExpr     -- signal
       char ')'      -- noise

      



Now each component parser has to be attached to what comes before it with an operator, but which one?

  • always start with <
  • next comes $ for the first component (as the pure function is immediately before it) or * for every other component
  • then comes > if the component is signal, or nothing if it is noise

So that gives us

id     -- pure semantic function, then a bunch of parsers
   <$  char '('      -- first, noise
   <*> buildExpr     -- later, signal
   <*  char ')'      -- later, noise

      

If the semantic function is id, as it is here, you can get rid of it and use *> to glue noise to the front of the signal that is id's argument. I usually prefer not to do this, just so that I can clearly see the semantic function at the start of the production. Plus, you can build a choice between such productions by interspersing <|>, and you don't need to wrap any of them in parentheses.
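
Here is a sketch of the signal/noise rule applied to two productions. Expr, its Add constructor, number, variables, and the grammar of buildExpr are all assumptions made for illustration; the question shows none of them.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST, assumed for this sketch.
data Expr = Lit Integer | Var String | Add Expr Expr
  deriving (Eq, Show)

number :: Parser Expr
number = Lit . read <$> many1 digit

variables :: Parser Expr
variables = Var <$> many1 letter

-- Semantic function id; '(' is first and noise (<$), buildExpr is signal
-- (<*>), ')' is noise (<*). The choices need no parentheses because <|>
-- binds more loosely than the applicative operators.
factor :: Parser Expr
factor = id <$  char '(' <*> buildExpr <* char ')'
     <|> number
     <|> variables
     <?> "simple expression"

-- Semantic function Add with two signals; '+' is noise.
-- try lets the second alternative re-parse a lone factor.
buildExpr :: Parser Expr
buildExpr = try (Add <$> factor <* char '+' <*> buildExpr)
        <|> factor
```

With these definitions, parse factor "" "(1+x)" should give Right (Add (Lit 1) (Var "x")).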

+5








