Refactoring do notation into applicative style
So I have been working on a simple expression solver in Haskell. I'm trying to refactor some of my code from do notation to applicative code, mainly because I want to learn how applicatives work. I don't know how to go about it.
factor :: Parser Expr
factor = do
        char '('
        x <- buildExpr
        char ')'
        return x
    <|> number
    <|> variables
    <?> "simple expression"
What would be a way to do this in an applicative style? I tried the following, but it won't typecheck:
factor = pure buildExpr <$> (char '(' *> buildExpr *> char ')')
where buildExpr is of type Parser Expr.
The short answer is:
factor = (char '(' *> buildExpr <* char ')') <|> number <|> variables
<?> "simple expression"
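To try the short answer out, here is a minimal sketch, assuming Parsec (Text.Parsec from the parsec package) is the parser library in use. The Expr type and the bodies of number, variables and buildExpr are hypothetical stand-ins, since the question does not show them; only factor is taken from the answer.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical expression type; the question never shows the real one.
data Expr = Num Integer | Var String
  deriving (Show, Eq)

-- Hypothetical leaf parsers standing in for the asker's own.
number :: Parser Expr
number = Num . read <$> many1 digit

variables :: Parser Expr
variables = Var <$> many1 letter

-- Stand-in for the real buildExpr, which presumably also handles operators.
buildExpr :: Parser Expr
buildExpr = factor

-- The applicative version from the short answer.
factor :: Parser Expr
factor = (char '(' *> buildExpr <* char ')')
     <|> number
     <|> variables
     <?> "simple expression"

main :: IO ()
main = do
  print (parse factor "" "(42)")   -- Right (Num 42)
  print (parse factor "" "((x))")  -- Right (Var "x")
  print (parse factor "" "y")      -- Right (Var "y")
```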
Long answer:
<$> has this type:
(<$>) :: (Functor f) => (a -> b) -> f a -> f b
In other words, it takes a function and a value of a type that is an instance of Functor (and returns something we don't care about right now). Unfortunately, you are not giving it a function as the first argument; you are giving it pure buildExpr, which is a Parser that, when run, consumes no input and yields buildExpr. If you really wanted to do this, you could, with <*>:
factor = pure buildExpr <*> (char '(' *> buildExpr *> char ')')
This would run pure buildExpr, extract the function from it, and then apply that function to the result of (char '(' *> buildExpr *> char ')'). But unfortunately we cannot do this either: buildExpr is a Parser, not a function.
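For contrast, here is a small sketch of what the pure and <*> combination looks like when the thing being lifted really is a function. It assumes Parsec; the names digits, viaFmap and viaPure are made up for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical helper: one or more digit characters.
digits :: Parser String
digits = many1 digit

-- A plain function applied to a parser's result with <$>.
viaFmap :: Parser Int
viaFmap = length <$> digits

-- The same thing with the function lifted by pure and applied with <*>.
viaPure :: Parser Int
viaPure = pure length <*> digits

main :: IO ()
main = do
  print (parse viaFmap "" "123")  -- Right 3
  print (parse viaPure "" "123")  -- Right 3
```

Both parsers behave the same, which is why pure f <*> p is normally written f <$> p.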
If you think about it long enough, a thought should cross your mind: why do we mention buildExpr twice if we only want to parse it once? It turns out that mentioning it once is enough. In fact, this probably does almost what you want:
factor = char '(' *> buildExpr *> char ')'
Only one problem: it yields the Char ')', not the result of buildExpr. Darn! But by reading the documentation and matching up the types, you will eventually figure out that if you replace the second *> with <*, it all works the way you want:
factor = char '(' *> buildExpr <* char ')'
A good mnemonic is that the arrow points to the value you want to keep. We don't care about the parentheses here, so the arrows point away from them; but we do want to keep the result of buildExpr, so the arrows point inward towards it.
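The difference between the two arrow directions is easy to check in isolation. This is a small sketch assuming Parsec, with digit standing in for buildExpr; the names closing and inner are made up for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Both arrows point right: the last parser's result is kept, so we get ')'.
closing :: Parser Char
closing = char '(' *> digit *> char ')'

-- Both arrows point at digit: its result is kept and the parentheses are dropped.
inner :: Parser Char
inner = char '(' *> digit <* char ')'

main :: IO ()
main = do
  print (parse closing "" "(7)")  -- Right ')'
  print (parse inner "" "(7)")    -- Right '7'
```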
All of these operators are left associative; the < and/or > point to the things that contribute values; it is $ for thing-to-the-left-is-a-pure-value and * for thing-to-the-left-is-an-applicative-computation.
My rule of thumb for using these operators is as follows. First, list the components of your grammatical production and classify each as "signal" or "noise", depending on whether it contributes semantically important information. Here we have
char '(' -- noise
buildExpr -- signal
char ')' -- noise
Then figure out what the "semantic function" is: the function that takes the values of the signal components and gives the value for the whole production. Here we have
id -- pure semantic function, then a bunch of component parsers
char '(' -- noise
buildExpr -- signal
char ')' -- noise
Now every component parser has to be attached to what comes before it with an operator, but which one?
- always start with <
- next comes $ for the first component (as the pure function is immediately before it), or * for every other component
- then comes > if the component is signal, or nothing if it is noise
So that gives us
id -- pure semantic function, then a bunch of parsers
<$ char '(' -- first, noise
<*> buildExpr -- later, signal
<* char ')' -- later, noise
If the semantic function is id, as it is here, you can get rid of it and use *> to glue noise onto the front of the signal that is id's argument. I usually prefer not to do this, just so that I can clearly see the semantic function at the start of the production. Plus, you can build a choice between such productions by interspersing <|>, and you don't need to wrap any of them in parentheses.
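The rule of thumb pays off more visibly when the semantic function takes several arguments. Here is a sketch assuming Parsec, with a hypothetical Expr type and a made-up production '[' number '+' number ']' chosen only to illustrate the signal/noise labelling and the unparenthesised choice.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical expression type, for illustration only.
data Expr = Num Integer | Add Expr Expr
  deriving (Show, Eq)

number :: Parser Expr
number = Num . read <$> many1 digit

-- Either a bracketed sum or a bare number.
sumOrNumber :: Parser Expr
sumOrNumber
    =   Add           -- pure semantic function, two arguments
    <$  char '['      -- first, noise
    <*> number        -- later, signal
    <*  char '+'      -- later, noise
    <*> number        -- later, signal
    <*  char ']'      -- later, noise
    <|> number        -- alternative production, no parentheses needed

main :: IO ()
main = do
  print (parse sumOrNumber "" "[1+2]")  -- Right (Add (Num 1) (Num 2))
  print (parse sumOrNumber "" "5")      -- Right (Num 5)
```

The choice works without parentheses because <|> binds more loosely than <$, <*> and <*.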