Why do GHC and GHCi differ in type inference?

While doing a code-golf challenge, I noticed that by default GHC does not infer the most general type for a binding, which leads to type errors when you try to use it at two different types.

For example:

(!) = elem
x = 'l' ! "hello" -- From its use here, GHC assumes (!) :: Char -> [Char] -> Bool
y = 5 ! [3..8]    -- Fails because GHC expects these numbers to be of type Char, too

This can be changed with the NoMonomorphismRestriction pragma.

However, entering the same definition into GHCi does not produce a type error, and :t (!) shows the fully general type (!) :: (Foldable t, Eq a) => a -> t a -> Bool, even when GHCi is run with -XMonomorphismRestriction.

Why do GHC and GHCi differ in whether they infer the most general type of a function?

(Also, why is the restriction enabled by default? What does it help with?)



3 answers


The background to why the committee made this decision is given, in the designers' own words, in A History of Haskell: Being Lazy with Class by Paul Hudak et al.:

A major source of controversy in the early stages was the so-called "monomorphism restriction." Suppose that genericLength has this overloaded type:

genericLength :: Num a => [b] -> a

Now consider this definition:

f xs = (len, len)
  where
    len = genericLength xs

It looks as if len should be computed only once, but it may actually be computed twice. Why? Because we can infer the type len :: (Num a) => a; when desugared using the dictionary-passing translation, len becomes a function that is called once for each occurrence of len, and each occurrence might be used at a different type.

[John] Hughes argued strongly that it was unacceptable to silently duplicate computation in this way. His argument was motivated by a program he had written that ran exponentially slower than he expected. (This was, admittedly, with a very simple compiler, but we were reluctant to make a performance difference this large depend on compiler optimisations.)

Following much debate, the committee adopted the now-notorious monomorphism restriction. Briefly, it says that a definition that does not look like a function (that is, has no arguments on the left-hand side) must be monomorphic in any overloaded type variables. In this example, the rule forces len to be used at the same type at both of its occurrences, which solves the performance problem. The programmer can supply an explicit type signature for len if polymorphic behaviour is wanted.

The monomorphism restriction is manifestly a wart on the language. It seems to bite every new Haskell programmer by giving rise to an unexpected or obscure error message. There has been much discussion of alternatives.



(p. 18, emphasis added). Note that John Hughes is a co-author of the paper.
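To make the duplicated computation concrete, here is a hedged sketch of the dictionary-passing translation the quote describes. NumDict, genericLengthD, and fD are invented names for illustration only; GHC's real dictionary representation differs:

```haskell
import Data.List (genericLength)

-- A Num dictionary modelled as an ordinary record (simplified sketch).
data NumDict a = NumDict
  { dAdd     :: a -> a -> a
  , dFromInt :: Integer -> a
  }

intDict :: NumDict Int
intDict = NumDict (+) fromInteger

-- genericLength :: Num a => [b] -> a desugars to a function that
-- receives the dictionary explicitly:
genericLengthD :: NumDict a -> [b] -> a
genericLengthD d []       = dFromInt d 0
genericLengthD d (_ : xs) = dAdd d (dFromInt d 1) (genericLengthD d xs)

-- Without the restriction, f xs = (len, len) desugars so that len is a
-- function of a dictionary; the list is traversed once per occurrence:
fD :: NumDict a -> NumDict c -> [b] -> (a, c)
fD d1 d2 xs = (genericLengthD d1 xs, genericLengthD d2 xs)

-- Opting back into polymorphism, as the paper notes, is done with an
-- explicit signature for len:
f :: (Num a, Num c) => [b] -> (a, c)
f xs = (len, len)
  where
    len :: Num t => t
    len = genericLength xs
```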



I cannot reproduce your result that GHCi infers the type (!) :: (Foldable t, Eq a) => a -> t a -> Bool even with -XMonomorphismRestriction (GHC 8.0.2).

What I see is that when I enter the binding (!) = elem, it gets the type (!) :: () -> [()] -> Bool, which is actually a great illustration of why you would want GHCi to behave "differently" from GHC, given that GHC uses the monomorphism restriction.

As @Davislor's answer describes, the monomorphism restriction was intended so that code which syntactically looks like it computes a value once, binds it to a name, and then uses it several times actually behaves that way. Without the restriction, what is bound to the name is a reference to a closure waiting for a typeclass dictionary before it can actually compute the value. Every use site separately works out which dictionary it needs and computes the value again, even when all of the use sites in fact pick the same dictionary (just as, if you write a function of a number and then call it from several different places with the same argument, you get the same result computed several times). But if the user thought of the binding as a plain value, this is unexpected, and it is very likely that all of the use sites want a single dictionary (because the user was expecting a reference to a single value computed from one dictionary).

The monomorphism restriction forces GHC not to infer types that still need a dictionary (for bindings that have no syntactic arguments). The dictionary is then resolved once at the binding site, rather than separately at each use site, and the value really is computed only once. But this only works if the dictionary chosen at the binding site is the one that all of the use sites would have chosen. If GHC picked wrongly at the binding site, every use site would be a type error, even if the use sites all agree among themselves about which type (and therefore which dictionary) they expect.

GHC compiles whole modules at once, so it can see the use sites and the binding site at the same time. Thus, if any use of a binding requires a specific concrete type, the binding will use that type's dictionary, and everything is fine provided all of the other use sites are compatible with that type (even if they are actually polymorphic and would also have worked at other types). This works even when the code that pins down the correct type is separated from the binding by many other calls: all of the constraints on the types involved are effectively connected by unification during the type checking/inference phase, so when the compiler chooses a type at the binding site it can "see" the requirements of all of the use sites (within the same module).
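As a small sketch of this (module-level code, with the restriction left at its default): a single use at a concrete type fixes the type of the binding for the whole module:

```haskell
-- Under the restriction, whole-module inference monomorphises (!).
-- The use at Int below gives it the type (!) :: Int -> [Int] -> Bool.
(!) = elem

ys :: [Bool]
ys = [x ! [1 .. 5] | x <- [2, 7 :: Int]]
```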

But if the use sites do not all agree on one particular type, you get a type error like the one in your example. One use site of (!) requires the type variable a to be instantiated as Char; another requires a type that also has a Num instance (which Char does not).

Since that is inconsistent with our working assumption that all use sites want a single dictionary, the monomorphism restriction produces an error that could have been avoided by inferring a more general type for (!). It is certainly debatable whether the restriction prevents more problems than it causes, but given that it is on by default in compiled code, surely we want GHCi to behave the same way, right?



However, GHCi is an interpreter. You enter code one statement at a time, not one module at a time. So when you type (!) = elem and hit enter, GHCi has to understand that statement and produce a value to bind to (!) with a specific type right now (it can be an unevaluated thunk, but we need to know what its type is). With the monomorphism restriction we cannot infer (!) :: (Foldable t, Eq a) => a -> t a -> Bool; we have to choose types for those type variables now, with no information from any use sites to help us choose something sensible. The extended defaulting rules that are enabled in GHCi (another difference from GHC) default those variables to [] and (), so you get (!) :: () -> [()] -> Bool [1]. Pretty useless, and you then get a type error when you try either of your examples.
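A hypothetical GHCi session illustrating this (exact message wording varies by GHC version):

```
ghci> :set -XMonomorphismRestriction
ghci> (!) = elem
ghci> :t (!)
(!) :: () -> [()] -> Bool
ghci> 'l' ! "hello"
<interactive>: error: Couldn't match expected type ‘()’ with actual type ‘Char’
```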

The problem the monomorphism restriction addresses is especially egregious in numeric code written without explicit type signatures. Because Haskell's numeric literals are overloaded, you can easily write an entire complex calculation, complete with its initial data, whose most general type is polymorphic with Num or Floating (or similar) constraints. Most of the built-in numeric types are very small, so you are particularly likely to have values that you would rather store than compute multiple times. The scenario is both more likely to arise and more likely to be a real performance problem.

But it is also with numeric types that the defaulting rules are essential for resolving type variables to a particular type in a generally useful way (and small examples with numbers are exactly what people new to Haskell are likely to try in the interpreter). Before the monomorphism restriction was disabled by default in GHCi, there was a constant stream of questions here on Stack Overflow from people confused about why they couldn't divide numbers in GHCi when the same code compiled fine, or something similar (essentially the reverse of your question here). In compiled code you can mostly just write what you want with no explicit types, and whole-module inference decides whether your integer literals should be Integer, or Int if they need to be added to something returned by length, or Double if they need to be added to something and multiplied by something else that is elsewhere divided by something, and so on. In GHCi, under the monomorphism restriction, a simple x = 2 very often does the wrong thing (because GHCi will pick Integer regardless of what you wanted to do with x later), with the result that you need far more type annotations in a quick, interactive interpreter than even the most fervent explicit-typist would use in production compiled code.
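The divide-by-numbers confusion can be sketched as follows. As a compiled module this is fine, because whole-module inference sees the division before committing x to a type; entered line by line into a GHCi with the restriction on, x would default to Integer first, and the division would then fail because Integer has no (/):

```haskell
-- Whole-module inference: the use of (/) below forces x :: Double
-- via defaulting, so both lines typecheck together.
x = 2
y = x / 3
```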

So it is certainly debatable whether GHC should use the monomorphism restriction at all; it is intended to solve a real problem, and it causes some others [2] as well. But the restriction is a terrible idea for an interpreter. The fundamental difference between line-at-a-time and module-at-a-time inference means that even when both GHC and GHCi applied it by default, they behaved differently in practice anyway. GHCi without the monomorphism restriction is at least far more usable.


[1] Without the extended defaulting rules you would instead get an ambiguous-type-variable error, because the ordinary defaulting rules (which only apply to numeric classes) have nothing to offer here.

[2] I find it only mildly annoying in actual development because I write type signatures for top-level bindings. The restriction therefore applies rarely enough that it neither helps nor hinders me much. As such, I would probably prefer it removed so that everything works consistently, especially since it seems to bite learners far more often than it bites me as a practitioner. On the other hand, debugging a rare performance problem when it matters is much harder than occasionally having to add a correct type signature that GHC annoyingly refuses to infer.



NoMonomorphismRestriction is a useful default in GHCi because you don't have to write out so many tedious type signatures in the REPL. GHCi will try to infer the most general types it can.

MonomorphismRestriction is a useful default elsewhere for efficiency/performance reasons. Specifically, the problem boils down to this:

Typeclasses essentially introduce additional function parameters, specifically a dictionary of the code implementing the instances in question. When a pattern binding is typeclass-polymorphic, you turn something that looks like a pattern binding, a constant that would only ever be evaluated once, into what is really a function binding, whose result will not be memoised.
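A minimal sketch of that difference (names and sizes invented for illustration): the monomorphic binding is a constant whose value is shared across uses, while the overloaded one desugars to a function of a Num dictionary and may be recomputed at every use:

```haskell
-- Monomorphic: a plain constant, evaluated at most once, then shared.
shared :: Integer
shared = sum [1 .. 1000000]

-- Overloaded: after dictionary-passing desugaring this is really a
-- function waiting for a Num dictionary, so it is not memoised and
-- each use at each type may redo the sum.
unshared :: Num a => a
unshared = sum (map fromInteger [1 .. 1000000])
```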

