My background: I don't really understand forall or when Haskell has them implicitly.
Okay, consider the type of id
, a -> a
. What does a
mean, and where does it come from? When you define a value, you can't just use arbitrary variables that aren't defined anywhere. You need a top-level definition, or a function argument, or a where
clause, &c. In general, if you use a variable, it must be bound somewhere.
The same is true of type variables, and forall
is one such way to bind a type variable. Anywhere you see a type variable that isn't explicitly bound (for example, class Foo a where ...
binds a
inside the class definition), it's implicitly bound by a forall
.
So, the type of id
is implicitly forall a. a -> a
. What does this mean? Pretty much what it says. We can get a type a -> a
for all possible types a
, or from another perspective, if you pick any specific type you can get a type representing "functions from your chosen type to itself". The latter phrasing should sound a bit like defining a function, and as such you can think of forall
as being similar to a lambda abstraction for types.
GHC uses various intermediate representations during compilation, and one of the transformations it applies is making the similarity to functions more direct: implicit forall
s are made explicit, and anywhere a polymorphic value is used for a specific type, it is first applied to a type argument.
We can even write both forall
s and lambdas as one expression. I'll abuse notation for a moment and replace forall a.
with /\a =>
for visual consistency. In this style, we can define id = /\a => \(x::a) -> (x::a)
or something similar. So, an expression like id True
in your code would end up translated to something like id Bool True
instead; just id True
would no longer even make sense.
Just as you can reorder function arguments, you can likewise reorder the type arguments, subject only to the (rather obvious) restriction that type arguments must come before any value arguments of that type. Since implicit forall
s are always the outermost layer, GHC could potentially choose any order it wanted when making them explicit. Under normal circumstances, this obviously doesn't matter.
I'm not sure exactly what's going on in this case, but based on the comments I would guess that the conversion to using explicit type arguments and the desugaring of do
notation are, in some sense, not aware of each other, and therefore the order of type arguments is specified explicitly to ensure consistency. After all, if something is blindly applying two type arguments to an expression, it matters a great deal whether that expression's type is forall a b. m a -> m b -> m b
or forall b a. m a -> m b -> m b
!