OCaml 编译器有一个“-principal”选项,邮件列表中有时会提到术语“principal type”。它到底是什么意思?维基百科中的定义是递归的,因为它假设读者已经熟悉这个概念。
2 回答
类型推断的过程是猜测,给定一个用户编写的程序,这个程序的类型是什么。一般来说,一个给定的程序可能有几种正确的类型。例如,程序fun x -> x
可以被赋予类型int -> int
和bool -> bool
。
给定一个程序,如果该程序的类型是可以赋予该程序的最通用类型,则该程序的类型是主体,从某种意义上说,所有其他可能的类型都是该类型的特化(实例)。在我的fun x -> x
示例中,多态'a -> 'a
是一种主要类型。
在某些类型系统中,主体类型并不总是存在。你有一个程序P
有两种可能的类型T1
和T2
,没有一个比另一个更通用。例如,在一些数值运算符被重载的系统中,程序fun x -> x + x
可以被赋予 typeint -> int
和 type float -> float
,并且没有包含两者的类型。这对推理引擎来说是个问题,因为这意味着它必须做出任意选择,在不知道它是否是用户想要的类型的情况下选择一种可能的类型。如果您有主体类型,则推理过程不需要做出任何选择。
(要解决该示例,您可以:(1)不重载算术运算符(2)做出任意选择(这就是 F# iirc 所做的)(3)拒绝程序并要求类型注释以消除歧义(4)有更多表现力类型,例如 Haskell 的Num a => a -> a
.)
OCaml 语言的“简单”子集基于 Hindley-Milner 类型推断,具有主要类型。这意味着推理引擎总是做正确的事情(给定可能类型的规范)。类型系统的一些更高级的方面(例如多态字段和方法)失去了这个属性:在某些情况下,类型系统无法找到最通用的类型,或者找到最通用的类型需要从类型推理引擎(通常会尝试快速)。-principal
如果我没记错的话,这个选项是一个旋钮:
- 失败是在某些情况下,类型检查器本来会接受非主要解决方案(做出任意选择)
- 更努力地找到一个主要的解决方案,代价是更长的类型检查时间和内存使用
我对这个标志不是很熟悉(我更喜欢避免过于高级的类型系统功能,所以我的程序通常不关心),所以你必须仔细检查这个,但这是粗略的想法。这个标志在我看来相对不重要(你通常不需要关心),但主体类型的概念确实是 ML 语言理论的重要组成部分。
如果您想更进一步,还有两个技术细节:
The ML notion of "principal types" is the question of whether there exist a most general type given a fixed type environment; some authors study the question of whether they exist a most general (environment, type) pair, so-called a "principal typing"; it is a harder question (you have to infer what you expect from the other modules, while in ML you are given signatures for the external modules you rely on; and inferring polymorphism is very hard) that is not used in most ML-inspired programming languages.
Existence of principal types is a delicate equilibrium for designers of type system. If you remove features from the ML type system (polymorphism, types such as
'a -> 'a
), you lose principality, but if you add power (going from ML to System F which has more expressive polymorphic types) you may also lose principality. You can regain that lost principality by moving to even more complex type systems such as MLF, but it is a hard problem.
In practice, a relatively large part of the designers of programming languages have lately given up on the idea of principality. They want to have more ambitious type systems (dependent types, etc.) where it's just too hard to seek principality, so they instead content themselves with non-principal inference: it's already good if the inference engine can find some type, let's not be difficult on the generality of the result. Jacques Garrigue, the main maintainer of the OCaml type system, still cares about it very much, and I think that's an interesting aspect of the OCaml programming language research.
To build a little bit upon gasche's explanation, here's an example, stolen from OCaml's own testsuite, where principality, or lack thereof, appears. Disclaimer: this uses objects.
class c = object method id : 'a. 'a -> 'a = fun x -> x end;;
type u = c option;;
let just = function None -> failwith "just" | Some x -> x;;
let f x = let l = [Some x; (None : u)] in (just(List.hd l))#id;;
let g x =
let none = (fun y -> ignore [y;(None:u)]; y) None in
let x = List.hd [Some x; none] in (just x)#id;;
let h x =
let none = let y = None in ignore [y;(None:u)]; y in
let x = List.hd [Some x; none] in (just x)#id;;
When running this in a top-level session, you'll get:
# val f : c -> 'a -> 'a = <fun>
# val g : c -> 'a -> 'a = <fun>
# val h : < id : 'a; .. > -> 'a = <fun>
These functions perform the exact same thing but get assigned different types!
If you invoke OCaml with the -principal
option, the first two examples will trigger:
Warning 18: this use of a polymorphic method is not principal.
The interesting thing is that if you replace 'a
in the type of h
with 'a -> 'a
, you get the exact same type as f
and g
, since the type c
is just a type abbreviation for (i.e. expands to) < id: 'a -> 'a; .. >
.
What the compiler wants to tell the programmer here is that there are two suitable types for f
and g
, and that there is no criterion that would help OCaml decide which one is the "best" (and by "best", I mean "principal", of course!).