java - Java 中的 Hindley-Milner 算法

Question

我正在开发一个用 Java 编写的简单的基于数据流的系统（想象它像一个 LabView 编辑器/运行时）。用户可以在编辑器中将块连接在一起，我需要类型推断以确保数据流图是正确的，但是，大多数类型推断示例都是用数学符号、ML、Scala、Perl 等编写的，我不会“说” ”。

我阅读了有关 Hindley-Milner 算法的信息，并发现这个文档有一个我可以实现的很好的例子。它适用于一组 T1 = T2 类似的约束。但是，我的数据流图转换为 T1 >= T2 类约束（或 T2 扩展 T1，或协方差，或 T1 <: T2，正如我在各种文章中看到的那样）。没有 lambda 只是类型变量（用于通用函数，如T merge(T in1, T in2)）和具体类型。

回顾 HM 算法：

Type = {TypeVariable, ConcreteType}
TypeRelation = {LeftType, RightType}
Substitution = {OldType, NewType}
TypeRelations = set of TypeRelation
Substitutions = set of Substitution

1) Initialize TypeRelations to the constraints, Initialize Substitutions to empty
2) Take a TypeRelation
3) If LeftType and RightType are both TypeVariables or are concrete 
      types with LeftType <: RightType Then do nothing
4) If only LeftType is a TypeVariable Then
    replace all occurrences of RightType in TypeRelations and Substitutions
    put LeftType, RightType into Substitutions
5) If only RightType is a TypeVariable then
    replace all occurrences of LeftType in TypeRelations and Substitutions
    put RightType, LeftType into Substitutions
6) Else fail

如何更改原始 HM 算法以使用这些关系而不是简单的相等关系？Java-ish 示例或解释将不胜感激。

score 13 · Accepted Answer

我阅读了至少 20 篇文章并找到了一篇我可以使用的文章（Francois Pottier：存在子类型的类型推断：从理论到实践）：

输入：

Type = { TypeVariable, ConcreteType }
TypeRelation = { Left: Type, Right: Type }
TypeRelations = Deque<TypeRelation>

辅助功能：

ExtendsOrEquals = #(ConcreteType, ConcreteType) => Boolean
Union = #(ConcreteType, ConcreteType) => ConcreteType | fail
Intersection = #(ConcreteType, ConcreteType) => ConcreteType
SubC = #(Type, Type) => List<TypeRelation>

如果第一个扩展或等于第二个，ExtendsOrEquals 可以判断两种具体类型，例如，(String, Object) == true, (Object, String) == false。

如果可能，联合计算两个具体类型的公共子类型，例如，(Object, Serializable) == Object&Serializable, (Integer, String) == null。

Intersection 计算两个具体类型的最接近的超类型，例如，(List, Set) == Collection, (Integer, String) == Object。

SubC 是结构分解函数，在这个简单的例子中，它只会返回一个包含其参数的新 TypeRelation 的单例列表。

跟踪结构：

UpperBounds = Map<TypeVariable, Set<Type>>
LowerBounds = Map<TypeVariable, Set<Type>>
Reflexives = List<TypeRelation>

UpperBounds 跟踪可能是类型变量的超类型的类型，LowerBounds 跟踪可能是类型变量的子类型的类型。Reflexives 跟踪对类型变量之间的关系，以帮助算法的边界重写。

算法如下：

While TypeRelations is not empty, take a relation rel

  [Case 1] If rel is (left: TypeVariable, right: TypeVariable) and 
           Reflexives does not have an entry with (left, right) {

    found1 = false;
    found2 = false
    for each ab in Reflexives
      // apply a >= b, b >= c then a >= c rule
      if (ab.right == rel.left)
        found1 = true
        add (ab.left, rel.right) to Reflexives
        union and set upper bounds of ab.left 
          with upper bounds of rel.right

      if (ab.left == rel.right)
        found2 = true
        add (rel.left, ab.right) to Reflexives
        intersect and set lower bounds of ab.right 
          with lower bounds of rel.left

    if !found1
        union and set upper bounds of rel.left
          with upper bounds of rel.right
    if !found2
        intersect and set lower bounds of rel.right
          with lower bounds of rel.left

    add TypeRelation(rel.left, rel.right) to Reflexives

    for each lb in LowerBounds of rel.left
      for each ub in UpperBounds of rel.right
        add all SubC(lb, ub) to TypeRelations
  }
  [Case 2] If rel is (left: TypeVariable, right: ConcreteType) and 
      UpperBound of rel.left does not contain rel.right {
    found = false
    for each ab in Reflexives
      if (ab.right == rel.left)
        found = true
        union and set upper bounds of ab.left with rel.right
    if !found 
        union the upper bounds of rel.left with rel.right
    for each lb in LowerBounds of rel.left
      add all SubC(lb, rel.right) to TypeRelations
  }
  [Case 3] If rel is (left: ConcreteType, right: TypeVariable) and
      LowerBound of rel.right does not contain rel.left {
    found = false;
    for each ab in Reflexives
      if (ab.left == rel.right)
         found = true;
         intersect and set lower bounds of ab.right with rel.left
    if !found
       intersect and set lower bounds of rel.right with rel.left
    for each ub in UpperBounds of rel.right
       add each SubC(rel.left, ub) to TypeRelations
  }
  [Case 4] if rel is (left: ConcreteType, Right: ConcreteType) and 
      !ExtendsOrEquals(rel.left, rel.right)
    fail
  }

一个基本的例子：

Merge = (T, T) => T
Sink = U => Void

Sink(Merge("String", 1))

这个表达式的关系：

String >= T
Integer >= T
T >= U

1.) rel 是 (String, T); 案例 3 被激活。因为 Reflexives 是空的，所以 T 的 LowerBounds 设置为 String。不存在 T 的 UpperBounds，因此，TypeRelations 保持不变。

2.) rel 是 (Integer, T); 案例 3 再次被激活。Reflexives 仍然为空，T 的下限设置为 String 和 Integer 的交集，产生 Object，T 仍然没有上限，TypeRelations 没有变化

3.) rel 是 T >= U。案例 1 被激活。因为 Reflexives 是空的，所以 T 的 Upper Bounds 与 U 的 Upper bounds 结合在一起，U 仍然是空的。然后将下界 U 设置为下界 ot T，产生 Object >= U。TypeRelation(T, U) 被添加到 Reflexives。

4.) 算法终止。从边界 Object >= T 和 Object >= U

在另一个示例中，演示了类型冲突：

Merge = (T, T) => T
Sink = Integer => Void

Sink(Merge("String", 1))

关系：

String >= T
Integer >= T
T >= Integer

步骤 1.) 和 2.) 与上述相同。

3.) rel 是 T >= U。情况 2 被激活。该案例试图将 T 的上限（此时为 Object）与 Integer 联合，但失败且算法失败。

类型系统的扩展

向类型系统添加泛型类型需要在主要情况和 SubC 函数中进行扩展。

Type = { TypeVariable, ConcreteType, ParametricType<Type,...>)

一些想法：

如果 ConcreteType 和 ParametricType 相遇，那就是错误。
如果 TypeVariable 和 ParametricType 满足，例如 T = C(U1,...,Un) 则创建新的 Type 变量和关系为 T1 >= U1, ... , Tn >= Un 并使用它们。
如果两个 ParametricType 满足（D<> 和 C<>）检查是否 D >= C 并且类型参数的数量相同，然后将每一对提取为关系。

score 4 · Accepted Answer

Hindley-Milner 算法本质上是一种统一算法，即一种用于求解具有变量的图方程的图同构的算法。

Hindley-Milner 并不直接适用于您的问题，但 Google 搜索发现了一些线索；例如“Pragmatic Subtyping in Polymorphic Languages”，它说“我们提出了基于名称不等价的 Hindley/Milner 类型系统的子类型扩展......”。（我没读过。）

...但是，大多数类型推断示例都是用数学符号、ML、Scala、Perl 等编写的，我不会“说”这些。

我认为你必须自己克服这个障碍。类型理论和类型检查基本上是数学的……而且很困难。你需要努力学习语言。

java - Java 中的 Hindley-Milner 算法

2 回答 2

Related

Reference