haskell - Inferring the type of the common fields of two records

Question

Bear with me if this is a silly question. How do I type a generic function that takes two records and returns an array of their common fields?

Say I have:

type A = { name :: String, color :: String }
type B = { name :: String, address :: Address, color :: String }

myImaginaryFunction :: ???
-- should return ["name", "color"] :: Array of [name color]

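I would like to write a function that takes records of any two types and returns an array of their common fields. A Haskell solution would work as well.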

score 5 · Accepted Answer

To express two record types with common fields in Haskell, you need a GHC extension:

{-# LANGUAGE DuplicateRecordFields #-}

and to introspect the names of the fields, you need generics based on the Data class:

{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data ( Data, Typeable, DataRep(AlgRep), dataTypeRep
                 , dataTypeOf, constrFields)
import Data.List (intersect)
import Data.Proxy (Proxy(..), asProxyTypeOf)

This lets you define two data types that use the same field names:

data Address = Address String deriving (Typeable, Data)
data A = A { name :: String, color :: String }
    deriving (Typeable, Data)
data B = B { name :: String, address :: Address, color :: String}
    deriving (Typeable, Data)

You can then retrieve the field names with:

fieldNames :: (Data t) => Proxy t -> [String]
fieldNames t = case dataTypeRep $ dataTypeOf $ asProxyTypeOf undefined t of
  AlgRep [con] -> constrFields con

and get the common fields with:

commonFields :: (Data t1, Data t2) => Proxy t1 -> Proxy t2 -> [String]
commonFields t1 t2 = intersect (fieldNames t1) (fieldNames t2)

After that, the following will work:

ghci> commonFields (Proxy :: Proxy A) (Proxy :: Proxy B)
["name","color"]
ghci>

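Note that the implementation of fieldNames above assumes that only record types with a single constructor will be introspected; any other type hits the incomplete pattern match and fails at runtime. See the documentation for Data.Data if you want to generalize it.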
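One possible generalization, sketched under the assumption that you want the names from every constructor with duplicates dropped. The Shape type and the name allFieldNames are hypothetical and exist only for this illustration:

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, DataRep (AlgRep), constrFields, dataTypeOf, dataTypeRep)
import Data.List (nub)
import Data.Proxy (Proxy (..), asProxyTypeOf)

-- Hypothetical multi-constructor record type, for illustration only.
data Shape
  = Circle { radius :: Double, label :: String }
  | Rect { width :: Double, height :: Double, label :: String }
  deriving (Data)

-- Collect the field names of every constructor, dropping duplicates.
-- Non-algebraic representations (Int, Char, ...) yield no fields
-- instead of a pattern-match failure.
allFieldNames :: Data t => Proxy t -> [String]
allFieldNames p =
  case dataTypeRep (dataTypeOf (asProxyTypeOf undefined p)) of
    AlgRep cons -> nub (concatMap constrFields cons)
    _           -> []

main :: IO ()
main = print (allFieldNames (Proxy :: Proxy Shape))
```

Whether a field that exists in only some constructors should count as a field of the type is a design choice; this sketch simply reports all of them.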

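Now, because you are a help vampire, I know you'll demand a type-level function, even though your question said nothing about needing one! In fact, I can see you've already added a comment saying you're interested in somehow returning an array of name | color, even though no such thing exists in Haskell and even though your question explicitly said you expected the term-level answer ["name", "color"].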

Still, there may be non-vampires with a similar question, and perhaps this answer will help them.

score 5 · Accepted Answer

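For Haskell, I do like K. A. Buhr's answer, but personally I wouldn't use Typeable and would reach for GHC Generics instead. That may just be a matter of preference at this point, though.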
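A minimal sketch of what the GHC Generics route could look like. This is not code from either answer: the class GFieldNames and the functions fieldNames and commonFields are names made up for this illustration, and Address is replaced by String to keep the example short.

```haskell
{-# LANGUAGE DeriveGeneric         #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE FlexibleContexts      #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE KindSignatures        #-}
{-# LANGUAGE ScopedTypeVariables   #-}
{-# LANGUAGE TypeOperators         #-}

import Data.Kind (Type)
import Data.List (intersect)
import Data.Proxy (Proxy (..))
import GHC.Generics (C1, D1, Generic, Rep, S1, Selector (selName), U1, (:*:))

-- Hypothetical helper class: walks the generic representation of a
-- single-constructor type and collects its selector names.
class GFieldNames (f :: Type -> Type) where
  gFieldNames :: Proxy f -> [String]

instance GFieldNames f => GFieldNames (D1 c f) where
  gFieldNames _ = gFieldNames (Proxy :: Proxy f)

instance GFieldNames f => GFieldNames (C1 c f) where
  gFieldNames _ = gFieldNames (Proxy :: Proxy f)

instance (GFieldNames f, GFieldNames g) => GFieldNames (f :*: g) where
  gFieldNames _ = gFieldNames (Proxy :: Proxy f) ++ gFieldNames (Proxy :: Proxy g)

-- selName only inspects the type, so undefined is never forced.
instance Selector s => GFieldNames (S1 s f) where
  gFieldNames _ = [selName (undefined :: S1 s f ())]

instance GFieldNames U1 where
  gFieldNames _ = []

fieldNames :: forall a. (Generic a, GFieldNames (Rep a)) => Proxy a -> [String]
fieldNames _ = gFieldNames (Proxy :: Proxy (Rep a))

commonFields
  :: (Generic a, GFieldNames (Rep a), Generic b, GFieldNames (Rep b))
  => Proxy a -> Proxy b -> [String]
commonFields pa pb = fieldNames pa `intersect` fieldNames pb

data A = A { name :: String, color :: String } deriving (Generic)
data B = B { name :: String, address :: String, color :: String } deriving (Generic)

main :: IO ()
main = print (commonFields (Proxy :: Proxy A) (Proxy :: Proxy B))
```

Because no instance is given for sum types, calling fieldNames on a multi-constructor type is rejected at compile time rather than failing at runtime.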

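For PureScript, I wrote about this kind of problem earlier this month in my blog post Making Diffs of different-typed Records in PureScript. The approach is completely different from what you would do in a language without row types (no, Elm doesn't have these; there you really have no solution other than using a homogeneous String Map).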

First, if you're at all familiar with PureScript, you might want to use Union, but this won't work either, because you would want to write something like:

Union r1' r r1

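where r1' and r together make up the row r1 of your first record, and r is the sub-row it shares with r2. The reason this fails is that you have two unsolved variables here (r1' and r), and the functional dependencies of Union require any two of its three parameters to be solved already.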

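So, since we can't use Union directly, we have to work out some kind of solution ourselves. Because I can get a RowList structure that is sorted by key, I chose to use it to walk the RowLists of the two records and work out the intersection: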

class RowListIntersection
  (xs :: RowList)
  (ys :: RowList)
  (res :: RowList)
  | xs ys -> res

instance rliNilXS :: RowListIntersection Nil (Cons name ty tail) Nil
instance rliNilYS :: RowListIntersection (Cons name ty tail) Nil Nil
instance rliNilNil :: RowListIntersection Nil Nil Nil
instance rliConsCons ::
  ( CompareSymbol xname yname ord
  , Equals ord EQ isEq
  , Equals ord LT isLt
  , Or isEq isLt isEqOrLt
  , If isEq xty trashty yty
  , If isEq xty trashty2 zty
  , If isEq (SProxy xname) trashname (SProxy zname)
  , If isEq
      (RLProxy (Cons zname zty res'))
      (RLProxy res')
      (RLProxy res)
  , If isEqOrLt
      (RLProxy xs)
      (RLProxy (Cons xname xty xs))
      (RLProxy xs')
  , If isLt
      (RLProxy (Cons yname yty ys))
      (RLProxy ys)
      (RLProxy ys')
  , RowListIntersection xs' ys' res'
  ) => RowListIntersection (Cons xname xty xs) (Cons yname yty ys) res
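The walk these instances encode is a merge-style intersection of two key-sorted lists. A term-level Haskell sketch of the same idea (Haskell rather than PureScript, and the name intersectSorted is invented for this illustration):

```haskell
-- Term-level analogue of walking two key-sorted lists in lockstep.
-- Both inputs are assumed to be sorted by key, as RowToList guarantees
-- for RowLists.
intersectSorted :: Ord k => [(k, a)] -> [(k, b)] -> [(k, a)]
intersectSorted [] _ = []
intersectSorted _ [] = []
intersectSorted xs@((xk, xv) : xt) ys@((yk, _) : yt) =
  case compare xk yk of
    EQ -> (xk, xv) : intersectSorted xt yt -- shared key: keep it, advance both
    LT -> intersectSorted xt ys            -- x key is smaller: drop it, keep ys whole
    GT -> intersectSorted xs yt            -- y key is smaller: drop it, keep xs whole

main :: IO ()
main =
  print (map fst (intersectSorted [("a", 1 :: Int), ("c", 3)] [("b", True), ("c", False)]))
```

The three branches correspond to the isEq, isLt, and remaining cases that the If constraints select between at the type level.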

I then used a short definition to get the keys of the resulting RowList:

class Keys (xs :: RowList) where
  keysImpl :: RLProxy xs -> List String

instance nilKeys :: Keys Nil where
  keysImpl _ = mempty

instance consKeys ::
  ( IsSymbol name
  , Keys tail
  ) => Keys (Cons name ty tail) where
  keysImpl _ = first : rest
    where
      first = reflectSymbol (SProxy :: SProxy name)
      rest = keysImpl (RLProxy :: RLProxy tail)

Putting these together, I can define a function like this to get the shared labels:

getSharedLabels
  :: forall r1 rl1 r2 rl2 rl
  . RowToList r1 rl1
  => RowToList r2 rl2
  => RowListIntersection rl1 rl2 rl
  => Keys rl
  => Record r1
  -> Record r2
  -> List String
getSharedLabels _ _ = keysImpl (RLProxy :: RLProxy rl)

And then we can see the result we expect:

main = do
  logShow <<< Array.fromFoldable $
    getSharedLabels
      { a: 123, b: "abc" }
      { a: 123, b: "abc", c: true }
  -- logs out ["a","b"] as expected

If you're new to RowList/RowToList, you might consider reading my RowList Fun With PureScript 2nd Edition slides.

I put the code for this answer here.

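If all of this seems too involved, your other option might be to coerce the records into String Maps and take the set intersection of their keys. I don't know whether that is an answer in Elm, though, since the runtime representation of a String Map probably doesn't match that of a Record. For PureScript it is an option, because the runtime representation of StrMap is the same as that of Record.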
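To illustrate only the key-set step, here is a small Haskell sketch that uses Data.Map from the containers package as a stand-in for a string-keyed map. Nothing is coerced here (Haskell records have no such runtime representation), and sharedKeys is a name invented for this example:

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- The common labels are the keys that occur in both maps.
sharedKeys :: Map.Map String a -> Map.Map String b -> [String]
sharedKeys m1 m2 =
  Set.toList (Set.intersection (Map.keysSet m1) (Map.keysSet m2))

main :: IO ()
main = do
  let r1 = Map.fromList [("a", "123"), ("b", "abc")]
      r2 = Map.fromList [("a", "123"), ("b", "abc"), ("c", "true")]
  print (sharedKeys r1 r2)
```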

score 2 · Accepted Answer

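Actually, after thinking about it some more, I guess it is possible in modern Haskell to do what you really want, if what you really want is to work with record types with named fields at the type level, including things like deriving, at compile time, a new record type from the common fields of two other records.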

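It's somewhat involved and a little ugly, though some parts work surprisingly well. Yes, of course it's "way too much for such a simple task", but keep in mind that we are trying to implement a whole new, non-trivial, type-level feature (a kind of dependent structural typing). The only way to make this a simple task is to bake the feature into the language and its type system from the start; otherwise, it is going to be complicated.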

Anyway, until we get a DependentTypes extension, you have to explicitly enable a small (ha ha) number of extensions:

{-# LANGUAGE AllowAmbiguousTypes       #-}
{-# LANGUAGE GADTs                     #-}
{-# LANGUAGE KindSignatures            #-}
{-# LANGUAGE ScopedTypeVariables       #-}
{-# LANGUAGE TemplateHaskell           #-}
{-# LANGUAGE TypeApplications          #-}
{-# LANGUAGE TypeFamilies              #-}
{-# LANGUAGE TypeInType                #-}
{-# LANGUAGE TypeOperators             #-}
{-# LANGUAGE UndecidableInstances      #-}
{-# OPTIONS_GHC -Wincomplete-patterns  #-}

module Records where

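We'll make heavy use of the singletons package and its submodules: Prelude for basic type-level functions like Map, Fst, and Lookup; the TH module for generating our own singletons and promoted functions with Template Haskell splices; and TypeLits for working with the Symbol kind (i.e., type-level string literals).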

import Data.Singletons.Prelude
import Data.Singletons.TH
import Data.Singletons.TypeLits

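We'll also need a few other odds and ends. Text is needed only because it is the term-level counterpart of Symbol: calling fromSing on a Symbol singleton yields a Text.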

import Data.Function ((&))
import Data.Kind (Type)
import Data.List (intersect)
import qualified Data.Text as Text

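We won't be able to use the usual Haskell records. Instead, we'll define a Record type constructor. This type constructor is indexed by a list of (Symbol, Type) pairs, where the Symbol gives the field name and the Type gives the type of the value stored in that field.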

data Record :: [(Symbol, Type)] -> Type where

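This design decision already has several major implications:

The same field name in different record types can refer to different field value types.
Fields are ordered within a record, so two record types are the same only if they have the same fields, with the same types, in the same order.
The same field can appear more than once in a record, even though the accessor functions we provide will only ever reach one of them (the last one added).

In dependently typed programs, design decisions tend to run deep. For example, if the same field were not allowed to appear more than once, we would need to find a way to reflect that in the types, and then make sure all of our functions could supply appropriate evidence that no duplicate fields were being added.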

Anyway, back to our Record type constructor. There will be two data constructors: a Record constructor that creates an empty record:

  Record :: Record '[]

and a With constructor that adds a field to a record:

  With :: SSymbol s -> t -> Record fs -> Record ('(s, t) : fs)

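Note that With requires a runtime representative of s :: Symbol in the form of a symbol singleton, SSymbol s. A convenience function with_ makes this singleton implicit: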

with_ :: forall s t fs . (SingI s) => t -> Record fs -> Record ('(s, t) : fs)
with_ = With sing

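The idea is that, by allowing ambiguous types and using type applications, we expose the following reasonably succinct syntax for defining records. The explicit type signatures aren't required here, but they are included to make clear what is being created: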

rec1 :: Record '[ '("bar", [Char]), '("foo", Int)]
rec1 = Record & with_ @"foo" (10 :: Int)
              & with_ @"bar" "Hello, world"
-- i.e., rec1 = { foo = 10, bar = "Hello, world" } :: { foo :: Int, bar :: String }

rec2 :: Record '[ '("quux", Maybe Double), '("foo", Int)]
rec2 = Record & with_ @"foo" (20 :: Int)
              & with_ @"quux" (Just 1.0 :: Maybe Double)
-- i.e., rec2 = { foo = 20, quux = Just 1.0 } :: { foo :: Int, quux :: Maybe Double }

To show that this record type is useful, we'll define a type-safe field accessor. Here is one that uses an explicit singleton to select the field:

field :: forall s t fs . (Lookup s fs ~ Just t) => SSymbol s -> Record fs -> t
field s (With s' t r)
  = case s %:== s' of
      STrue -> t
      SFalse -> field s r

and a helper that takes the singleton implicitly:

field_ :: forall s t fs . (Lookup s fs ~ Just t, SingI s) => Record fs -> t
field_ = field @s sing

It is intended to be used with a type application, like so:

exField = field_ @"foo" rec1

Note that trying to access a nonexistent field won't type check. The error message isn't ideal, but at least it is a compile-time error:

-- badField = field_ @"baz" rec1  -- gives: Couldn't match type Nothing with Just t

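The definition of field hints at the power of the singletons library. We are using the type-level Lookup function, which Template Haskell generated automatically from a term-level definition that looks exactly like the following (taken from the singletons source and renamed to avoid a conflict):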

lookup'                  :: (Eq a) => a -> [(a,b)] -> Maybe b
lookup' _key []          =  Nothing
lookup'  key ((x,y):xys) = if key == x then Just y else lookup' key xys
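For intuition about what a type-level lookup does, here is a hand-written closed type family using only base. It is an analogue written for this illustration, not the code that singletons generates, and GHC's reasoning about it inside a function like field may differ. Fields, fooIsInt, and bazIsMissing are hypothetical names:

```haskell
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies   #-}
{-# LANGUAGE TypeOperators  #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (Symbol)

-- Hand-written type-level lookup; equations are tried top to bottom.
type family Lookup (k :: Symbol) (fs :: [(Symbol, Type)]) :: Maybe Type where
  Lookup k '[]             = 'Nothing
  Lookup k ('(k, v) ': fs) = 'Just v      -- head key matches
  Lookup k ('(x, v) ': fs) = Lookup k fs  -- otherwise search the tail

type Fields = '[ '("bar", String), '("foo", Int)]

-- Each definition type checks only if the family reduces as stated.
fooIsInt :: Proxy (Lookup "foo" Fields) -> Proxy ('Just Int)
fooIsInt = id

bazIsMissing :: Proxy (Lookup "baz" Fields) -> Proxy ('Nothing :: Maybe Type)
bazIsMissing = id

main :: IO ()
main =
  fooIsInt Proxy `seq` bazIsMissing Proxy `seq`
    putStrLn "type-level lookups reduced as expected"
```

Changing 'Just Int to 'Just Bool in the signature of fooIsInt turns the mistake into a compile-time error, which is the same mechanism that rejects a bad field name.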

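Using only the context Lookup s fs ~ Just t, GHC is able to determine that:

Because the context implies the field will be found in the list, the second argument of field can never be the empty record Record. So there is no incomplete-pattern warning for field, and in fact you will get a type error if you try to handle that case as a runtime error by adding: field s Record = error "ack, something went wrong!"
The recursive call to field is type correct when we are in the SFalse branch. That is, GHC has figured out that if we can successfully Lookup the key s in the list but it isn't at the head, we must be able to look it up in the tail.

(This is amazing to me, but anyway...)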

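So those are the basics of our record type. To inspect field names at either runtime or compile time, we'll introduce a helper and use Template Haskell to lift it to the type level (i.e., to a type-level function Names):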

$(singletons [d|
  names :: [(Symbol, Type)] -> [Symbol]
  names = map fst
  |])

Note that the type-level function Names can give compile-time access to a record's field names, for example in a hypothetical type signature:

data SomeUIType fs = SomeUIType -- a UI for the given compile-time list of fields
recordUI :: Record fs -> SomeUIType (Names fs)
recordUI _ = SomeUIType

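More likely, though, we'll want to work with the field names at runtime. Using Names, we can define the following function, which takes a record and returns its list of field names as a singleton. Here, SNil and SCons are the singleton equivalents of the terms [] and (:).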

sFields :: Record fs -> Sing (Names fs)
sFields Record = SNil
sFields (With s _ r) = SCons s (sFields r)

And here is a version that returns a [Text] instead of a singleton:

fields :: Record fs -> [Text.Text]
fields = fromSing . sFields

Now, if you only want a runtime list of the common fields of two records, you can do this:

rec12common = intersect (fields rec1) (fields rec2)
-- value:  ["foo"]
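If you only need the runtime half of this and would rather not depend on singletons, a similar record can be sketched with KnownSymbol from base. This swaps in a different technique from the one in this answer: Rec, Empty, Field, fieldNames, recA, and recB are names made up for the sketch, and it offers no type-level Names or Common.

```haskell
{-# LANGUAGE DataKinds           #-}
{-# LANGUAGE GADTs               #-}
{-# LANGUAGE KindSignatures      #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications    #-}
{-# LANGUAGE TypeOperators       #-}

import Data.Function ((&))
import Data.Kind (Type)
import Data.List (intersect)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- A record indexed by its (name, type) pairs; each field carries a
-- KnownSymbol dictionary so its name can be recovered at runtime.
data Rec :: [(Symbol, Type)] -> Type where
  Empty :: Rec '[]
  Field :: KnownSymbol s => Proxy s -> t -> Rec fs -> Rec ('(s, t) ': fs)

with_ :: forall s t fs. KnownSymbol s => t -> Rec fs -> Rec ('(s, t) ': fs)
with_ = Field (Proxy :: Proxy s)

fieldNames :: Rec fs -> [String]
fieldNames Empty = []
fieldNames (Field p _ r) = symbolVal p : fieldNames r

recA :: Rec '[ '("bar", String), '("foo", Int)]
recA = Empty & with_ @"foo" (10 :: Int) & with_ @"bar" "Hello, world"

recB :: Rec '[ '("quux", Maybe Double), '("foo", Int)]
recB = Empty & with_ @"foo" (20 :: Int) & with_ @"quux" (Just 1.0 :: Maybe Double)

main :: IO ()
main = print (fieldNames recA `intersect` fieldNames recB)
```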

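What about creating a type with the common fields at compile time? Well, we can define the following function to get the left-biased set of fields with common names. (It is "left-biased" in the sense that if matching fields in the two records have different types, it takes the type from the first record.) Again, we use the singletons package and Template Haskell to lift this to a type-level Common function: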

$(singletons [d|
  common :: [(Symbol,Type)] -> [(Symbol,Type)] -> [(Symbol,Type)]
  common [] _ = []
  common (x@(a,b):xs) ys
    = if elem a (map fst ys)
      then x:common xs ys
      else   common xs ys
  |])

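This lets us define a function that takes two records and reduces the first record to the set of fields whose names also appear as fields in the second record: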

reduce :: Record fs1 -> Record fs2 -> Record (Common fs1 fs2)
reduce Record _ = Record
reduce (With s x r1) r2
  = case sElem s (sFields r2) of STrue  -> With s x (reduce r1 r2)
                                 SFalse -> reduce r1 r2

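Again, the singletons library is really remarkable here. I'm using my automatically generated Common type-level function together with the singleton-level sElem function (which is generated automatically inside the singletons package from the term-level definition of elem). Somehow, through all this complexity, GHC can figure out that if sElem evaluates to STrue, I must include s in the list of common fields, while if it evaluates to SFalse, I must not. Try fiddling with the case results on the right-hand side of the arrows: you can't get them to type check if you get them wrong!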

Anyway, I can apply this function to my two example records. Again, the type signature isn't required, but it shows what is being generated:

rec3 :: Record '[ '("foo", Int)]
rec3 = reduce rec1 rec2

As with any other record, I can access its field names at runtime, and field access is type checked at compile time:

-- fields rec3           gives  ["foo"], the common field names
-- field_ @"foo" rec3    gives  10, the field value for rec1

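Note that, in general, reduce r1 r2 and reduce r2 r1 will return not only different values but also different types, if the order and/or the types of the commonly named fields differ between r1 and r2. Changing this behavior would probably require revisiting those early and far-reaching design decisions I mentioned earlier.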
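The left bias is easy to see in a term-level version of common run over plain association lists. In this sketch the strings in the second component merely stand in for field types, and the two sample lists are hypothetical:

```haskell
-- Term-level version of the left-biased intersection: keep the entries
-- of the first list whose key also occurs in the second.
common :: Eq k => [(k, a)] -> [(k, b)] -> [(k, a)]
common [] _ = []
common (x@(name, _) : xs) ys
  | name `elem` map fst ys = x : common xs ys
  | otherwise              = common xs ys

main :: IO ()
main = do
  let r1 = [("bar", "String"), ("foo", "Int")]
      r2 = [("foo", "Integer"), ("quux", "Maybe Double")]
  print (common r1 r2) -- keeps the "foo" entry from r1
  print (common r2 r1) -- keeps the "foo" entry from r2
```

Swapping the arguments changes which record's entry survives, which at the type level means a different result type.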

For convenience, here is the whole program, tested with Stack lts-10.5 (which uses singletons 2.3.1):

{-# LANGUAGE AllowAmbiguousTypes       #-}
{-# LANGUAGE GADTs                     #-}
{-# LANGUAGE KindSignatures            #-}
{-# LANGUAGE ScopedTypeVariables       #-}
{-# LANGUAGE TemplateHaskell           #-}
{-# LANGUAGE TypeApplications          #-}
{-# LANGUAGE TypeFamilies              #-}
{-# LANGUAGE TypeInType                #-}
{-# LANGUAGE TypeOperators             #-}
{-# LANGUAGE UndecidableInstances      #-}
{-# OPTIONS_GHC -Wincomplete-patterns  #-}

module Records where

import Data.Singletons.Prelude
import Data.Singletons.TH
import Data.Singletons.TypeLits
import Data.Function ((&))
import Data.Kind (Type)
import Data.List (intersect)
import qualified Data.Text as Text

data Record :: [(Symbol, Type)] -> Type where
  Record :: Record '[]
  With :: SSymbol s -> t -> Record fs -> Record ('(s, t) : fs)

with_ :: forall s t fs . (SingI s) => t -> Record fs -> Record ('(s, t) : fs)
with_ = With sing

rec1 :: Record '[ '("bar", [Char]), '("foo", Int)]
rec1 = Record & with_ @"foo" (10 :: Int)
              & with_ @"bar" "Hello, world"
-- i.e., rec1 = { foo = 10, bar = "Hello, world" } :: { foo :: Int, bar :: String }

rec2 :: Record '[ '("quux", Maybe Double), '("foo", Int)]
rec2 = Record & with_ @"foo" (20 :: Int)
              & with_ @"quux" (Just 1.0 :: Maybe Double)
-- i.e., rec2 = { foo = 20, quux = Just 1.0 } :: { foo :: Int, quux :: Maybe Double }

field :: forall s t fs . (Lookup s fs ~ Just t) => SSymbol s -> Record fs -> t
field s (With s' t r)
  = case s %:== s' of
      STrue -> t
      SFalse -> field s r

field_ :: forall s t fs . (Lookup s fs ~ Just t, SingI s) => Record fs -> t
field_ = field @s sing

exField = field_ @"foo" rec1
-- badField = field_ @"baz" rec1  -- gives: Couldn't match type Nothing with Just t

lookup'                  :: (Eq a) => a -> [(a,b)] -> Maybe b
lookup' _key []          =  Nothing
lookup'  key ((x,y):xys) = if key == x then Just y else lookup' key xys

$(singletons [d|
  names :: [(Symbol, Type)] -> [Symbol]
  names = map fst
  |])

data SomeUIType fs = SomeUIType -- a UI for the given compile-time list of fields
recordUI :: Record fs -> SomeUIType (Names fs)
recordUI _ = SomeUIType

sFields :: Record fs -> Sing (Names fs)
sFields Record = SNil
sFields (With s _ r) = SCons s (sFields r)

fields :: Record fs -> [Text.Text]
fields = fromSing . sFields

rec12common = intersect (fields rec1) (fields rec2)
-- value:  ["foo"]

$(singletons [d|
  common :: [(Symbol,Type)] -> [(Symbol,Type)] -> [(Symbol,Type)]
  common [] _ = []
  common (x@(a,b):xs) ys
    = if elem a (map fst ys)
      then x:common xs ys
      else   common xs ys
  |])

reduce :: Record fs1 -> Record fs2 -> Record (Common fs1 fs2)
reduce Record _ = Record
reduce (With s x r1) r2
  = case sElem s (sFields r2) of STrue  -> With s x (reduce r1 r2)
                                 SFalse -> reduce r1 r2

rec3 :: Record '[ '("foo", Int)]
rec3 = reduce rec1 rec2
-- fields rec3           gives  ["foo"], the common field names
-- field_ @"foo" rec3    gives  10, the field value for rec1

score -2 · Accepted Answer

Well, since your function really returns an array of strings, the return type should just be Array String.

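The types of the arguments would be generic, since you don't know the types up front. If you really want to make sure that these types are actually records, you can make your generic parameters not the records themselves but type rows, and then type the value parameters as Record a.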

So:

myImaginaryFunction :: forall a b. Record a -> Record b -> Array String

That is how you would type such a function.

Or was your question really about how to implement it?

Also: have you noticed that cheating (by adding the Haskell tag) didn't really get you any help, only some scolding? Please don't do that. Respect the community.
