c# - Is a pipeline with a changing data type architecturally sound?

Question

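I'm working on the architecture for what is essentially a document parsing and analysis framework. Given the lines of a document, the framework will ultimately produce one large object (call it Document) that represents the document.

Early filters in the pipeline will need to operate line by line. Filters further down, however, will need to transform (and ultimately produce) the Document object.

To implement this, I was thinking of using a filter definition like this: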

public interface IFilter<in TIn, out TOut> {
    TOut Execute(TIn data);
}

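All filters will be registered with a PipelineManager class (as opposed to using a "linked list" style approach). Before executing, PipelineManager will verify the integrity of the pipeline to ensure that no filter is given the wrong input type.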
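The validation step described above could be sketched roughly as follows. This is only one possible shape, not the asker's actual design: the Stage wrapper, the Register/Validate/Execute method names, and the TrimFilter and LengthFilter example filters are all hypothetical. The idea is to record each filter's input and output types at registration and check that every stage can accept what the previous stage produces.

```csharp
using System;
using System.Collections.Generic;

public interface IFilter<in TIn, out TOut>
{
    TOut Execute(TIn data);
}

// Hypothetical type-erased record of one registered filter.
public sealed class Stage
{
    public Type InputType { get; private set; }
    public Type OutputType { get; private set; }
    public Func<object, object> Run { get; private set; }

    public static Stage From<TIn, TOut>(IFilter<TIn, TOut> filter)
    {
        return new Stage
        {
            InputType = typeof(TIn),
            OutputType = typeof(TOut),
            Run = input => (object)filter.Execute((TIn)input)
        };
    }
}

// Hypothetical manager: filters are registered in order, then validated.
public sealed class PipelineManager
{
    private readonly List<Stage> _stages = new List<Stage>();

    public PipelineManager Register<TIn, TOut>(IFilter<TIn, TOut> filter)
    {
        _stages.Add(Stage.From(filter));
        return this;
    }

    // Throws if any stage cannot accept the previous stage's output type.
    public void Validate()
    {
        for (int i = 1; i < _stages.Count; i++)
        {
            Type produced = _stages[i - 1].OutputType;
            Type expected = _stages[i].InputType;
            if (!expected.IsAssignableFrom(produced))
            {
                throw new InvalidOperationException(
                    "Stage " + i + " expects " + expected.Name +
                    " but stage " + (i - 1) + " produces " + produced.Name + ".");
            }
        }
    }

    public object Execute(object input)
    {
        Validate();
        object current = input;
        foreach (Stage stage in _stages)
        {
            current = stage.Run(current);
        }
        return current;
    }
}

// Example filters, invented for illustration only.
public sealed class TrimFilter : IFilter<string, string>
{
    public string Execute(string data) { return data.Trim(); }
}

public sealed class LengthFilter : IFilter<string, int>
{
    public int Execute(string data) { return data.Length; }
}

public static class Program
{
    public static void Main()
    {
        PipelineManager pipeline = new PipelineManager()
            .Register(new TrimFilter())
            .Register(new LengthFilter());
        Console.WriteLine(pipeline.Execute("  hi  ")); // prints 2
    }
}
```

Registering LengthFilter before TrimFilter would make Validate throw, because a stage that expects a string cannot consume an int. The price of this approach is that the check happens at run time and values travel between stages as object.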

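My question: is it architecturally sound (i.e. a good idea) to have a pipeline whose data type changes along the way?

P.S. The reason I'm implementing my application as a pipeline is that I feel it will be easy for plugin authors to replace or extend existing filters. Just swap out the filter you want to change for a different implementation and you're set.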

score 4 · Accepted Answer

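EDIT: Note, I deleted my other answer to replace it with this wall'o'text (grin)

NINJAEDIT: Fun fact: PowerShell (mentioned in @Loudenvier's answer) used to be named "Monad". Also, I found Wes Dyer's blog post on the topic: The Marvels of Monads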

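One very, VERY simplistic way of looking at this whole "Monad" thing is to think of it as a box with a very basic interface:

Return
Bind
Zero (optional)

Using it is conceptually just as simple. Say you have a "thing":

You can wrap your "thing" in the box (this would be the "Return") and have a "BoxOfThing"
You can give instructions on how to take the thing out of this box and put it into another box (Bind)
You can get an empty box (the "Zero": think of it as a kind of "no-op", like multiplying by one or adding zero)
(there are other rules, but these three are the most interesting)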

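The Bind bit is the really interesting part, and also the part that makes most people's heads explode; basically, you are giving a specification of sorts for how to chain boxes together. Let's take a fairly simple Monad, the "Option" or "Maybe": a bit like Nullable<T>, but way cooler.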

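Everybody hates checking for null everywhere, but we're forced to because of the way reference types work. What we'd love is to be able to write code like this: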

var zipcodesNearby = order.Customer.Address.City.ZipCodes;

And either get back a valid answer if (customer is valid + address is valid + ...), or get back "Nothing" if any part of that logic fails... but no, we need to:

List<string> zipcodesNearBy = new List<string>();
if(goodOrder.Customer != null)
{
    if(goodOrder.Customer.Address != null)
    {
        if(goodOrder.Customer.Address.City != null)
        {
            if(goodOrder.Customer.Address.City.ZipCodes != null)
            {
                zipcodesNearBy = goodOrder.Customer.Address.City.ZipCodes;
            }
            else { /* do something else? throw? */ }
        }
        else { /* do something else? throw? */ }
    }
    else { /* do something else? throw? */ }
}
else { /* do something else? throw? */ }

(note: you can also rely on null coalescing where applicable, although it looks pretty nasty)

List<string> nullCoalescingZips = 
    ((((goodOrder ?? new Order())
        .Customer ?? new Person())
            .Address ?? new Address())
                .City ?? new City())
                    .ZipCodes ?? new List<string>();
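For comparison, C# 6 and later provide the null-conditional operator (?.), which short-circuits to null as soon as any link in the chain is null. The sketch below is an aside rather than part of the monad discussion; the Order, Person, Address and City class shapes are assumptions inferred from the property chain above, and ZipLookup is a made-up helper name.

```csharp
using System;
using System.Collections.Generic;

// Assumed shapes, inferred from the property chain used in this answer.
public class City { public List<string> ZipCodes { get; set; } }
public class Address { public City City { get; set; } }
public class Person { public Address Address { get; set; } }
public class Order { public Person Customer { get; set; } }

public static class ZipLookup
{
    // Returns the zip codes if every link in the chain is non-null,
    // otherwise an empty list.
    public static List<string> ZipCodesFor(Order order)
    {
        return order?.Customer?.Address?.City?.ZipCodes ?? new List<string>();
    }
}
```

This covers the null case specifically; the Maybe approach that follows generalizes the same short-circuiting idea to any kind of "box".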

The Maybe monad "rules" might look a bit like this:

(note: C# is NOT ideal for this kind of type mangling, so it gets a bit wonky)

public static Maybe<T> Return(T value)
{
    return ReferenceEquals(value, null) ? Maybe<T>.Nothing : new Maybe<T>() { Value = value };
}
public static Maybe<U> Bind<U>(Maybe<T> me, Func<T, Maybe<U>> map)
{
    return me != Maybe<T>.Nothing ?
        // extract, map, and rebox
        map(me.Value) :
        // We have nothing, so we pass along nothing...
        Maybe<U>.Nothing;
}

But this leads to some NASTY code:

var result1 = 
    Maybe<string>.Bind(Maybe<string>.Return("hello"), hello =>
        Maybe<string>.Bind(Maybe<string>.Return((string)null), doh =>
            Maybe<string>.Bind(Maybe<string>.Return("world"), world =>
                // each lambda must hand back a Maybe, so rebox the result
                Maybe<string>.Return(hello + doh + world))));

Luckily, there's a neat shortcut: SelectMany is very roughly equivalent to "Bind".

If we implement SelectMany for our Maybe<T>...

public class Maybe<T>
{
    public static readonly Maybe<T> Nothing = new Maybe<T>();
    private Maybe() {}
    public T Value { get; private set;}
    public Maybe(T value) { Value = value; }
}
public static class MaybeExt
{
    public static bool IsNothing<T>(this Maybe<T> me)
    {
        return me == Maybe<T>.Nothing;
    }
    public static Maybe<T> May<T>(this T value)
    {
        return ReferenceEquals(value, null) ? Maybe<T>.Nothing : new Maybe<T>(value);
    }
    // Note: this is basically just "Bind"
    public static Maybe<U> SelectMany<T,U>(this Maybe<T> me, Func<T, Maybe<U>> map)
    {
        return me != Maybe<T>.Nothing ?
            // extract, map, and rebox
            map(me.Value) :
            // We have nothing, so we pass along nothing...
            Maybe<U>.Nothing;
    }
    // This overload is the one that "turns on" query comprehension syntax...
    public static Maybe<V> SelectMany<T,U,V>(this Maybe<T> me, Func<T, Maybe<U>> map, Func<T,U,V> selector)
    {
        return me.SelectMany(x => map(x).SelectMany(y => selector(x,y).May()));
    }
}

And now we can piggyback on LINQ comprehension syntax!

var result1 = 
    from hello in "Hello".May()
    from oops in ((string)null).May()
    from world in "world".May()
    select hello + oops + world;
// prints "Was Nothing!"
Console.WriteLine(result1.IsNothing() ? "Was Nothing!" : result1.Value);

var result2 = 
    from hello in "Hello".May()
    from space in " ".May()
    from world in "world".May()
    select hello + space + world;
// prints "Hello world"
Console.WriteLine(result2.IsNothing() ? "Was Nothing!" : result2.Value);

var goodOrder = new Order { Customer = new Person { Address = new Address { City = new City { ZipCodes = new List<string>{"90210"}}}}};
var badOrder = new Order { Customer = new Person { Address = null }};

var zipcodesNearby = 
    from ord in goodOrder.May()
    from cust in ord.Customer.May()     
    from add in cust.Address.May()
    from city in add.City.May()
    from zip in city.ZipCodes.May()
    select zip;
// prints "90210"
Console.WriteLine(zipcodesNearby.IsNothing() ? "Nothing!" : zipcodesNearby.Value.FirstOrDefault());

var badZipcodesNearby = 
    from ord in badOrder.May()
    from cust in ord.Customer.May()     
    from add in cust.Address.May()
    from city in add.City.May()
    from zip in city.ZipCodes.May()
    select zip;
// prints "Nothing!"
Console.WriteLine(badZipcodesNearby.IsNothing() ? "Nothing!" : badZipcodesNearby.Value.FirstOrDefault());

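Hah, just realized I forgot to mention the point of all this... so basically, once you've figured out what the equivalent of "bind" is at each stage of your pipeline, you can use the same kind of pseudo-monadic code to handle the wrapping, unwrapping, and processing for each of your type transformations.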
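Applied to the pipeline in the question, that might look something like the sketch below. The Document class and the Clean and Build stages are hypothetical stand-ins, and Maybe<T> is repeated in trimmed form so the snippet compiles on its own. Each stage takes one type and returns a Maybe of the next type, and SelectMany does the chaining, so a stage that yields Nothing short-circuits everything after it.

```csharp
using System;
using System.Collections.Generic;

public sealed class Maybe<T>
{
    public static readonly Maybe<T> Nothing = new Maybe<T>();
    private Maybe() { }
    public Maybe(T value) { Value = value; }
    public T Value { get; private set; }
}

public static class MaybeExt
{
    public static bool IsNothing<T>(this Maybe<T> me)
    {
        return me == Maybe<T>.Nothing;
    }
    public static Maybe<T> May<T>(this T value)
    {
        return ReferenceEquals(value, null) ? Maybe<T>.Nothing : new Maybe<T>(value);
    }
    // "Bind": run the next stage only if there is a value to give it.
    public static Maybe<U> SelectMany<T, U>(this Maybe<T> me, Func<T, Maybe<U>> map)
    {
        return me.IsNothing() ? Maybe<U>.Nothing : map(me.Value);
    }
}

// Hypothetical end product of the pipeline.
public sealed class Document
{
    public List<string> Lines { get; set; }
}

// Hypothetical stages: each one changes the type flowing through the pipeline.
public static class Stages
{
    // Line-level stage: string[] -> List<string>; Nothing if no content remains.
    public static Maybe<List<string>> Clean(string[] raw)
    {
        var kept = new List<string>();
        foreach (string line in raw)
        {
            if (!string.IsNullOrWhiteSpace(line)) kept.Add(line.Trim());
        }
        return kept.Count == 0 ? Maybe<List<string>>.Nothing : kept.May();
    }

    // Document-level stage: List<string> -> Document.
    public static Maybe<Document> Build(List<string> lines)
    {
        var doc = new Document { Lines = lines };
        return doc.May();
    }
}

public static class Program
{
    public static Maybe<Document> Parse(string[] raw)
    {
        return raw.May()
            .SelectMany(r => Stages.Clean(r))
            .SelectMany(lines => Stages.Build(lines));
    }

    public static void Main()
    {
        Maybe<Document> doc = Parse(new[] { " a ", "", "b" });
        Console.WriteLine(doc.IsNothing() ? "Nothing!" : doc.Value.Lines.Count.ToString()); // prints 2
    }
}
```

Unlike a registry of type-erased filters, a mismatch between stages here is a compile-time error, because the compiler checks each lambda's input against the previous stage's output.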

score 2 · Accepted Answer

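This won't answer your question, but a great place to look for pipeline inspiration in the .NET world is PowerShell. They implemented the pipeline model in a very clever way, and the objects that flow through the pipeline change type all the time.

In the past I had to build a database-to-PDF document creation pipeline, and I did it as a set of PowerShell cmdlets. It was so extensible that it is still actively used and developed years later; it has only had to migrate from PowerShell 1 to 2, and now possibly to 3.

You can get great ideas here: http://blogs.technet.com/b/heyscriptingguy/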
