
Well, as the title says: I want to use a Script Component destination and then use LINQ to select which rows to process for output.

For a bit more background, I have an ugly merged result set with a one-to-many relationship. The rows look something like this:

[ID] [Title]   [OneToManyDataID]
1    Item one   2
1    Item one   4
1    Item one   3
3    Item two   1
3    Item two   5

We'll call the object [Item]; it has the ID and Title columns, plus the [OneToMany] data.

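I was hoping I could throw the whole thing at a Script Component destination and then use LINQ to do something like group by item and take the data only from the highest OneToMany object. Something like: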

foreach (var item in Data.GroupBy(d => d.Item).Select(d => new { Item = d.Key })) {
    // Then pick out the highest OneToMany ID for that row to use with it.
}

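I realize there is probably a better LINQ query for this, but the point is that the Script Component in SSIS only seems to allow per-row work through the predefined ProcessInputRow method. I would like to decide exactly which rows get processed and which properties are passed to that method.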

How would I go about doing this?


1 Answer


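To restate your question: how do you make a Script Transformation stop processing row by row? By default, a Script Transformation is a synchronous component: 1 row in, 1 row out. You need to change it to an asynchronous component: 1 row in, 0 to many rows out.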

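On the Inputs and Outputs tab of the Script Transformation Editor, select your output collection, Output 0, and change the value of SynchronousInputID from whatever it currently is to None.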
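To make the synchronous versus asynchronous distinction concrete, here is a minimal sketch in plain C#, outside SSIS. The class and method names (`PipelineSketch`, `Synchronous`, `Asynchronous`) are hypothetical and only illustrate the shape of the two modes; they are not part of the SSIS API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PipelineSketch
{
    // Synchronous shape: every input row produces exactly one output row,
    // emitted immediately, so no row can be compared with any other row.
    public static IEnumerable<int> Synchronous(IEnumerable<int> rows)
    {
        foreach (int row in rows)
        {
            yield return row * 10;
        }
    }

    // Asynchronous shape: buffer all input rows first, then emit zero or
    // more rows once the input is exhausted (here: only the maximum).
    public static IEnumerable<int> Asynchronous(IEnumerable<int> rows)
    {
        List<int> buffered = rows.ToList();
        if (buffered.Count > 0)
        {
            yield return buffered.Max();
        }
    }

    public static void Main()
    {
        int[] input = { 2, 4, 3 };
        Console.WriteLine(string.Join(",", Synchronous(input)));  // 20,40,30
        Console.WriteLine(string.Join(",", Asynchronous(input))); // 4
    }
}
```

Only the second shape can answer a question like "which row has the highest value", which is why the output has to be detached from the input.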

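Don't throw stones at my LINQ code; I trust you can handle that part of the work. The purpose of this code block is to show how to collect rows for processing and then pass them on to downstream consumers after modifying them. I have commented the methods to help you understand what each one does in the script component lifecycle, but if you would rather read MSDN, they know a bit more than I do ;)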

using System;
using System.Data;
using System.Linq;
using System.Collections.Generic;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    /// <summary>
    /// Our LINQ-able thing.
    /// </summary>
    List<Data> data;

    /// <summary>
    /// Do our preexecute tasks, in particular, we will instantiate
    /// our collection.
    /// </summary>
    public override void PreExecute()
    {
        base.PreExecute();
        this.data = new List<Data>();
    }

    /// <summary>
    /// This method is called once the last row has hit.
    /// Since we can only find the highest OneToManyDataId
    /// after receiving all the rows, this is the only time we can
    /// send rows to the output buffer.
    /// </summary>
    public override void FinishOutputs()
    {
        base.FinishOutputs();
        CreateNewOutputRows();
    }

    /// <summary>
    /// Accumulate all the input rows into an internal LINQ-able
    /// collection
    /// </summary>
    /// <param name="Row">The buffer holding the current row</param>
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // There is probably a more graceful mechanism for spinning
        // up this struct.
        // You must also worry about columns that may be NULL (the
        // generated buffer exposes <Column>_IsNull properties for that).
        Data d = new Data();
        d.ID = Row.ID;
        d.Title = Row.Title;
        d.OneToManyId = Row.OneToManyDataID;
        this.data.Add(d);
    }

    /// <summary>
    /// This is the process to generate new rows. As we only want to
    /// generate rows once all the rows have arrived, only call this
    /// at the point our internal collection has accumulated all the
    /// input rows.
    /// </summary>
    public override void CreateNewOutputRows()
    {
        foreach (var item in this.data.GroupBy(d => d.ID).Select(d => new { Item = d.Key }))
        {
            // Then pick out the highest OneToMany ID for that row to use with it.
            // Magic happens
            // I don't "get" LINQ so I can't implement the poster's action;
            // maxOneToManyID and title below are hard-coded placeholders.
            int id = item.Item;
            int maxOneToManyID = 2;
            string title = string.Empty;
            Output0Buffer.AddRow();
            Output0Buffer.ID = id;
            Output0Buffer.OneToManyDataID = maxOneToManyID;
            Output0Buffer.Title = title;
        }
    }

}
/// <summary>
/// I think this works well enough to demo
/// </summary>
public struct Data
{
    public int ID { get; set; }
    public string Title { get; set; }
    public int OneToManyId { get; set; }
}
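The "highest OneToMany ID per item" selection the question asks for can be sketched in plain LINQ, outside SSIS. This is one possible query, not necessarily what the poster had in mind. The `Data` struct mirrors the one in the answer; the `HighestPerItem` class and its `Pick` method are hypothetical names introduced for illustration, and the sample rows are taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the answer's struct so the sketch compiles on its own.
public struct Data
{
    public int ID { get; set; }
    public string Title { get; set; }
    public int OneToManyId { get; set; }
}

public static class HighestPerItem
{
    // Group rows by item ID and keep, for each group, the row whose
    // OneToManyId is largest.
    public static List<Data> Pick(IEnumerable<Data> rows)
    {
        return rows
            .GroupBy(r => r.ID)
            .Select(g => g.OrderByDescending(r => r.OneToManyId).First())
            .ToList();
    }

    public static void Main()
    {
        var rows = new List<Data>
        {
            new Data { ID = 1, Title = "Item one", OneToManyId = 2 },
            new Data { ID = 1, Title = "Item one", OneToManyId = 4 },
            new Data { ID = 1, Title = "Item one", OneToManyId = 3 },
            new Data { ID = 3, Title = "Item two", OneToManyId = 1 },
            new Data { ID = 3, Title = "Item two", OneToManyId = 5 },
        };

        foreach (Data r in Pick(rows))
        {
            // Prints "1 Item one 4" and then "3 Item two 5".
            Console.WriteLine("{0} {1} {2}", r.ID, r.Title, r.OneToManyId);
        }
    }
}
```

Inside the script component, the same query could run over the accumulated list, with each resulting row copied to Output0Buffer after AddRow().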

[Screenshots omitted: configuration of the Script Transformation, the Inputs tab, the Outputs, and the results.]

Answered 2012-10-02T14:10:20.513