<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
    <Package Name="Extraction_RecordCount" ConstraintMode="Parallel">
        <Tasks>
            <ExecuteSQL Name="Extraction_RecordCount" ConnectionName="Target">
                <DirectInput> <![CDATA[ Truncate table CMC.Extraction_RecordCount ]]> </DirectInput>
            </ExecuteSQL>
            <Dataflow Name="Fill Extraction_RecordCount">
                <PrecedenceConstraints>
                    <Inputs>
                        <Input OutputPathName="Extraction_RecordCount.Output" />
                    </Inputs>
                </PrecedenceConstraints>
                <Transformations>
                    <OleDbSource Name="ExtractedTables" ConnectionName="Target" >
                        <DirectInput>
                            <![CDATA[
                            SELECT cast( sysobjects.NAME as nvarchar(128)) as TableName 
                                ,sysindexes.Rows as #Rows 
                            FROM sysobjects 
                            INNER JOIN sysindexes ON sysobjects.id = sysindexes.id 
                            INNER JOIN ( SELECT c.table_name ,c.table_schema FROM information_schema.columns c GROUP BY c.table_name ,c.table_schema) c ON c.table_name = sysobjects.NAME 
                            WHERE type = 'U' 
                                AND sysindexes.IndId < 2 
                                AND c.table_schema = 'EXT' 
                            ORDER BY TableName, #Rows
                            ]]>
                        </DirectInput>
                    </OleDbSource>
                    <OleDbSource Name="BackOffice" ConnectionName="Source" >
                        <DirectInput> <![CDATA[ select TABLE_NAME  , cast(NUM_ROWS as int) as NUM_ROWS from ALL_ALL_TABLES ORDER BY TABLE_NAME, NUM_ROWS]]> </DirectInput>
                    </OleDbSource>
                    <MergeJoin Name="Join Extracted Tables w BACKOFFICE" JoinType="InnerJoin">
                        <LeftInputPath OutputPathName="ExtractedTables.Output">
                            <Columns>
                                <Column SourceColumn="TableName" SortKeyPosition="1"/>
                                <Column SourceColumn="#Rows" SortKeyPosition="2"/>
                            </Columns>
                        </LeftInputPath>
                        <RightInputPath OutputPathName="BackOffice.Output">
                            <Columns>
                                <Column SourceColumn="TABLE_NAME" SortKeyPosition="1"/>
                                <Column SourceColumn="NUM_ROWS" SortKeyPosition="2" />
                            </Columns>
                        </RightInputPath>
                        <JoinKeys>
                            <JoinKey LeftColumn="TableName" RightColumn="TABLE_NAME" />
                        </JoinKeys>
                    </MergeJoin>
                    <OleDbDestination Name="Extraction_RecordCount" ConnectionName="Target">
                        <ExternalTableOutput Table="CMC.Extraction_RecordCount"/>
                    </OleDbDestination>
                </Transformations>
            </Dataflow>
        </Tasks>
    </Package>
</Packages>
</Biml>

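This code does generate the package "Extraction_RecordCount", but the Merge Join component throws an error stating that the inputs from both sources must be sorted. Manually setting 'IsSorted' = 'True' and setting 'SortKeyPosition' in the generated package works around the problem temporarily.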

Inserting a Sort component doesn't work either.


1 Answer


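Merge Join requires that its sources be sorted. Your current code specifies that the output of the Merge Join transformation is sorted. Instead, you want to indicate that the inputs to the Merge Join are sorted.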

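Your source data is sorted; I see that both queries have an explicit ORDER BY. What you're missing is the specification on the source components that their output is sorted.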

                <OleDbSource Name="ExtractedTables" ConnectionName="Target" >
                    <DirectInput>
                        <![CDATA[
                        SELECT cast( sysobjects.NAME as nvarchar(128)) as TableName 
                            ,sysindexes.Rows as #Rows 
                        FROM sysobjects 
                        INNER JOIN sysindexes ON sysobjects.id = sysindexes.id 
                        INNER JOIN ( SELECT c.table_name ,c.table_schema FROM information_schema.columns c GROUP BY c.table_name ,c.table_schema) c ON c.table_name = sysobjects.NAME 
                        WHERE type = 'U' 
                            AND sysindexes.IndId < 2 
                            AND c.table_schema = 'EXT' 
                        ORDER BY TableName, #Rows
                        ]]>
                    </DirectInput>
                    <Columns>
                        <Column SourceColumn="TableName" SortKeyPosition="1" />
                        <Column SourceColumn="#Rows" SortKeyPosition="2" />
                    </Columns>
                </OleDbSource>
                <OleDbSource Name="BackOffice" ConnectionName="Source" >
                    <DirectInput> <![CDATA[ select TABLE_NAME  , cast(NUM_ROWS as int) as NUM_ROWS from ALL_ALL_TABLES ORDER BY TABLE_NAME, NUM_ROWS]]> </DirectInput>
                    <Columns>
                        <Column SourceColumn="TABLE_NAME" SortKeyPosition="1" />
                        <Column SourceColumn="NUM_ROWS" SortKeyPosition="2" />
                    </Columns>
                </OleDbSource>

I'm not 100% sure that #Rows in your first query is a valid column name, but the important thing is to mark the columns as sorted, by column name.
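
If the #Rows name does turn out to cause trouble, one option is to give the column a plainer alias. The following is only a sketch of the first source under that assumption: RowCnt is a hypothetical alias that does not appear in the original question, and everything else is copied from the source above.

```xml
<OleDbSource Name="ExtractedTables" ConnectionName="Target">
    <DirectInput>
        <![CDATA[
        -- RowCnt is a hypothetical alias standing in for #Rows
        SELECT cast( sysobjects.NAME as nvarchar(128)) as TableName
            ,sysindexes.Rows as RowCnt
        FROM sysobjects
        INNER JOIN sysindexes ON sysobjects.id = sysindexes.id
        INNER JOIN ( SELECT c.table_name ,c.table_schema FROM information_schema.columns c GROUP BY c.table_name ,c.table_schema) c ON c.table_name = sysobjects.NAME
        WHERE type = 'U'
            AND sysindexes.IndId < 2
            AND c.table_schema = 'EXT'
        ORDER BY TableName, RowCnt
        ]]>
    </DirectInput>
    <!-- Declare the sort on the source output, matching the ORDER BY above -->
    <Columns>
        <Column SourceColumn="TableName" SortKeyPosition="1" />
        <Column SourceColumn="RowCnt" SortKeyPosition="2" />
    </Columns>
</OleDbSource>
```

With a rename like this, any other place that refers to the column by name (the Merge Join column list, and presumably the mapping into CMC.Extraction_RecordCount) would need the same change.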

My answer to this DBA.StackExchange.com question has a complete end-to-end Merge Join example.

Answered 2014-10-06T19:25:52.477