
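I want to write a T-SQL function that tests whether a table contains duplicate sets of rows, where some columns are compared and others are ignored.

For example, consider the following data set: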

BomID   PartNumber  ItemNumber  Quantity    UnitID
4164    10004001    10001419        1         33
4169    10004001    103599          1         33
4171    10004001    103601          1         33
4163    10004001    10001329       10         33
4166    10004001    101823          8         33
10794   10012161    10001419        1         33
10799   10012161    103599          1         33
10801   10012161    103601          1         33
10793   10012161    10001329       10         33
10796   10012161    101823          8         33

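I want to write a function Bom.f_GetPartsThatHaveAnIdenticalBom(partNumber) that, when passed 10004001, detects that 10012161 has a duplicate set of records, as determined by comparing the tuple (ItemNumber, Quantity, UnitID). The key field BomID is ignored. The function would therefore return a distinct list of the part numbers that have an identical BOM, if there are any.

I have already done this manually using various techniques. But since I seem to need this routine more and more often, I would like a function that is set-based, efficient, and composable with other tables in a LINQ to Entities query.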


3 Answers


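The following query uses a full outer join to compare the two sets. Any records that do not match produce NULL values on one side or the other. The comparison in the having clause filters these out.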

SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber
FROM bom b1 full outer join
     bom b2
     ON b1.ItemNumber = b2.ItemNumber AND
        b1.Quantity = b2.Quantity and
        b1.UnitID = b2.UnitID and
        b1.PartNumber <> b2.PartNumber
-- Note: filtering on b1 here drops rows where b1 is NULL (unmatched b2 rows)
WHERE b1.PartNumber = @PartNumber
GROUP BY b1.PartNumber, b2.PartNumber
having count(*) = count(b1.PartNumber) and
       count(*) = count(b2.PartNumber)

You can make this more efficient with an index on (itemnumber, quantity, unitid, partnumber).
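
As a minimal sketch of that suggestion, the index could be created along these lines. The index name IX_bom_Item_Qty_Unit_Part is hypothetical, and the table name bom is taken from the query above; adjust both to the real schema.

```sql
-- Hypothetical index name; column order follows the suggestion above,
-- so the join columns lead and PartNumber is available from the index.
CREATE NONCLUSTERED INDEX IX_bom_Item_Qty_Unit_Part
    ON bom (ItemNumber, Quantity, UnitID, PartNumber);
```

Whether the optimizer actually uses it depends on the data and the final shape of the query, so it is worth checking the execution plan.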

answered 2013-01-11T02:35:53.890

Here is a complete solution, based on a modified version of Bob's answer:

DECLARE @PartNumber AS udt_PartNumber; SET @PartNumber = N'10012163';

SELECT DISTINCT bom2.TargetPartNumber
FROM
    (
    SELECT PartNumber, COUNT(*) AS ItemCount
    FROM Part.BillsOfMaterials
    WHERE PartNumber = @PartNumber
    GROUP BY PartNumber
    ) AS bom1
JOIN
    (
    SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
    FROM Part.BillsOfMaterials b1
    RIGHT JOIN Part.BillsOfMaterials b2 ON b1.ItemNumber = b2.ItemNumber
                AND b1.Quantity = b2.Quantity
                AND b1.UnitID = b2.UnitID
                AND b1.PartNumber <> b2.PartNumber
    WHERE b1.PartNumber = @PartNumber
    GROUP BY b1.PartNumber, b2.PartNumber
    ) AS bom2 ON bom1.PartNumber = bom2.PartNumber
                AND bom1.ItemCount = bom2.ItemCount
WHERE bom1.ItemCount = (SELECT COUNT(*) FROM Part.BillsOfMaterials WHERE PartNumber = bom2.TargetPartNumber)
ORDER BY bom2.TargetPartNumber

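The only difference is the final WHERE clause, which ensures that no match is found if the target contains extra rows that do not exist in the BOM of the source part number.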
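
To illustrate the effect of that final WHERE clause, here is a small self-contained sketch. The table variable @bom and its part and item numbers are made up for illustration (they are not from the question's data), and a plain inner join is used, which is what the RIGHT JOIN reduces to once the WHERE on b1 is applied. It also assumes a part never lists the same (ItemNumber, Quantity, UnitID) tuple twice.

```sql
-- Hypothetical sample data: part 1 is the source BOM.
DECLARE @bom TABLE (PartNumber int, ItemNumber int, Quantity int, UnitID int);

INSERT INTO @bom (PartNumber, ItemNumber, Quantity, UnitID) VALUES
    (1, 100, 1, 33), (1, 200, 8, 33),                   -- source
    (2, 100, 1, 33), (2, 200, 8, 33), (2, 300, 1, 33),  -- source rows plus one extra
    (3, 100, 1, 33), (3, 200, 8, 33);                   -- identical to source

DECLARE @PartNumber int = 1;

SELECT DISTINCT bom2.TargetPartNumber
FROM
    (
    -- Number of rows in the source BOM
    SELECT PartNumber, COUNT(*) AS ItemCount
    FROM @bom
    WHERE PartNumber = @PartNumber
    GROUP BY PartNumber
    ) AS bom1
JOIN
    (
    -- Number of source rows matched by each other part
    SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
    FROM @bom b1
    JOIN @bom b2 ON b1.ItemNumber = b2.ItemNumber
                AND b1.Quantity = b2.Quantity
                AND b1.UnitID = b2.UnitID
                AND b1.PartNumber <> b2.PartNumber
    WHERE b1.PartNumber = @PartNumber
    GROUP BY b1.PartNumber, b2.PartNumber
    ) AS bom2 ON bom1.PartNumber = bom2.PartNumber
                AND bom1.ItemCount = bom2.ItemCount
-- Reject targets whose own row count differs from the source's
WHERE bom1.ItemCount = (SELECT COUNT(*) FROM @bom t WHERE t.PartNumber = bom2.TargetPartNumber);
```

Under these assumptions only part 3 should come back. Part 2 matches every source row, so it passes the join condition, but it has three rows of its own and the final WHERE rejects it.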

answered 2013-01-11T17:56:53.680

Here is a SQL statement that may work for you.

DECLARE @PartNumber int = 10004001

SELECT DISTINCT bom2.TargetPartNumber
FROM
    (
    SELECT PartNumber, COUNT(*) AS ItemCount
    FROM bom
    WHERE PartNumber = @PartNumber
    GROUP BY PartNumber
    ) AS bom1
JOIN
    (
    SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
    FROM bom b1
    JOIN bom b2 ON b1.ItemNumber = b2.ItemNumber
                AND b1.Quantity = b2.Quantity
                AND b1.UnitID = b2.UnitID
                AND b1.PartNumber <> b2.PartNumber
    WHERE b1.PartNumber = @PartNumber
    GROUP BY b1.PartNumber, b2.PartNumber
    ) AS bom2 ON bom1.PartNumber = bom2.PartNumber
                AND bom1.ItemCount = bom2.ItemCount
WHERE bom1.ItemCount = (SELECT COUNT(*) FROM bom WHERE PartNumber = bom2.TargetPartNumber)
ORDER BY bom2.TargetPartNumber

You can put this in a stored procedure or a function. @PartNumber represents the value you will pass to the function.
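
As a rough sketch of that suggestion, the query could be wrapped in an inline table-valued function, which can then be joined to other tables. The function name comes from the question; the int parameter type, an already existing Bom schema, and the table name dbo.bom are assumptions. The ORDER BY is dropped because an inline function cannot contain one without TOP, OFFSET, or FOR XML.

```sql
-- Assumes the Bom schema exists and the data lives in dbo.bom (both assumptions).
CREATE FUNCTION Bom.f_GetPartsThatHaveAnIdenticalBom (@PartNumber int)
RETURNS TABLE
AS
RETURN
(
    SELECT DISTINCT bom2.TargetPartNumber
    FROM
        (
        SELECT PartNumber, COUNT(*) AS ItemCount
        FROM dbo.bom
        WHERE PartNumber = @PartNumber
        GROUP BY PartNumber
        ) AS bom1
    JOIN
        (
        SELECT b1.PartNumber, b2.PartNumber AS TargetPartNumber, COUNT(*) AS ItemCount
        FROM dbo.bom b1
        JOIN dbo.bom b2 ON b1.ItemNumber = b2.ItemNumber
                    AND b1.Quantity = b2.Quantity
                    AND b1.UnitID = b2.UnitID
                    AND b1.PartNumber <> b2.PartNumber
        WHERE b1.PartNumber = @PartNumber
        GROUP BY b1.PartNumber, b2.PartNumber
        ) AS bom2 ON bom1.PartNumber = bom2.PartNumber
                    AND bom1.ItemCount = bom2.ItemCount
    WHERE bom1.ItemCount = (SELECT COUNT(*) FROM dbo.bom t
                            WHERE t.PartNumber = bom2.TargetPartNumber)
);
```

It could then be called like any table, for example `SELECT TargetPartNumber FROM Bom.f_GetPartsThatHaveAnIdenticalBom(10004001) ORDER BY TargetPartNumber;`, with the ordering applied by the caller.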

answered 2013-01-11T02:21:07.160