
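I'm looking for a query to help clean up an old database that doesn't have many relationships defined. It doesn't need to be perfect; it only needs to guide me as I start cleaning up and begin enforcing data integrity.

I'm assuming that all tables have proper primary keys, and that columns in tables that are not yet related share the same name.

In theory, I could have three tables, one of which has a composite key (I wouldn't have chosen to design the database this way, but I'm limited in how far I can clean it up, and these kinds of composite primary/foreign keys are common):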

Case.CaseId (PK)
Workstep.WorkstepId (PK)
Workstep.CaseId (PK,FK)
WorkQueue.CaseId (not related to Case.CaseId, but it should be)
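
For anyone who wants to reproduce this, the example schema above could be sketched roughly as follows. The data types, the `WorkQueueId` column, and the constraint names (`PK_Case`, `PK_Workstep`, `FK_Workstep_Case`, `PK_WorkQueue`) are assumptions made for illustration; only the table and key column names come from the list above.

```sql
-- Hypothetical DDL for the three example tables (types and constraint names are assumed).
-- Case is a reserved word in T-SQL, so the table name is bracketed.
CREATE TABLE [Case] (
    CaseId INT NOT NULL,
    CONSTRAINT PK_Case PRIMARY KEY (CaseId)
);

-- Composite primary key; CaseId is also a foreign key to [Case].
CREATE TABLE Workstep (
    WorkstepId INT NOT NULL,
    CaseId     INT NOT NULL,
    CONSTRAINT PK_Workstep PRIMARY KEY (WorkstepId, CaseId),
    CONSTRAINT FK_Workstep_Case FOREIGN KEY (CaseId) REFERENCES [Case] (CaseId)
);

-- CaseId here has no foreign key, which is the gap the query should report.
CREATE TABLE WorkQueue (
    WorkQueueId INT NOT NULL,  -- assumed surrogate key, not part of the original example
    CaseId      INT NOT NULL,
    CONSTRAINT PK_WorkQueue PRIMARY KEY (WorkQueueId)
);
```

The `PK_` and `FK_` prefixes matter here because the query below identifies keys by constraint name.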

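What I'd like to do is run a query that returns the table name, the column name, and the primary key that the column is not related to but should be, for example:

Table Name, Column Name, Should Be Related To Primary Key
WorkQueue, CaseId, Case.CaseId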

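Below is the SQL I'm using, but it returns every matching primary key, even one that is both a primary key and part of a foreign key. Using my example again with the SQL below, I get 2 rows instead of 1:

Table Name, Column Name, Should Be Related To Primary Key
WorkQueue, CaseId, Workstep.CaseId (I don't want this row, because that column is itself related to the "real" primary key, Case.CaseId)
WorkQueue, CaseId, Case.CaseId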

    SELECT
        SubqueryAllPotentialForeignKeys.TABLE_NAME
        ,SubqueryAllPotentialForeignKeys.COLUMN_NAME
        ,(PrimaryKeys.TABLE_NAME + '.' + PrimaryKeys.COLUMN_NAME) as 'Possible Primary Key'

    --all potential foreign keys (column name matches another column name but there is no reference from this column anywhere else)
    FROM
        (
        SELECT
            INFORMATION_SCHEMA.COLUMNS.TABLE_NAME
            ,INFORMATION_SCHEMA.COLUMNS.COLUMN_NAME
        FROM
            INFORMATION_SCHEMA.COLUMNS
        WHERE
            --only get columns that are in multiple tables
            INFORMATION_SCHEMA.COLUMNS.COLUMN_NAME IN
            (
                SELECT COLUMN_NAME FROM
                    (SELECT COLUMN_NAME, COUNT(COLUMN_NAME) AS ColNameCount FROM INFORMATION_SCHEMA.COLUMNS GROUP BY COLUMN_NAME) AS SubQueryColumns
                WHERE ColNameCount > 1
            )

            --only get the table.column if not part of a foreign or primary key
            EXCEPT
            (
                SELECT TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE
            )

        ) AS SubqueryAllPotentialForeignKeys

    LEFT JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS PrimaryKeys ON
        SubqueryAllPotentialForeignKeys.COLUMN_NAME = PrimaryKeys.COLUMN_NAME

    --when finding possible keys for our columns that don't have references, limit to primary keys
    WHERE
        PrimaryKeys.CONSTRAINT_NAME LIKE '%PK_%'

    ORDER BY TABLE_NAME, COLUMN_NAME

1 Answer


This may not be the prettiest thing in the world, but it works well:

    SELECT * FROM
    (
        SELECT
            SubqueryAllPotentialForeignKeys.TABLE_NAME
            ,SubqueryAllPotentialForeignKeys.COLUMN_NAME
            ,(PrimaryKeys.TABLE_NAME + '.' + PrimaryKeys.COLUMN_NAME) as 'Possible Primary Key'

        --all potential foreign keys (column name matches another column name but there is no reference from this column anywhere else)
        FROM
            (
            SELECT
                INFORMATION_SCHEMA.COLUMNS.TABLE_NAME
                ,INFORMATION_SCHEMA.COLUMNS.COLUMN_NAME
            FROM
                INFORMATION_SCHEMA.COLUMNS
            WHERE
                --only get columns that are in multiple tables
                INFORMATION_SCHEMA.COLUMNS.COLUMN_NAME IN
                (
                    SELECT COLUMN_NAME FROM
                        (SELECT COLUMN_NAME, COUNT(COLUMN_NAME) AS ColNameCount FROM INFORMATION_SCHEMA.COLUMNS GROUP BY COLUMN_NAME) AS SubQueryColumns
                    WHERE ColNameCount > 1
                )

                --only get the table.column if not part of a foreign or primary key
                EXCEPT
                (
                    SELECT TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE
                )

            ) AS SubqueryAllPotentialForeignKeys

        LEFT JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS PrimaryKeys ON
            SubqueryAllPotentialForeignKeys.COLUMN_NAME = PrimaryKeys.COLUMN_NAME

        --when finding possible keys for our columns that don't have references, limit to primary keys
        WHERE
            PrimaryKeys.CONSTRAINT_NAME LIKE '%PK_%'
    ) AS Subquery

    --exclude all keys that are primary but also foreign
    WHERE [Possible Primary Key] NOT IN
        (
            SELECT (TABLE_NAME + '.' + COLUMN_NAME)
            FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE
            WHERE CONSTRAINT_NAME LIKE 'FK_%'
        )

    ORDER BY TABLE_NAME, COLUMN_NAME
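
The query above depends on constraints being named with `PK_` and `FK_` prefixes. If that naming can't be relied on, a variation along the following lines might work instead. It is only a sketch, not tested against the original database: it assumes SQL Server, reads the key type from `INFORMATION_SCHEMA.TABLE_CONSTRAINTS.CONSTRAINT_TYPE`, and the `KeyColumns` CTE name and the table aliases are made up for illustration. Unlike the query above, it only skips columns that belong to a primary or foreign key, not columns covered by other constraint types.

```sql
-- Sketch: classify key columns by CONSTRAINT_TYPE rather than by constraint name.
WITH KeyColumns AS (
    SELECT kcu.TABLE_SCHEMA, kcu.TABLE_NAME, kcu.COLUMN_NAME, tc.CONSTRAINT_TYPE
    FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS kcu
    JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc
        ON  tc.CONSTRAINT_SCHEMA = kcu.CONSTRAINT_SCHEMA
        AND tc.CONSTRAINT_NAME   = kcu.CONSTRAINT_NAME
    WHERE tc.CONSTRAINT_TYPE IN ('PRIMARY KEY', 'FOREIGN KEY')
)
SELECT
    c.TABLE_NAME
    ,c.COLUMN_NAME
    ,(pk.TABLE_NAME + '.' + pk.COLUMN_NAME) AS [Possible Primary Key]
FROM INFORMATION_SCHEMA.COLUMNS AS c
JOIN KeyColumns AS pk
    ON  pk.COLUMN_NAME     = c.COLUMN_NAME
    AND pk.CONSTRAINT_TYPE = 'PRIMARY KEY'
WHERE
    --the candidate column is not itself part of a primary or foreign key
    NOT EXISTS (
        SELECT 1 FROM KeyColumns AS k
        WHERE k.TABLE_SCHEMA = c.TABLE_SCHEMA
          AND k.TABLE_NAME   = c.TABLE_NAME
          AND k.COLUMN_NAME  = c.COLUMN_NAME
    )
    --the suggested primary key column is not also a foreign key
    AND NOT EXISTS (
        SELECT 1 FROM KeyColumns AS fk
        WHERE fk.CONSTRAINT_TYPE = 'FOREIGN KEY'
          AND fk.TABLE_SCHEMA    = pk.TABLE_SCHEMA
          AND fk.TABLE_NAME      = pk.TABLE_NAME
          AND fk.COLUMN_NAME     = pk.COLUMN_NAME
    )
ORDER BY c.TABLE_NAME, c.COLUMN_NAME;
```

For the three-table example in the question, this should keep the WorkQueue.CaseId to Case.CaseId suggestion and drop the Workstep.CaseId one, because Workstep.CaseId is also part of a foreign key.
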
Answered 2012-07-26T14:05:02.840