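I'm looking for a query to help clean up an old database that doesn't have many relationships defined. I don't need it to be perfect, just something to guide me as I start cleaning up and enforcing data integrity.
I'm assuming that all tables have a proper primary key, and that columns in tables that are not yet related share the same name.
In theory, I could have three tables, one of which has a composite key (I wouldn't choose to design a database this way, but I'm constrained while cleaning it up, and these kinds of composite primary/foreign keys are common):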
Case.CaseId (PK)
Workstep.WorkstepId (PK)
Workstep.CaseId (PK,FK)
WorkQueue.CaseId (not related to Case.CaseId, but it should be)
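For concreteness, the example layout above could be sketched as the following DDL. This is only an illustration: the data types, the constraint names (PK_Case, PK_Workstep, FK_Workstep_Case, PK_WorkQueue) and the WorkQueueId column are assumptions and are not taken from the real database. Case is bracketed because it is a reserved word in T-SQL.

```sql
-- Hypothetical DDL for the three example tables.
-- Data types, constraint names and WorkQueueId are assumptions for illustration only.
CREATE TABLE [Case] (
    CaseId INT NOT NULL,
    CONSTRAINT PK_Case PRIMARY KEY (CaseId)
);

CREATE TABLE Workstep (
    WorkstepId INT NOT NULL,
    CaseId     INT NOT NULL,
    -- composite primary key; CaseId is also a foreign key to [Case]
    CONSTRAINT PK_Workstep PRIMARY KEY (WorkstepId, CaseId),
    CONSTRAINT FK_Workstep_Case FOREIGN KEY (CaseId) REFERENCES [Case] (CaseId)
);

CREATE TABLE WorkQueue (
    WorkQueueId INT NOT NULL,  -- assumed key column, so the table has a primary key
    CaseId      INT NOT NULL,  -- no constraint yet; this is the column that should reference [Case].CaseId
    CONSTRAINT PK_WorkQueue PRIMARY KEY (WorkQueueId)
);
```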
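What I'd like to do is run a query and get results that give me something like the table name, the column name, and the primary key that the column is not related to but should be, for example:
Table Name, Column Name, Should Be Related to Primary Key
WorkQueue, CaseId, Case.CaseId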
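See the SQL I'm using below. The problem is that it returns any primary key, even one that is both a primary key and part of a foreign key. Using my example again with the SQL below, instead of 1 row I get 2:
Table Name, Column Name, Should Be Related to Primary Key
WorkQueue, CaseId, Workstep.CaseId (I don't want this row, because it is also related to the "real" primary key, Case.CaseId)
WorkQueue, CaseId, Case.CaseId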
SELECT
    SubqueryAllPotentialForeignKeys.TABLE_NAME
    ,SubqueryAllPotentialForeignKeys.COLUMN_NAME
    ,(PrimaryKeys.TABLE_NAME + '.' + PrimaryKeys.COLUMN_NAME) as 'Possible Primary Key'
--all potential foreign keys (column name matches another column name but there is no reference from this column anywhere else)
FROM
(
    SELECT
        INFORMATION_SCHEMA.COLUMNS.TABLE_NAME
        ,INFORMATION_SCHEMA.COLUMNS.COLUMN_NAME
    FROM
        INFORMATION_SCHEMA.COLUMNS
    WHERE
        --only get columns that are in multiple tables
        INFORMATION_SCHEMA.COLUMNS.COLUMN_NAME IN
        (
            SELECT COLUMN_NAME FROM
                (SELECT COLUMN_NAME, COUNT(COLUMN_NAME) AS ColNameCount FROM INFORMATION_SCHEMA.COLUMNS GROUP BY COLUMN_NAME) AS SubQueryColumns
            WHERE ColNameCount > 1
        )
    --only get the table.column if not part of a foreign or primary key
    EXCEPT
    (
        SELECT TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE
    )
) AS SubqueryAllPotentialForeignKeys
LEFT JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS PrimaryKeys ON
    SubqueryAllPotentialForeignKeys.COLUMN_NAME = PrimaryKeys.COLUMN_NAME
--when finding possible keys for our columns that don't have references, limit to primary keys
WHERE
    PrimaryKeys.CONSTRAINT_NAME LIKE '%PK_%'
ORDER BY TABLE_NAME, COLUMN_NAME
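As a rough, untested sketch of one direction for narrowing the primary-key side: the following lists primary-key columns that are not also part of a foreign key, identifying constraints by CONSTRAINT_TYPE rather than by a naming pattern such as '%PK_%'. It assumes SQL Server (suggested by the + string concatenation above) and uses the INFORMATION_SCHEMA.KEY_COLUMN_USAGE and INFORMATION_SCHEMA.TABLE_CONSTRAINTS views; the aliases pk, tc, fk and ftc are made up for this illustration.

```sql
-- Sketch: primary-key columns that are NOT also part of a foreign key.
-- Assumes SQL Server; the aliases (pk, tc, fk, ftc) are illustrative only.
SELECT
    pk.TABLE_SCHEMA
    ,pk.TABLE_NAME
    ,pk.COLUMN_NAME
FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS pk
JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc ON
    tc.CONSTRAINT_SCHEMA = pk.CONSTRAINT_SCHEMA
    AND tc.CONSTRAINT_NAME = pk.CONSTRAINT_NAME
WHERE
    --keep only columns that belong to a primary key constraint
    tc.CONSTRAINT_TYPE = 'PRIMARY KEY'
    --drop columns that also appear in a foreign key on the same table
    AND NOT EXISTS
    (
        SELECT 1
        FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS fk
        JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS ftc ON
            ftc.CONSTRAINT_SCHEMA = fk.CONSTRAINT_SCHEMA
            AND ftc.CONSTRAINT_NAME = fk.CONSTRAINT_NAME
        WHERE
            ftc.CONSTRAINT_TYPE = 'FOREIGN KEY'
            AND fk.TABLE_SCHEMA = pk.TABLE_SCHEMA
            AND fk.TABLE_NAME = pk.TABLE_NAME
            AND fk.COLUMN_NAME = pk.COLUMN_NAME
    )
```

With the three example tables, this should keep Case.CaseId and Workstep.WorkstepId and drop Workstep.CaseId, since that column is both a primary key column and a foreign key column. The result could then stand in for the PrimaryKeys side of the join on COLUMN_NAME in the query above.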