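I'm trying to normalize my data, which was imported from an Excel sheet. The file I'm pulling the data from has a bunch of columns like sibling1_name, sibling1_age, sibling1_affected, and so on, with up to 4 siblings, 4 children, 4 relatives, etc. I'd like to load them all into a new table with Name, Age, Affected, and Relationship columns. I've found a way to insert the first sibling correctly (see below), but I'm not sure how to add the other siblings. Any suggestions?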

INSERT INTO Family
            (ID,
             Name,
             Age,
             Affected,
             Relationship)
SELECT ExcelPatients.id,
       ExcelPatients.sibling1_name     AS Name,
       ExcelPatients.sibling1_age      AS Age,
       ExcelPatients.sibling1_affected AS Affected,
       "Sibling"
FROM   ExcelPatients
WHERE  (( ( ExcelPatients.Sibling1_name ) IS NOT NULL ))
       AND ExcelPatients.id NOT IN (SELECT DISTINCT ID  AND Name
                                    FROM   Family); 

1 Answer

INSERT INTO Family
            (ID,
             Name,
             Age,
             Affected,
             Relationship)

SELECT ExcelPatients.id, ExcelPatients.sibling1_name AS Name, 
ExcelPatients.sibling1_age AS Age, 
ExcelPatients.sibling1_affected AS Affected, "Sibling"
FROM ExcelPatients
WHERE (((ExcelPatients.Sibling1_name) Is Not Null))
AND NOT EXISTS  (SELECT DISTINCT ID FROM Family where family.id =  ExcelPatients.id and Family.name =  ExcelPatients.sibling1_name)

UNION

SELECT ExcelPatients.id, ExcelPatients.sibling2_name AS Name, 
ExcelPatients.sibling2_age AS Age, 
ExcelPatients.sibling2_affected AS Affected, "Sibling"
FROM ExcelPatients
WHERE (((ExcelPatients.Sibling2_name) Is Not Null))
AND NOT EXISTS  (SELECT DISTINCT ID FROM Family where family.id =  ExcelPatients.id and Family.name =  ExcelPatients.sibling2_name)

UNION

SELECT ExcelPatients.id, ExcelPatients.sibling3_name AS Name, 
ExcelPatients.sibling3_age AS Age, 
ExcelPatients.sibling3_affected AS Affected, "Sibling"
FROM ExcelPatients
WHERE (((ExcelPatients.Sibling3_name) Is Not Null))
AND NOT EXISTS  (SELECT DISTINCT ID FROM Family where family.id =  ExcelPatients.id and Family.name =  ExcelPatients.sibling3_name)

UNION 

SELECT ExcelPatients.id, ExcelPatients.sibling4_name AS Name, 
ExcelPatients.sibling4_age AS Age, 
ExcelPatients.sibling4_affected AS Affected, "Sibling"
FROM ExcelPatients
WHERE (((ExcelPatients.Sibling4_name) Is Not Null))
AND NOT EXISTS  (SELECT DISTINCT ID FROM Family where family.id =  ExcelPatients.id and Family.name =  ExcelPatients.sibling4_name)

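Without seeing the data, I can't tell whether UNION ALL or UNION is the right choice. If a given name can only appear in one of the 4 sibling columns, use UNION ALL; if it can be repeated, use UNION, which removes duplicate rows. Since you are cleaning up data from another source, UNION is probably the safer, though slower, choice. NOT EXISTS tends to be the fastest comparison in SQL Server, which is why I chose it.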
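To make the UNION vs UNION ALL difference concrete, here is a rough sketch you can run locally. It is not the Access or SQL Server setup from the question: it assumes Python's built-in sqlite3 module, an in-memory database, only two sibling slots, and made-up sample rows. The helpers build_insert and load_family are hypothetical names used only for this illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ExcelPatients (
    id INTEGER PRIMARY KEY,
    sibling1_name TEXT, sibling1_age INTEGER, sibling1_affected INTEGER,
    sibling2_name TEXT, sibling2_age INTEGER, sibling2_affected INTEGER
);
CREATE TABLE Family (
    ID INTEGER, Name TEXT, Age INTEGER, Affected INTEGER, Relationship TEXT
);
-- Made-up sample data; patient 2 has the same sibling entered twice.
INSERT INTO ExcelPatients VALUES
    (1, 'Ann', 30, 0, 'Bob', 28, 1),
    (2, 'Cy',  40, 1, 'Cy',  40, 1),
    (3, 'Dee', 22, 0, NULL, NULL, NULL);
""")


def build_insert(set_op, slots=(1, 2)):
    """Build one INSERT ... SELECT with a SELECT per sibling slot, joined by set_op."""
    selects = [
        f"""SELECT e.id, e.sibling{n}_name, e.sibling{n}_age,
                   e.sibling{n}_affected, 'Sibling'
            FROM ExcelPatients AS e
            WHERE e.sibling{n}_name IS NOT NULL
              AND NOT EXISTS (SELECT 1 FROM Family AS f
                              WHERE f.ID = e.id AND f.Name = e.sibling{n}_name)"""
        for n in slots
    ]
    header = "INSERT INTO Family (ID, Name, Age, Affected, Relationship)\n"
    return header + f"\n{set_op}\n".join(selects)


def load_family(set_op):
    """Empty Family, run the insert with the given set operator, return the row count."""
    conn.execute("DELETE FROM Family")
    conn.execute(build_insert(set_op))
    return conn.execute("SELECT COUNT(*) FROM Family").fetchone()[0]


union_all_rows = load_family("UNION ALL")  # the repeated 'Cy' row is kept
union_rows = load_family("UNION")          # the repeated 'Cy' row is collapsed

# Running the same statement again inserts nothing: NOT EXISTS filters every row.
conn.execute(build_insert("UNION"))
rerun_rows = conn.execute("SELECT COUNT(*) FROM Family").fetchone()[0]

print(union_all_rows, union_rows, rerun_rows)  # prints: 5 4 4
```

With this sample data, UNION ALL loads the duplicated sibling twice while UNION loads it once, and the NOT EXISTS check keeps a second run from inserting anything. Note that NOT EXISTS only compares against rows already in Family, so it does not remove duplicates produced within a single statement; that part is left to UNION.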

answered 2013-02-08T18:50:27.030