I have this query:

SELECT 
    c.ID, c.Firstname, c.lastname, c.BDaY, c.gender, 
    cp.code, cp.Citizenship, r.race, e.ethnicity 
FROM 
    Client AS C (nolock) 
JOIN
    Citizenship AS cp (nolock) ON c.ID = cp.client_ID
JOIN
    Race AS r (nolock) ON c.ID = R.Client_ID 
JOIN 
    Ethnicity AS E (nolock) ON E.Client_ID = c.ID

This query returns some clients more than once, because they have several different races and ethnicities.

Example:

    ID |FirstName|Lastname|  BDay    | gender | code |citizenship|    race    |    ethnicity 
    1   Pedram    Salamati 01-20-1998    M      1     US citizen   Middle-east     Spanish
    1   Pedram    Salamati 01-20-1998    M      1     US Citizen   Middle-east     unknown
    1   Pedram    Salamati 01-20-1998    M      1     US Citizen   Middle-east     Brazilian
    2   Jesse     Albert   03-05-1982    F      1     US Citizen   African         not Spanish
    2   Jesse     Albert   03-05-1982    F      1     US Citizen   American        not Spanish

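I was wondering whether there is any way to say: if the ethnicities for the same ID are not equal, the ethnicity should be Multiracial, and if the races for the same ID are not equal to each other, pick the last updated one.

PS. Ethnicity has a timestamp, so I can use Max(e.LastUpdate).

I thought maybe a subquery could help!

Any help or ideas would be greatly appreciated!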

1 Answer

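Here is some test data to simulate your environment. In the future, you should post the tables involved and the test data separately. It is also appropriate and helpful to include the DML statements, so people can try out their solutions before answering.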

DECLARE @Client AS TABLE (ID INT, Firstname VARCHAR(25), LastName VARCHAR(25), BDay DATE, Gender CHAR(1))
INSERT INTO @Client VALUES (1,'Pedram','Salamati','01-20-1998','M')
,(2,'Jesse','Albert','03-05-1982','F')
DECLARE @Citizenship AS TABLE (Client_ID INT, Code INT, Citizenship VARCHAR(100))
INSERT INTO @Citizenship VALUES (1,1,'US citizen'),(2,1,'US citizen')
DECLARE @Ethnicity AS TABLE (Client_ID INT, Ethnicity VARCHAR(50))
INSERT INTO @Ethnicity VALUES (1,'Spanish'),(1,'unknown'),(1,'Brazilian'),(2,'not Spanish')
DECLARE @Race AS TABLE (Client_Id INT, Race VARCHAR(50), LastUpdate DATETIME)
INSERT INTO @Race VALUES (1,'Middle-east',GETDATE()),(2,'African',GETDATE()),(2,'American',GETDATE() -1)

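Using those table variables you can do the following. There is certainly more than one way to do it; this is just the way I chose, for the reasons given after the query: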

;WITH cteEthnicity AS (
    SELECT
       e.Client_ID
       ,CASE WHEN COUNT(DISTINCT e.Ethnicity) > 1 THEN 'Multiracial' ELSE MIN(e.Ethnicity) END as Ethnicity
    FROM
       @Ethnicity e
    GROUP BY
       e.Client_ID
)

, cteRace AS (
    SELECT
       r.Client_Id
       ,r.Race
       ,ROW_NUMBER() OVER (PARTITION BY r.Client_Id ORDER BY r.LastUpdate DESC) as RowNumber
    FROM
       @Race r
)

SELECT
    c.ID
    ,c.Firstname
    ,c.lastname
    ,c.BDaY
    ,c.gender
    ,cp.code
    ,cp.Citizenship
    ,r.race
    ,e.ethnicity
From
    @Client AS C --(nolock) 
    Join @Citizenship as cp --(nolock)
    on  c.ID = cp.client_ID
    Join cteRace as r --(nolock)
    ON c.ID = R.Client_ID
    AND r.RowNumber = 1
    Join cteEthnicity as E --(nolock)
    ON E.Client_ID = c.ID

You described two issues: one with race and one with ethnicity.

  • For Ethnicity: you want to use aggregation to decide which ethnicity to assign. This could also be done with a window function, but the way I wrote it here also copes with duplicate rows in the Ethnicity table.

  • For Race: you simply want the latest row per client. You can use the ROW_NUMBER() function, partitioned by client, to number the rows, and then keep only the row where it equals 1 in the join condition.
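The window-function alternative mentioned for Ethnicity might look roughly like the sketch below. It is not part of the query above, only one possible variant. SQL Server does not accept COUNT(DISTINCT ...) with an OVER clause, so this sketch compares MIN and MAX per client instead and uses SELECT DISTINCT to collapse the rows. The @Ethnicity table variable is redeclared so that the batch runs on its own.

```sql
-- Same test data as above, redeclared so this batch is self-contained.
DECLARE @Ethnicity AS TABLE (Client_ID INT, Ethnicity VARCHAR(50))
INSERT INTO @Ethnicity VALUES (1,'Spanish'),(1,'unknown'),(1,'Brazilian'),(2,'not Spanish')

-- Window-function variant of cteEthnicity (a sketch, not the original method).
SELECT DISTINCT
    e.Client_ID
    ,CASE
        -- If MIN and MAX differ within a client, there is more than one distinct value.
        WHEN MIN(e.Ethnicity) OVER (PARTITION BY e.Client_ID)
          <> MAX(e.Ethnicity) OVER (PARTITION BY e.Client_ID)
        THEN 'Multiracial'
        ELSE MIN(e.Ethnicity) OVER (PARTITION BY e.Client_ID)
     END AS Ethnicity
FROM
    @Ethnicity e
```

With this test data, client 1 should come back as 'Multiracial' and client 2 as 'not Spanish', one row each. The GROUP BY version is arguably simpler, because it needs no DISTINCT to remove the repeated rows.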

A third issue that you didn't point out, but that is possible in some countries anyway, is dual citizenship. In that case you could use a method similar to the one used for Race.
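As a rough illustration of handling dual citizenship the same way as Race, the sketch below assumes the Citizenship table has a LastUpdate column to order by. That column, the cteCitizenship name, and the second citizenship row for client 2 are all made up for this example and do not appear in the question. Any other column that defines which citizenship should win could be used in the ORDER BY instead.

```sql
-- ASSUMPTION: LastUpdate does not exist in the original Citizenship table.
DECLARE @Citizenship AS TABLE (Client_ID INT, Code INT, Citizenship VARCHAR(100), LastUpdate DATETIME)
INSERT INTO @Citizenship VALUES
 (1,1,'US citizen',GETDATE())
,(2,1,'US citizen',GETDATE())
,(2,2,'Canadian citizen',GETDATE() - 1)  -- hypothetical second citizenship, updated a day earlier

;WITH cteCitizenship AS (
    SELECT
       cp.Client_ID
       ,cp.Code
       ,cp.Citizenship
       -- Number each client's rows, most recently updated first.
       ,ROW_NUMBER() OVER (PARTITION BY cp.Client_ID ORDER BY cp.LastUpdate DESC) AS RowNumber
    FROM
       @Citizenship cp
)
SELECT
    cp.Client_ID
    ,cp.Code
    ,cp.Citizenship
FROM
    cteCitizenship cp
WHERE
    cp.RowNumber = 1  -- keep only the latest citizenship per client
```

In the full query, cteCitizenship would take the place of the Citizenship table in the join, with `AND cp.RowNumber = 1` in the join condition, just like cteRace.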

Note that even though common table expressions (CTEs) are used here, you could also nest the same queries as subselects (derived tables) instead.
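For illustration, here is a sketch of the same query with both CTEs folded into derived tables. The logic is meant to be unchanged; only the inner aliases (src) are new, and the test data is redeclared so the batch runs on its own.

```sql
-- Test data from the answer, redeclared so this batch is self-contained.
DECLARE @Client AS TABLE (ID INT, Firstname VARCHAR(25), LastName VARCHAR(25), BDay DATE, Gender CHAR(1))
INSERT INTO @Client VALUES (1,'Pedram','Salamati','01-20-1998','M')
,(2,'Jesse','Albert','03-05-1982','F')
DECLARE @Citizenship AS TABLE (Client_ID INT, Code INT, Citizenship VARCHAR(100))
INSERT INTO @Citizenship VALUES (1,1,'US citizen'),(2,1,'US citizen')
DECLARE @Ethnicity AS TABLE (Client_ID INT, Ethnicity VARCHAR(50))
INSERT INTO @Ethnicity VALUES (1,'Spanish'),(1,'unknown'),(1,'Brazilian'),(2,'not Spanish')
DECLARE @Race AS TABLE (Client_Id INT, Race VARCHAR(50), LastUpdate DATETIME)
INSERT INTO @Race VALUES (1,'Middle-east',GETDATE()),(2,'African',GETDATE()),(2,'American',GETDATE() -1)

SELECT
    c.ID
    ,c.Firstname
    ,c.LastName
    ,c.BDay
    ,c.Gender
    ,cp.Code
    ,cp.Citizenship
    ,r.Race
    ,e.Ethnicity
FROM
    @Client AS c
    JOIN @Citizenship AS cp
        ON c.ID = cp.Client_ID
    JOIN (
        -- Was cteRace: number each client's races, latest update first.
        SELECT
           src.Client_Id
           ,src.Race
           ,ROW_NUMBER() OVER (PARTITION BY src.Client_Id ORDER BY src.LastUpdate DESC) AS RowNumber
        FROM
           @Race src
    ) AS r
        ON c.ID = r.Client_Id
        AND r.RowNumber = 1
    JOIN (
        -- Was cteEthnicity: one row per client, 'Multiracial' if values differ.
        SELECT
           src.Client_ID
           ,CASE WHEN COUNT(DISTINCT src.Ethnicity) > 1 THEN 'Multiracial' ELSE MIN(src.Ethnicity) END AS Ethnicity
        FROM
           @Ethnicity src
        GROUP BY
           src.Client_ID
    ) AS e
        ON e.Client_ID = c.ID
```

Both forms should return one row per client for this test data. The CTE form tends to be easier to read when the subqueries get longer.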

Answered 2016-09-20T23:10:51.447