
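I was given a huge Excel file to import into our system. I loaded it into a SQL table so that I could do the necessary data transformations, and I have run into a lot of silly problems. The latest one I cannot find a solution for is this: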

In CompanyName, the name is often (but not always) repeated twice:

[CompanyName]
INTERDYN SA   INTERDYN SA
EARTH TOUR   EARTH TOUR
SOUNDLIGHTS JAJ CYTER

As you can see, there is no pattern. Is there a clever way to detect the duplicates and remove the twin company name?


1 Answer


Just compare the first half of the string with the last half, and check that the character in the middle is a space.

CREATE TABLE Companies
( 
   id int identity
 , CompanyName varchar(50)
)

INSERT INTO Companies (CompanyName) 
VALUES ('test') 
     , ('test test') 
     , ('testtest') 
     , ('testz test')

-- Just query the corrected list.
-- A name counts as doubled when its middle character is a space
-- and the text before that space equals the text after it.
SELECT CASE WHEN SUBSTRING(CompanyName, LEN(CompanyName)/2+1, 1) = ' '
             AND SUBSTRING(CompanyName, 1, LEN(CompanyName)/2) = SUBSTRING(CompanyName, LEN(CompanyName)/2+2, LEN(CompanyName))
            THEN SUBSTRING(CompanyName, 1, LEN(CompanyName)/2)
            ELSE CompanyName
       END AS CompanyName
FROM Companies

-- update the incorrect values
UPDATE Companies
   SET CompanyName = substring(CompanyName, 1, LEN(CompanyName)/2) 
 WHERE substring(CompanyName, LEN(CompanyName)/2+1, 1) = ' ' 
   AND substring(CompanyName, 1, LEN(CompanyName)/2) = substring(CompanyName, LEN(CompanyName)/2+2, LEN(CompanyName))

select * from Companies

drop table Companies
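
The check above expects exactly one space between the two copies. If the gap in the real data is wider than that, as the sample rows in the question appear to show, the two halves no longer line up. The following is only a sketch of one possible variant, not part of the original answer: it trims the outer ends of each half before comparing them. The aliases c, x, t, h and CleanedName are made up for illustration, and the inline VALUES list (SQL Server 2008 or later assumed) stands in for the Companies table so that the query runs on its own.

```sql
-- Sketch: tolerate any number of spaces between the two copies.
-- t = the name with outer spaces trimmed, h = half of its length (integer division).
SELECT c.CompanyName,
       CASE WHEN x.h > 0
             AND SUBSTRING(x.t, x.h + 1, 1) = ' '                    -- centre character is a space
             AND RTRIM(LEFT(x.t, x.h)) = LTRIM(RIGHT(x.t, x.h))      -- halves match once the gap is trimmed
            THEN RTRIM(LEFT(x.t, x.h))
            ELSE c.CompanyName
       END AS CleanedName
FROM (VALUES ('INTERDYN SA   INTERDYN SA')
           , ('EARTH TOUR   EARTH TOUR')
           , ('SOUNDLIGHTS JAJ CYTER')
           , ('testtest')) AS c (CompanyName)
CROSS APPLY (SELECT LTRIM(RTRIM(c.CompanyName)) AS t
                  , LEN(LTRIM(RTRIM(c.CompanyName))) / 2 AS h) AS x
```

The first two rows are the ones this is meant to collapse to a single copy; the other two should pass through unchanged because their centre character is not a space. To use it on the real data, replace the VALUES list with the actual table and move the same conditions into the WHERE clause of an UPDATE.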
answered 2013-03-09T09:46:11.707