I have an ugly table containing codes and names from a source I don't control, like this one (OriginalTable):

Code    | Name
--------------------
001-001 | Name1_a
001-002 | Name1_a
001-002 | Name1_b
001-003 | Name1_a
002-001 | Name2_a
002-001 | Name2_b
002-002 | Name2_a
003-001 | Name3
...

The problem is that I need a single unique name for the first 3 digits of each code (SmallCode), as in this table:

Id  | Code  | Name
--------------------
1   | 001   | NameX
2   | 002   | NameY
3   | 003   | NameZ

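The criterion I want to use to pick the name is that it should be either the most repeated name within each SmallCode or the first one. For example, NameX is the most repeated name among all codes that start with 001, or the first one (Name1_a in both cases). The same goes for NameY with 002 and NameZ with 003.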

Right now I'm using this query:

select Substring(Code,1,3) as SmallCode, Code, Name
into #tmpCode
from OriginalTable

select SmallCode, Min(Code) as Code
into #tmpReducedCode
from #tmpCode
group by SmallCode

insert into ResultTable (Code, Name)
select a.SmallCode, a.Name
from #tmpCode a
    inner join #tmpReducedCode b
        on a.Code = b.Code

But this is my result, which is wrong because code 002-001 has 2 different names (Name2_a, Name2_b):

1   | 001   | Name1_a
2   | 002   | Name2_a
3   | 002   | Name2_b
4   | 003   | Name3
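
The extra row comes from the join on Code: Min(Code) for 002 is 002-001, and that code appears twice in #tmpCode, once per name. A quick check along the following lines, assuming the #tmpCode table built by the query above, lists the codes that would multiply rows in such a join:

```sql
-- List every full code that carries more than one distinct name.
-- Joining on such a code returns one row per name instead of one row.
select Code, count(distinct Name) as NameCount
from #tmpCode
group by Code
having count(distinct Name) > 1;
```

For the eight sample rows shown, this flags 001-002 and 002-001. Only 002-001 causes trouble here, because it is also the minimum code of its group.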

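So the question is: how can I split OriginalTable into these two tables, choosing for each SmallCode the most repeated name or the first one that appears?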

3 Answers

For the first table:

select Substring(Code,1,3) as SmallCode, Code, Name
into #tmpCode
from OriginalTable

select SmallCode, Name
into #tmpReducedCode
from (
    select SmallCode, Name, row_number() over (partition by SmallCode order by Total desc) rn
    from (
        select SmallCode, Name, count(*) Total
        from #tmpCode
        group by SmallCode, Name) x) y
where rn=1;

select distinct a.SmallCode, b.Name
from #tmpCode a
    inner join #tmpReducedCode b
        on left(a.Code,3) = b.SmallCode
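
One caveat: ordering by Total alone leaves ties unresolved, so when two names occur equally often, row_number() may pick either. The sketch below adds a tie-breaker and writes straight into ResultTable. It rests on two assumptions that the question does not confirm: "first" means the name attached to the lowest Code, and ResultTable.Id is generated automatically (for example an identity column). FirstCode is a name introduced here for illustration.

```sql
-- Sketch only: FirstCode and the tie-break rule are assumptions,
-- not part of the original answer.
insert into ResultTable (Code, Name)
select SmallCode, Name
from (
    select SmallCode, Name,
           row_number() over (
               partition by SmallCode
               order by Total desc, FirstCode asc, Name asc) as rn
    from (
        select SmallCode, Name,
               count(*)  as Total,      -- how often the name occurs in the group
               min(Code) as FirstCode   -- lowest code that carries the name
        from #tmpCode
        group by SmallCode, Name) x) y
where rn = 1;
```

With this ordering the most frequent name wins, and equal counts fall back to the name seen on the lowest code, so repeated runs give the same result.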
answered 2012-11-29T22:24:38.717

Run a subquery for each code:

select distinct substring(tab1.Code,1,3) as "Code",
    -- most frequent name among the rows sharing this 3-character prefix
    (select top 1 tab2.Name
    from OriginalTable tab2
    where substring(tab2.Code,1,3)=substring(tab1.Code,1,3)
    group by substring(tab2.Code,1,3), tab2.Name
    order by count(tab2.Name) desc) as "Name"
from OriginalTable tab1;
answered 2012-11-29T22:36:29.830

I think the best approach is to use window functions:

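select cast(smallcode as int) as id,  -- assumes the 3-character prefix is numeric
       smallcode as code,
       name
from (select cn.*, ROW_NUMBER() over (partition by smallcode order by cnt desc) as seqnum
      -- count each name per 3-character prefix, not per full code
      from (select LEFT(code, 3) as smallcode, name, COUNT(*) as cnt
            from OriginalTable ot
            group by LEFT(code, 3), name
           ) cn
     ) cn
where seqnum = 1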

This assumes you are using SQL Server 2005 or newer.
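
To try the window-function approach without touching the real table, here is a self-contained sketch that loads the eight sample rows from the question into a table variable and counts names per 3-character prefix. The variable name @t and the column sizes are made up for illustration, and the extra ordering on Name is an assumed tie-breaker.

```sql
-- Hypothetical scratch copy of the sample data; @t is an illustrative name.
declare @t table (Code varchar(7), Name varchar(20));

-- union all keeps this compatible with SQL Server 2005 (no multi-row VALUES).
insert into @t (Code, Name)
select '001-001', 'Name1_a' union all
select '001-002', 'Name1_a' union all
select '001-002', 'Name1_b' union all
select '001-003', 'Name1_a' union all
select '002-001', 'Name2_a' union all
select '002-001', 'Name2_b' union all
select '002-002', 'Name2_a' union all
select '003-001', 'Name3';

-- Count names per prefix, rank them by frequency, keep the top one.
select SmallCode as Code, Name
from (select SmallCode, Name,
             row_number() over (partition by SmallCode
                                order by cnt desc, Name) as seqnum
      from (select left(Code, 3) as SmallCode, Name, count(*) as cnt
            from @t
            group by left(Code, 3), Name) c
     ) r
where seqnum = 1
order by SmallCode;
```

For these rows the query returns one row per prefix: 001 with Name1_a, 002 with Name2_a, and 003 with Name3.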

answered 2012-11-29T23:05:14.203