Question: How do I find document numbers whose materials match at least X percent (e.g. >= 50%) of the materials in other document numbers?

Create the table:

CREATE COLUMN TABLE "SCHEMA"."MYTABLE" 
(
     "DOCUMENT" NVARCHAR(10) DEFAULT '' NOT NULL ,
     "POSNR" NVARCHAR(6) DEFAULT '000000' NOT NULL ,
     "MATERIAL" NVARCHAR(40) DEFAULT '' NOT NULL,
PRIMARY KEY (
     "DOCUMENT",
     "POSNR")
     ) UNLOAD PRIORITY 5 AUTO MERGE 
;

Insert the data:

INSERT INTO MYTABLE VALUES (100, '10', 'R3');
INSERT INTO MYTABLE VALUES (100, '20', '7000000');
INSERT INTO MYTABLE VALUES (100, '30', '7000010');
INSERT INTO MYTABLE VALUES (100, '40', '7000011');
INSERT INTO MYTABLE VALUES (100, '50', '7000160');

INSERT INTO MYTABLE VALUES (200, '10', 'SW');
INSERT INTO MYTABLE VALUES (200, '20', '7000000');
INSERT INTO MYTABLE VALUES (200, '30', '7000010');
INSERT INTO MYTABLE VALUES (200, '40', '7000011');
INSERT INTO MYTABLE VALUES (200, '50', '7000160');
INSERT INTO MYTABLE VALUES (200, '60', '7000036');
INSERT INTO MYTABLE VALUES (200, '70', '7000040');
INSERT INTO MYTABLE VALUES (200, '80', '7000066');
INSERT INTO MYTABLE VALUES (200, '90', '7000068');

INSERT INTO MYTABLE VALUES (300, '01', '7000160');
INSERT INTO MYTABLE VALUES (300, '11', '7000011');

INSERT INTO MYTABLE VALUES (400, '10', '7000033');
INSERT INTO MYTABLE VALUES (400, '20', '7000034');
INSERT INTO MYTABLE VALUES (400, '50', '7000068');
INSERT INTO MYTABLE VALUES (400, '60', '7000079');

1 Answer

This can indeed be solved without using a cursor.

with doc_elements 
(document, material, material_cnt)  
as  (-- one row per document/material, plus the number of rows in that document
    select distinct
          document
        , material
        , count(*) OVER
            (PARTITION BY document) as material_cnt
    from
        mytable
    )  
, matched_materials 
(document_a, material, material_b_cnt, document_b, match_cnt)  
as  (-- pair every material with the same material in each other document
    select
         side_a.document as document_a
       , side_a.material
       , side_b.material_cnt as material_b_cnt
       , side_b.document as document_b
       , count(*) OVER
            (PARTITION BY side_a.document, side_b.document) as match_cnt
    from 
                        doc_elements side_a
        left outer join doc_elements side_b
                on  side_a.material = side_b.material
                and side_a.document != side_b.document
     where 
            -- drop rows that found no partner in any other document
            side_b.document IS NOT NULL
    )      
select distinct
    document_a
  --, material
  , document_b
  , material_b_cnt
  , match_cnt
    -- share of document_b's materials that also occur in document_a
  , round((100/material_b_cnt)*match_cnt, 2) as match_pct
from 
    matched_materials
order by
    document_a
  , document_b;

This statement returns:

DOCUMENT_A|DOCUMENT_B|MATERIAL_B_CNT|MATCH_CNT|MATCH_PCT|
----------|----------|--------------|---------|---------|
100       |200       |             9|        4|    44.44|
100       |300       |             2|        2|      100|

200       |100       |             6|        4|    66.67|
200       |300       |             2|        2|      100|
200       |400       |             4|        1|       25|

300       |100       |             6|        2|    33.33|
300       |200       |             9|        2|    22.22|

400       |200       |             9|        1|    11.11|

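For simplicity, I treated R3 and SW as regular materials.
The output only contains document pairs with at least one matching material (see the condition side_b.document IS NOT NULL in the common table expression matched_materials). MATCH_PCT is the share of DOCUMENT_B's materials that also occur in DOCUMENT_A.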
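
The percentage logic can be cross-checked outside the database. The following is a minimal Python sketch, not part of the original answer, that rebuilds the document pairs from the rows listed in the question and keeps only those at or above a threshold. The names DOCS, match_table and min_pct are made up for illustration. As in the query, the percentage is relative to the material count of document B. With the rows exactly as listed in the question, document 100 has five materials, so pairs with document 100 on the B side come out differently from the output above, which shows a count of 6.

```python
from itertools import permutations

# Sample rows from the question, grouped by document
# (assumption: the INSERT statements shown there are complete).
DOCS = {
    "100": {"R3", "7000000", "7000010", "7000011", "7000160"},
    "200": {"SW", "7000000", "7000010", "7000011", "7000160",
            "7000036", "7000040", "7000066", "7000068"},
    "300": {"7000160", "7000011"},
    "400": {"7000033", "7000034", "7000068", "7000079"},
}


def match_table(docs, min_pct=0.0):
    """Return (doc_a, doc_b, b_cnt, match_cnt, match_pct) for every ordered
    pair that shares at least one material and reaches min_pct."""
    rows = []
    for doc_a, doc_b in permutations(sorted(docs), 2):
        match_cnt = len(docs[doc_a] & docs[doc_b])
        if match_cnt == 0:
            continue  # mirrors the side_b.document IS NOT NULL filter
        b_cnt = len(docs[doc_b])
        match_pct = round(100 * match_cnt / b_cnt, 2)
        if match_pct >= min_pct:
            rows.append((doc_a, doc_b, b_cnt, match_cnt, match_pct))
    return rows


if __name__ == "__main__":
    # Keep only pairs where at least 50% of document B's materials match.
    for row in match_table(DOCS, min_pct=50):
        print(row)
```

With a threshold of 50, this keeps the pairs 100/300, 200/100 and 200/300. In SQL, the same filter would be a condition on the computed match_pct.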

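Note that the expected result given in the comments contains an error:
there is no match for document 400 there, because material 7000068 is not among the materials of document 100.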
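
This claim is easy to verify by hand. As a small illustration (a Python sketch using the rows from the question; the variable names and the helper shared are hypothetical), the set intersections show that document 400 overlaps with document 200 only:

```python
# Materials per document, copied from the question's INSERT statements.
doc_100 = {"R3", "7000000", "7000010", "7000011", "7000160"}
doc_200 = {"SW", "7000000", "7000010", "7000011", "7000160",
           "7000036", "7000040", "7000066", "7000068"}
doc_400 = {"7000033", "7000034", "7000068", "7000079"}


def shared(a, b):
    """Return the materials that occur in both documents, sorted."""
    return sorted(a & b)


print(shared(doc_400, doc_100))  # materials shared by 400 and 100
print(shared(doc_400, doc_200))  # materials shared by 400 and 200
```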


I took this question as a prompt to write up the solution in more detail, including a review of the query performance and tuning options.

See https://lbreddemann.org/matchmaker/ and https://lbreddemann.org/matchmaker-quick-quick/


Answered 2020-03-02T03:15:30.203