1

I have an interesting problem that I can't solve. Please help!

Here are the tables:

t_employee
  ID             NUMBER,
  DEPARTMENT_ID  NUMBER,
  CHIEF_ID       NUMBER,
  NAME           VARCHAR2(100 BYTE),
  SALARY         NUMBER,
  BIRTH_DATE     DATE,
  ADDRESS        VARCHAR2(200 BYTE),
  STATUS         VARCHAR2(1 BYTE)

t_department
  ID    NUMBER,
  NAME  VARCHAR2(100 BYTE)

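I need to show the number of employees in each region, where the region is taken from the ADDRESS column (if there is no region, then = 'No' region). The region name should be converted to upper case.

What is the problem? The ADDRESS column holds unstructured data, for example: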

Country,REGION,city,... 

So REGION must always sit between the first (,) and the second (,), and it must contain the word (reg), for example:

Russia(Country), reg Moskovskay ,  Moscow(city), Lenina, (street)  .... or 
Russia(Country), Moskovskay reg ,  Moscow(city), Lenina, (street)  .... or

The separator is (,) and the region is in the second position.

Thank you very much!

4 Answers

1

Free-form strings are rarely a good idea in a database. This query cannot use an index on ADDRESS, which will most likely make it slow:

WITH a AS ( SELECT TRIM(
                    REPLACE(
                     UPPER(
                      REGEXP_SUBSTR(ADDRESS, ',([^,]*),', 1, 1, 'i', 1)
                     ), 
                     ' REG ', ''
                    )
                   ) REGION
            FROM t_employee)
SELECT REGION, COUNT(*) cnt FROM a GROUP BY REGION

An SQLfiddle to test with.

Answered 2013-10-25T07:33:35.940
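The query above leaves two parts of the question open: rows without a region, and a `reg` that sits right next to a comma (as in `aaa reg,`). The variant below is a sketch, not a drop-in fix. It takes the text after the first comma up to the next comma, treats it as a region only if it contains `REG` as a separate word, strips that word, and otherwise counts the row under a placeholder. The `NO REGION` label, the CTE names `second_field` and `regions`, and the whole-word rule for `reg` are assumptions that the question does not spell out. Like the query above, it relies on the subexpression argument of REGEXP_SUBSTR.

```sql
-- Sketch only: label, CTE names and the whole-word rule are assumptions.
WITH second_field AS (
  -- Upper-cased, trimmed text after the first comma, up to the next comma.
  -- NULL when the address has no comma at all.
  SELECT UPPER(TRIM(REGEXP_SUBSTR(address, '^[^,]*,([^,]*)', 1, 1, NULL, 1))) AS fld
    FROM t_employee
),
regions AS (
  SELECT CASE
           -- Region only if REG appears as a word of its own.
           WHEN REGEXP_LIKE(fld, '(^| )REG( |$)')
           THEN NVL(TRIM(REGEXP_REPLACE(fld, '(^| )REG( |$)', ' ')), 'NO REGION')
           ELSE 'NO REGION'
         END AS region
    FROM second_field
)
SELECT region, COUNT(*) AS cnt
  FROM regions
 GROUP BY region
 ORDER BY region;
```

Variants such as `reg.` with a trailing dot would fall into `NO REGION` under this rule, so the pattern may need loosening for real data.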
1

Try the following:

SELECT regexp_substr(address, ',(.*?reg.*?),', 1, 1, null, 1) AS region, COUNT(*)
FROM t_employee
GROUP BY regexp_substr(address, ',(.*?reg.*?),', 1, 1, null, 1);

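However, I strongly recommend refactoring the schema and splitting the address into separate fields for street, city, region and so on, before or during the table load, provided you have the option to do so.

Answered 2013-10-25T07:26:00.247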
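As a rough illustration of that refactoring, and only for the region part, the sketch below adds a dedicated column and backfills it once from ADDRESS. The `REGION` column and the `NO REGION` label are hypothetical and not part of the posted schema. Removing the `reg` marker would still have to be done the same way as in the queries in this thread.

```sql
-- Sketch only: the REGION column is hypothetical, not part of the posted schema.
ALTER TABLE t_employee ADD (region VARCHAR2(200 BYTE));

-- One-off backfill: upper-cased text after the first comma, up to the next comma.
UPDATE t_employee
   SET region = UPPER(TRIM(REGEXP_SUBSTR(address, '^[^,]*,([^,]*)', 1, 1, NULL, 1)));

-- The report then no longer needs to parse ADDRESS on every run.
SELECT NVL(region, 'NO REGION') AS region, COUNT(*) AS cnt
  FROM t_employee
 GROUP BY NVL(region, 'NO REGION');
```

New rows would need the same value filled in at load or insert time, otherwise the column goes stale.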
0
WITH t_employee AS (
   SELECT 1 AS id, 10 AS department_id, 'a' AS name, 'kws, aaa reg, skdir, 23049' AS address FROM dual
   UNION ALL SELECT 2, 10, 'b', 'slkx, aaa reg, lskdj, 902349' FROM dual
   UNION ALL SELECT 3, 20, 'c', 'lskj, bbb reg, lskdi, 489308' FROM dual
   UNION ALL SELECT 4, 10, 'd', 'lskj, aaa reg, lskdi, 489308' FROM dual
   UNION ALL SELECT 5, 20, 'e', 'lskj, ccc reg, lskdi, 489308' FROM dual
   UNION ALL SELECT 6, 30, 'f', 'lskj, bbb reg, lskdi, 489308' FROM dual
)
, t_region AS (
SELECT id,
       TRIM (
             REPLACE (
                       -- text between the 1st and 2nd comma (NULL if there are fewer than two commas)
                       SUBSTR (address,
                               INSTR (address, ',', 1, 1) + 1,
                               INSTR (address, ',', 1, 2)
                                                  - INSTR (address, ',', 1, 1) - 1),
                      'reg',
                      '')
            )
       AS region
  FROM t_employee e
)
  SELECT r.region, count(*) AS employees
    FROM t_region r
GROUP BY r.region
;
Answered 2013-10-25T07:36:39.813
0

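To start with, look into changing the database design so that these fields are stored separately. It will help you in the long run. If you still want to stick with this structure, you can keep the data in the required format at insert time. Hope that helps!

Answered 2013-10-25T07:20:16.857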