-2

I want to convert comma-separated values into rows in Redshift.

For example:

store  |location |products
-----------------------------
1      |New York |fruit, drinks, candy...

The desired output is:

store  |location | products
------------------------------- 
1      |New York | fruit        
1      |New York | drinks         
1      |New York | candy     

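Is there a simple way to split the words on a delimiter and convert them to rows? I have been looking into this solution, but it does not work yet: https://help.looker.com/hc/en-us/articles/360024266693-Splitting-Strings-into-Rows-in-the-Absence-of-Table-Generating-Functions

Any suggestions would be greatly appreciated.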

5 Answers

0

If you know the maximum number of values, I think you can use split_part():

select t.store, t.location, trim(split_part(t.products, ',', n.n)) as product
 from t join
      (select 1 as n union all
       select 2 union all
       select 3 union all
       select 4
      ) n
      on split_part(products, ',', n.n) <> '';
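
One way to try the join above without a Redshift cluster is to run it against an in-memory SQLite database. This is only a sketch under stated assumptions: SQLite has no split_part(), so a small Python function stands in for it (returning an empty string when the requested part does not exist, as Redshift's split_part does), and trim() is added to drop the space that follows each comma in the sample data.

```python
import sqlite3


def split_part(text, delimiter, position):
    """Stand-in for Redshift's split_part: 1-based part, '' when out of range."""
    parts = text.split(delimiter)
    return parts[position - 1] if 1 <= position <= len(parts) else ""


conn = sqlite3.connect(":memory:")
# Assumption: SQLite lacks split_part, so register the emulation above.
conn.create_function("split_part", 3, split_part)
conn.execute("create table t (store int, location text, products text)")
conn.execute("insert into t values (1, 'New York', 'fruit, drinks, candy')")

query = """
select t.store, t.location, trim(split_part(t.products, ',', n.n)) as product
from t join
     (select 1 as n union all
      select 2 union all
      select 3 union all
      select 4
     ) n
     on split_part(t.products, ',', n.n) <> ''
order by t.store, n.n
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The fourth number produces an empty part for this row, so the join condition filters it out and only three rows come back.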
 

You can also use:

select t.store, t.location, split_part(products, ',', 1) as product
from t 
union all
select t.store, t.location, split_part(products, ',', 2) as product
from t 
where split_part(products, ',', 2) <> ''
union all
select t.store, t.location, split_part(products, ',', 3) as product
from t 
where split_part(products, ',', 3) <> ''
union all
select t.store, t.location, split_part(products, ',', 4) as product
from t 
where split_part(products, ',', 4) <> ''
union all
. . .
answered 2021-06-22T11:47:17.520
0

This also works nicely in MySQL:


CREATE TABLE test
SELECT 1 store, 'New York' location, 'fruit,drinks,candy' products;

SELECT store, location, product
FROM test
CROSS JOIN JSON_TABLE(CONCAT('["', REPLACE(products, ',', '","'), '"]'),
                      "$[*]" COLUMNS (product VARCHAR(255) PATH "$")) jsontable
store | location | product
----------------------------
1     | New York | fruit
1     | New York | drinks
1     | New York | candy

answered 2021-06-22T13:20:25.617
0

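In MySQL, this works for up to four comma-separated values. Note the use of UNION rather than UNION ALL: when a row has fewer than four values, the nested SUBSTRING_INDEX calls return the last value again, and UNION removes those duplicates.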

SELECT store, location,  
       TRIM(SUBSTRING_INDEX(products, ',', 1)) product
  FROM inventory
 UNION 
SELECT store, location, 
       TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(products, ',', 2), ',', -1))
  FROM inventory
 UNION 
SELECT store, location, 
       TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(products, ',', 3), ',', -1))
  FROM inventory
 UNION 
SELECT store, location, 
       TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(products, ',', 4), ',', -1))
  FROM inventory

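I'll echo what others have said. IMHO, comma-separated values are a poor table design.

  • It leads to ugly SQL. Being able to read and reason about SQL is very important. Clarity always wins.
  • Also, AWS's shareholders will love you for it, because you will spend a lot of extra money on Redshift.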
answered 2021-06-22T13:21:46.320
0

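First, you need to create a numbers table, because joining against another table is the only way Redshift can turn one row into many rows (there is no flatten or unnest function).

  • For example, a table with 1024 rows holding the values 1..1024

Then you can join against it and use split_part():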

SELECT
  yourTable.*,
  numbers.ordinal,
  split_part(yourTable.products, ',', numbers.ordinal)  AS product
FROM
  yourTable
INNER JOIN
  numbers
    ON  numbers.ordinal >= 1
    AND numbers.ordinal <= regexp_count(yourTable.products, ',') + 1

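But...

Redshift is terrible at predicting the number of rows needed. It will join against all 1024 rows and then reject the ones that do not match.

It performs like a dog, because the design assumption is that this kind of processing is always done before loading into Redshift.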
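
The numbers-table join can be sketched locally as follows. Everything outside the query is an assumption made for illustration: SQLite stands in for Redshift, split_part() and regexp_count() are emulated with small Python functions, the numbers table is filled from Python rather than by any Redshift-specific method, and the second store row ('Boston') is hypothetical sample data.

```python
import re
import sqlite3


def split_part(text, delimiter, position):
    """Stand-in for Redshift's split_part: 1-based part, '' when out of range."""
    parts = text.split(delimiter)
    return parts[position - 1] if 1 <= position <= len(parts) else ""


def regexp_count(text, pattern):
    """Stand-in for regexp_count: number of matches of pattern in text."""
    return len(re.findall(pattern, text))


conn = sqlite3.connect(":memory:")
conn.create_function("split_part", 3, split_part)
conn.create_function("regexp_count", 2, regexp_count)

# Numbers table holding the values 1..1024.
conn.execute("create table numbers (ordinal integer primary key)")
conn.executemany("insert into numbers values (?)", [(i,) for i in range(1, 1025)])

conn.execute("create table yourTable (store int, location text, products text)")
conn.executemany(
    "insert into yourTable values (?, ?, ?)",
    [(1, "New York", "fruit,drinks,candy"), (2, "Boston", "bread")],
)

query = """
SELECT yourTable.store, yourTable.location, numbers.ordinal,
       split_part(yourTable.products, ',', numbers.ordinal) AS product
FROM yourTable
INNER JOIN numbers
    ON  numbers.ordinal >= 1
    AND numbers.ordinal <= regexp_count(yourTable.products, ',') + 1
ORDER BY yourTable.store, numbers.ordinal
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each row is matched only with ordinals up to its own number of values (comma count plus one), so a store with a single product yields a single output row.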

answered 2021-06-22T11:51:59.403
-1
CREATE TABLE temptbl  
(
    store INT,
    location  NVARCHAR(MAX),
    products NVARCHAR(MAX)
)



INSERT temptbl   SELECT 1,  'New York', 'Fruit, drinks, candy'

To check the contents of the table after creating it:

select * from temptbl


;WITH tmp(store, location, DataItem, products) AS
(
    SELECT
        store,
        location,
        LEFT(products, CHARINDEX(',', products + ',') - 1),
        STUFF(products, 1, CHARINDEX(',', products + ','), '')
    FROM temptbl
    UNION all

    SELECT
        store  ,
        location,
        LEFT(products, CHARINDEX(',', products + ',') - 1),
        STUFF(products, 1, CHARINDEX(',', products + ','), '')
    FROM tmp
    WHERE
        products > ''
)

SELECT
    store,
    location,
    LTRIM(DataItem) AS DataItem  -- remove the space left after each comma
FROM tmp
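
The query above appears to rely on SQL Server functions (CHARINDEX, STUFF). As a rough way to follow the recursion step by step, here is a port of the same idea to SQLite, run from Python. The substitutions are assumptions made for this sketch: substr() and instr() replace LEFT, CHARINDEX and STUFF, and trim() removes the space that follows each comma.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table temptbl (store int, location text, products text)")
conn.execute("insert into temptbl values (1, 'New York', 'Fruit, drinks, candy')")

query = """
WITH RECURSIVE tmp(store, location, DataItem, products) AS (
    -- Anchor: peel the first item off the list, keep the remainder.
    SELECT store, location,
           substr(products, 1, instr(products || ',', ',') - 1),
           substr(products, instr(products || ',', ',') + 1)
    FROM temptbl
    UNION ALL
    -- Recursive step: repeat on the remainder until nothing is left.
    SELECT store, location,
           substr(products, 1, instr(products || ',', ',') - 1),
           substr(products, instr(products || ',', ',') + 1)
    FROM tmp
    WHERE products > ''
)
SELECT store, location, trim(DataItem) AS product
FROM tmp
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Appending a comma before searching guarantees that a delimiter is always found, which is what lets the last item be extracted the same way as the others.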

Running the query above returns the comma-separated values as multiple rows, which is the output you asked for.

Hope this helps you find your solution :)

answered 2021-06-22T12:59:53.343