I am currently trying to split up the paths in our web server request logs. They are stored as strings of varying length and look like this:

/v1/facility/username
/v1/facility/username/action?utm_parameter
/signinnext=/oauth2/authorize/utm_parameter

What I want to do is split these strings at every / and put the pieces into separate columns.

It should look like this:

name1      |  name2   | name3        | name4
------     | ------   | ------       | ----
v1         | facility | username     | NULL
v1         | facility | username     | action?utm_parameter
signinnext=| oauth2   | authorize    | utm_parameter

How can I do this with SQL in Athena?

1 Answer

Option 1: split_part

select  split_part (path,'/',2) as name1
       ,split_part (path,'/',3) as name2
       ,split_part (path,'/',4) as name3
       ,split_part (path,'/',5) as name4

from    mytable
;

    name1    |  name2   |   name3   |        name4
-------------+----------+-----------+----------------------
 v1          | facility | username  | NULL
 v1          | facility | username  | action?utm_parameter
 signinnext= | oauth2   | authorize | utm_parameter

Option 2: regexp_extract

select  regexp_extract (path,'(/([^/]*)){1}',2) as name1
       ,regexp_extract (path,'(/([^/]*)){2}',2) as name2
       ,regexp_extract (path,'(/([^/]*)){3}',2) as name3
       ,regexp_extract (path,'(/([^/]*)){4}',2) as name4

from    mytable
;

    name1    |  name2   |   name3   |        name4
-------------+----------+-----------+----------------------
 v1          | facility | username  | NULL
 v1          | facility | username  | action?utm_parameter
 signinnext= | oauth2   | authorize | utm_parameter
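
To try either option without an existing table, a minimal sketch can inline the three sample paths from the question with a VALUES list. The alias mytable and the column name path mirror the names used above; the extra column name4_regex is only a hypothetical label added for comparison. The field index starts at 2 because every path begins with /, so field 1 is an empty string.

```sql
-- Sketch: the sample paths from the question, inlined so no real table is needed
select  split_part (path,'/',2) as name1
       ,split_part (path,'/',3) as name2
       ,split_part (path,'/',4) as name3
       -- split_part returns NULL when the index exceeds the number of fields
       ,split_part (path,'/',5) as name4
       -- option 2 for the same field; NULL when the pattern does not match
       ,regexp_extract (path,'(/([^/]*)){4}',2) as name4_regex

from    (values  ('/v1/facility/username')
                ,('/v1/facility/username/action?utm_parameter')
                ,('/signinnext=/oauth2/authorize/utm_parameter')
        ) as mytable (path)
;
```

Once the columns look right on the sample rows, the derived table can be swapped back for the real log table.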
answered 2017-03-19T18:20:38.017