
I'm writing a BigQuery query and need to check whether any of several strings appear as elements in one column of a table, where that column contains arrays of strings. For context, the query is part of a small automated Python job and uses standard SQL.

I couldn't find anything that would explicitly check for array inclusion here: https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators

So I came up with a solution that employs a pretty hacky regex, specifically:

...other query stuff...

WHERE
    REGEXP_CONTAINS((LOWER(ARRAY_TO_STRING(column, '-'))), r"({joined_string})")

...where column is the column I care about in the table, and joined_string is a long string composed of all the strings I need to check for joined by | (where | serves as the regex OR operator).
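To make the approach above concrete, here is a minimal Python sketch of how joined_string could be built and what the check ends up matching. The helper names `build_pattern` and `regex_check` are hypothetical, and Python's `re` module only stands in for BigQuery's RE2 engine, so treat it as an approximation rather than the actual job code.

```python
import re

def build_pattern(terms):
    # Escape each term so characters such as '.' are matched literally,
    # then join with '|' (regex OR), as described in the question.
    return "(" + "|".join(re.escape(t.lower()) for t in terms) + ")"

def regex_check(array_value, terms):
    # Mimics LOWER(ARRAY_TO_STRING(column, '-')) followed by REGEXP_CONTAINS.
    joined = "-".join(array_value).lower()
    return re.search(build_pattern(terms), joined) is not None

print(regex_check(["abc", "def"], ["abc"]))  # True: "abc" is an element
print(regex_check(["abcdef"], ["abc"]))      # True, although "abc" is only a substring
```

The second call shows one reason the regex feels hacky: it tests substring containment in the joined text, not array membership.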

Does there exist some kind of built-in functionality in BigQuery standard SQL that allows one to do this more sanely?


1 Answer


Below are two examples.

The first assumes your strings are in another table, strings:

#standardSQL
WITH yourTable AS (
  SELECT 1 AS id, ['abc', 'def', 'xyz'] AS column UNION ALL
  SELECT 2, ['123', '456', '789'] UNION ALL
  SELECT 3, ['135', '246', '369'] 
),
strings AS (
  SELECT 'abc' AS str UNION ALL
  SELECT '123' UNION ALL
  SELECT '456'
)
SELECT *
FROM yourTable
WHERE (SELECT COUNT(1) FROM UNNEST(column) AS col JOIN strings ON col = str) > 0  

If you need to see how many strings matched, you can add the following to the SELECT list:

(SELECT COUNT(1) FROM UNNEST(column) AS col JOIN strings ON col = str) AS cnt

The second example assumes you have the list of strings packed in an array:

#standardSQL
WITH yourTable AS (
  SELECT 1 AS id, ['abc', 'def', 'xyz'] AS column UNION ALL
  SELECT 2, ['123', '456', '789'] UNION ALL
  SELECT 3, ['135', '246', '369'] 
),
strings AS (
  SELECT ['abc', 'def', '456'] AS strs
)
SELECT yourTable.*
FROM yourTable, strings
WHERE (SELECT COUNT(1) FROM UNNEST(column) AS col JOIN UNNEST(strs) AS str ON col = str) > 0   

As in the first example, you can add the following to the SELECT list to see the match count:

(SELECT COUNT(1) FROM UNNEST(column) AS col JOIN UNNEST(strs) AS str ON col = str) AS cnt
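
For readers who want to sanity-check what the correlated subquery computes, the following plain-Python model reproduces it on the sample data from the second example. It is only an illustration of the semantics, not BigQuery code, and `match_count` is a hypothetical helper name.

```python
# Sample rows and search strings copied from the second example.
rows = [
    (1, ["abc", "def", "xyz"]),
    (2, ["123", "456", "789"]),
    (3, ["135", "246", "369"]),
]
strs = ["abc", "def", "456"]

def match_count(column, strs):
    # Models COUNT(1) over UNNEST(column) JOIN UNNEST(strs) ON col = str:
    # one count per (element, string) pair that compares equal.
    return sum(1 for col in column for s in strs if col == s)

matches = [(row_id, match_count(column, strs)) for row_id, column in rows]
kept = [row_id for row_id, cnt in matches if cnt > 0]  # the WHERE ... > 0 filter

print(matches)  # [(1, 2), (2, 1), (3, 0)]
print(kept)     # [1, 2]
```

Because the join counts pairs, a duplicated element in either array is counted more than once; that does not affect the `> 0` filter, only the reported cnt.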
Answered 2017-03-13T23:00:53.790