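I am trying to figure out how to write a GQL (Google SQL) query that filters a deeply nested structure and then nests it again, keeping the STRUCT field at the same level as the ARRAY.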

I have prepared an example schema:

 WITH
      Sale AS (
      SELECT
        "1" AS _id,
        STRUCT("11" AS _id,
          "SERVICE" AS feedbackType,
          DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS createDate) AS serviceFeedback,
        [STRUCT("host" AS key,
          "localhost" AS value),
        STRUCT("location" AS key,
          "Paris" AS value)] AS tags,
        TRUE AS reviewed,
        [STRUCT("1" as saleId, STRUCT("101" AS _id,
            "PRODUCT" AS feedbackType,
            DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS createDate) AS productFeedback),
        STRUCT("1" as saleId, STRUCT("102" AS _id,
            "PRODUCT" AS feedbackType,
            DATE(TIMESTAMP("2017-01-20 14:06:51.655")) AS createDate) AS productFeedback) ] AS saleItems,
        DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS latestFeedbackDate )

I also need a filter query over this source that flattens all the nested fields so that I can filter on them:

SELECT
  saleId,
  serviceFeedback,
  saleTags,
  reviewed,
  saleItems,
  latestFeedbackDate
FROM (
  SELECT
    sale._id AS saleId,
    serviceFeedback,
    sale.tags AS saleTags,
    reviewed,
    saleItems,
    latestFeedbackDate
  FROM
    `Sale` AS sale,
    sale.saleItems AS saleItems
  WHERE
    reviewed = TRUE
    AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
    AND serviceFeedback._id IS NOT NULL
    AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655")))
ORDER BY
  latestFeedbackDate DESC
LIMIT
  20

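The main problem is that, after this filtering, I want to group all saleItems by sale._id (that is, return to the initial structure) and still retrieve the serviceFeedback field, which has type STRUCT.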

The expected result in JSON format is:

{
    "saleId":"1",
    "serviceFeedback":{"_id":"11","feedbackType":"SERVICE","createDate":"2017-01-20"},
    "saleTags":[{"key":"host","value":"localhost"},{"key":"location","value":"Paris"}],
    "reviewed":"true",
    "saleItems":[
        {"saleId":"1","productFeedback":{"_id":"101","feedbackType":"PRODUCT","createDate":"2017-01-20"}},
        {"saleId":"1","productFeedback":{"_id":"102","feedbackType":"PRODUCT","createDate":"2017-01-20"}}
    ],
    "latestFeedbackDate":"2017-01-20"
}

I wrote the simplest query I could think of. It produces the correct result, but it can probably be rewritten in a more efficient way:

SELECT
  saleId,
  serviceFeedback,
  latestFeedbackDate,
  subQuery.saleItems as saleItems
FROM
  sale
RIGHT JOIN (
  SELECT
    saleId,
    ARRAY_AGG(saleItems) as saleItems
  FROM (
    SELECT
      saleId,
      saleItems
    FROM (
      SELECT
        sale._id AS saleId,
        latestFeedbackDate,
        saleItems
      FROM
        `Sale` AS sale,
        sale.saleItems AS saleItems
      WHERE
        reviewed = TRUE
        AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
        AND serviceFeedback._id IS NOT NULL
        AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655")))
    ORDER BY
      latestFeedbackDate DESC)
  GROUP BY
    saleId
    ) AS subQuery
ON
  sale._id = subQuery.saleId

Can you suggest a better solution that achieves the same result?

1 Answer

"Can you suggest a better solution that achieves the same result?"

The query below produces exactly the same schema as the original table, with just the required filters applied to saleItems:

#standardSQL
SELECT * REPLACE(
  ARRAY(
    SELECT saleItems FROM UNNEST(saleItems) saleItems 
    WHERE reviewed = TRUE
      AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
      AND serviceFeedback._id IS NOT NULL
      AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
  ) AS saleItems)
FROM sale

If you need only a subset of the fields, use the following as an example:

#standardSQL
SELECT 
  _id saleId,
  serviceFeedback,
  ARRAY(
    SELECT saleItems FROM UNNEST(saleItems) saleItems 
    WHERE reviewed = TRUE
      AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
      AND serviceFeedback._id IS NOT NULL
      AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
  ) AS saleItems
FROM sale
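
One point that may matter: in the queries above, all conditions sit inside the ARRAY subquery, so a sale that fails the row-level conditions (reviewed, serviceFeedback) is still returned, just with an empty saleItems array. The flattened query in the question drops such rows instead. The following is only a sketch of one way to get that behavior, assuming that is what is wanted. The alias item and the condensed sample row are introduced here for illustration; the ORDER BY and LIMIT are taken from the question's first query.

```sql
#standardSQL
WITH Sale AS (
  -- Condensed copy of the sample row from the question
  SELECT
    "1" AS _id,
    STRUCT("11" AS _id,
      "SERVICE" AS feedbackType,
      DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS createDate) AS serviceFeedback,
    [STRUCT("host" AS key, "localhost" AS value),
     STRUCT("location" AS key, "Paris" AS value)] AS tags,
    TRUE AS reviewed,
    [STRUCT("1" AS saleId, STRUCT("101" AS _id,
        "PRODUCT" AS feedbackType,
        DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS createDate) AS productFeedback),
     STRUCT("1" AS saleId, STRUCT("102" AS _id,
        "PRODUCT" AS feedbackType,
        DATE(TIMESTAMP("2017-01-20 14:06:51.655")) AS createDate) AS productFeedback)] AS saleItems,
    DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS latestFeedbackDate
)
SELECT *
FROM (
  SELECT
    _id AS saleId,
    serviceFeedback,
    tags AS saleTags,
    reviewed,
    -- Item-level condition only: keep the matching elements of the array
    ARRAY(
      SELECT item FROM UNNEST(saleItems) AS item
      WHERE item.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
    ) AS saleItems,
    latestFeedbackDate
  FROM Sale
  -- Row-level conditions: drop the whole sale when they fail
  WHERE reviewed = TRUE
    AND serviceFeedback._id IS NOT NULL
    AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
)
-- Drop sales for which no item passed the item-level condition
WHERE ARRAY_LENGTH(saleItems) > 0
ORDER BY latestFeedbackDate DESC
LIMIT 20
```

If sales with no matching items should be kept with an empty array, the outer WHERE ARRAY_LENGTH(saleItems) > 0 can simply be left out.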
answered 2018-01-19T15:39:20.430