
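I'm trying to run a scheduled DML query against our Firebase Analytics dataset. (Backstory: we have a lot of extra events that we no longer need, and for now it's easier to delete them from BigQuery.)

I thought I might be able to run a query like this: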


DELETE FROM  `my-project.analytics_1234567890.*`
    WHERE  _TABLE_SUFFIX 
    BETWEEN CONCAT('events_', format_date('%Y%m%d', date_sub(CURRENT_DATE, interval 7 day))) 
        AND CONCAT('events_', format_date('%Y%m%d', CURRENT_DATE))
     AND event_name IN ('not a real event')

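However, BigQuery rejects it with this error: DML over table my-project.analytics_1234567890.* is not supported

Is there a way to reference dynamic table names in a scheduled DML query, so that I can delete specific rows from the last 7 partitions?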


1 Answer


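I wrote an article on how to handle recurring queries with scheduled invocations, using Cloud Workflows to combine simple queries with complex DML statements.

Automate the execution of BigQuery queries with Cloud Workflows

It basically covers your use case:

Workflow steps for your problem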

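  1. Run a query to get the table names
  2. Loop over the results and start a task for each table
  3. Execute the parameterized task for each result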
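
The idea behind these steps is that a wildcard table cannot be the target of a DML statement, so the single wildcard DELETE becomes one DELETE per concrete table. A minimal Python sketch of that expansion follows. The helper names `table_suffixes` and `delete_statements` are hypothetical, and the dataset and event names are taken from the question. Unlike the workflow, this sketch computes the suffixes from the date range instead of querying which tables actually exist, and it assumes the event names are trusted values that need no escaping.

```python
from datetime import date, timedelta


def table_suffixes(today: date, days_back: int = 7, prefix: str = "events_") -> list[str]:
    """Suffixes from (today - days_back) through today, inclusive.

    This mirrors the BETWEEN bounds in the question, so 7 days back
    yields 8 suffixes.
    """
    start = today - timedelta(days=days_back)
    return [f"{prefix}{(start + timedelta(days=n)):%Y%m%d}" for n in range(days_back + 1)]


def delete_statements(dataset: str, suffixes: list[str], event_names: list[str]) -> list[str]:
    """Build one DELETE statement per concrete table.

    Assumption: event_names are trusted literals; no escaping is applied.
    """
    quoted = ", ".join(f"'{name}'" for name in event_names)
    return [
        f"DELETE FROM `{dataset}.{suffix}` WHERE event_name IN ({quoted})"
        for suffix in suffixes
    ]


if __name__ == "__main__":
    suffixes = table_suffixes(date.today())
    for stmt in delete_statements("my-project.analytics_1234567890", suffixes, ["not a real event"]):
        print(stmt)
```

Each generated statement targets a single table, which is the same thing the workflow does when it passes one `table_id` at a time to the parameterized task.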

workflow.yaml

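# Workflow entrypoint: list the matching tables, then run the DELETE on each one.
# Note: this example matches events_intraday_* tables; use the 'events_' prefix for the daily tables from the question.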
main:
  steps:
    - getList:
        call: BQ_Query
        args:
          query: select distinct _TABLE_SUFFIX as table_id FROM  `my-project.analytics_242990349.*`
                    WHERE  _TABLE_SUFFIX 
                    BETWEEN CONCAT('events_intraday_', format_date('%Y%m%d', date_sub(CURRENT_DATE, interval 7 day))) 
                        AND CONCAT('events_intraday_', format_date('%Y%m%d', CURRENT_DATE))
        result: items
    - loopItems:
        call: BQ_Results_LoopItems
        args:
          items: ${items.rows}
        result: res
    - final:
        return: ${res}
# Runs a Standard SQL query through the BigQuery connector and returns the raw response.
BQ_Query:
    params: [query]
    steps:
      - runBQquery: 
          call: googleapis.bigquery.v2.jobs.query
          args:
              projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
              body:
                  useLegacySql: false
                  query: ${query}
          result: queryResult
      - documentFound:
          return: ${queryResult}
# Iterates over the result rows and calls BQ_Task with the first column of each row.
BQ_Results_LoopItems:
  params: [items]
  steps:
    - init:
        assign:
          - i: 0
          - result: ""
    - check_condition:
        switch:
          - condition: ${len(items) > i}
            next: iterate
        next: exit_loop
    - iterate:
        steps:
          - process_item:
              call: BQ_Task
              args:
                table_id: ${items[i].f[0].v}
              result: result
          - assign_loop:
              assign:
                - i: ${i+1}
        next: check_condition
    - exit_loop:
        return: ${result}
# Parameterized task: runs the DELETE against a single concrete table.
BQ_Task:
  params: [table_id]
  steps:
    - delete:
        call: BQ_Query
        args:
          query: ${"DELETE FROM `my-project.analytics_242990349."+table_id+"`
                    WHERE event_name IN ('not a real event')"}
        result: queryResult
    - documentFound:
        return: ${queryResult}
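
The expression `items[i].f[0].v` in the loop reflects how the `jobs.query` response encodes rows: each row is an object with a field list `f`, and each field carries its value under `v`. The Python sketch below walks the same structure on a hand-written sample. The helper name `first_column_values` and the sample payload are hypothetical, and the handling of a missing `rows` key is an assumption about responses with no results, which the workflow above does not guard against.

```python
def first_column_values(query_response: dict) -> list[str]:
    """Extract the first column of every row from a jobs.query-style response.

    Assumption: `rows` may be absent when the query returns nothing,
    so fall back to an empty list.
    """
    rows = query_response.get("rows") or []
    return [row["f"][0]["v"] for row in rows]


# Hand-written sample shaped like the response the workflow iterates over.
sample_response = {
    "rows": [
        {"f": [{"v": "events_intraday_20210517"}]},
        {"f": [{"v": "events_intraday_20210518"}]},
    ]
}

if __name__ == "__main__":
    for table_id in first_column_values(sample_response):
        print(table_id)
```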
Answered 2021-05-18T16:47:33.127