We're building a data warehouse in BigQuery where we generate a large number of data marts using Standard SQL statements. These statements can be quite large and complex. To track data lineage across a chain of dependencies, we'd like to automatically parse the SQL statements and extract every output column, matched with the input table.column(s) it is derived from.

Simple example:

SELECT t1.a, t2.b, t1.a + t2.b AS c FROM table1 t1 JOIN table2 t2 ON t1.a = t2.a

Should end up giving us:

Input       Output
table1.a    a
table2.b    b
table1.a    c
table2.b    c
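To make the target mapping concrete, here is a minimal Python sketch of the alias-resolution step only. It does not parse SQL: `ALIASES` and `SELECT_LIST` are hypothetical, hand-written stand-ins for what a parser's AST would give us for the example query, and `column_lineage` is an illustrative helper name, not part of any library.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for parser output: table aliases from the FROM/JOIN clause.
ALIASES: Dict[str, str] = {"t1": "table1", "t2": "table2"}

# Hypothetical stand-in for parser output: each output column paired with
# the (alias, column) references that appear in its expression.
SELECT_LIST: List[Tuple[str, List[Tuple[str, str]]]] = [
    ("a", [("t1", "a")]),
    ("b", [("t2", "b")]),
    ("c", [("t1", "a"), ("t2", "b")]),  # t1.a + t2.b AS c
]


def column_lineage(
    aliases: Dict[str, str],
    select_list: List[Tuple[str, List[Tuple[str, str]]]],
) -> List[Tuple[str, str]]:
    """Return (input "table.column", output column) pairs."""
    pairs = []
    for output, refs in select_list:
        for alias, column in refs:
            # Resolve the alias; fall back to the name itself if it is not an alias.
            table = aliases.get(alias, alias)
            pairs.append((f"{table}.{column}", output))
    return pairs


LINEAGE = column_lineage(ALIASES, SELECT_LIST)
for source, output in LINEAGE:
    print(f"{source} -> {output}")
```

The hard part, and the reason we need a full-grammar parser, is producing the alias map and the per-column references reliably for nested queries, CTEs, and `SELECT *`.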

We've tried using this: https://www.npmjs.com/package/node-sql-parser, but it comes up short in some of our complex scenarios.

Is there a library, in any language, that can parse a SQL statement and return the AST for the full Standard SQL grammar?

1 Answer

You can use google/zetasql, which is the tool BigQuery uses to parse Standard SQL.

Answered 2020-01-08T09:05:24.397