我面临着一个看似很容易的问题,但我无法解决这个问题以找到合适的解决方案。
问题:
我需要以“奇怪”(双引号中的模式)的方式将模式附加到我的 SQL 语句中。
FROM "SCHEMA".tableB tableB
LEFT JOIN "SCHEMA".tableC tableC
上下文
基本上,我们正在托管和公开一个 Metabase 工具,该工具将使用 Presto SQL 连接并在我们的 Hive 数据库上执行查询。
Metabase 允许客户编写 SQL 语句和一些客户,他们只是不键入语句上的模式。今天我们为这些查询抛出错误,但我可以轻松地从 Authorization 标头中检索模式值,因为在我们的多租户产品中,模式是该用户登录的租户 ID,并且有了这些信息,我可以附加到客户 SQL 语句并避免错误。
假设客户输入了以下语句:
SELECT tableA.*
, (tableA.valorfaturado + tableA.valorcortado) valorpedido
FROM (SELECT from_unixtime(tableB.datacorte / 1000) datacorte
, COALESCE((tableB.quantidadecortada * tableC.preco), 0) valorcortado
, COALESCE((tableB.quantidade * tableC.preco), 0) valorfaturado
, tableB.quantidadecortada
FROM tableB tableB
LEFT JOIN tableC tableC
ON tableC.numeropedido = tableB.numeropedido
AND tableC.codigoproduto = tableB.codigoproduto
AND tableC.codigofilial = tableB.codigofilial
LEFT JOIN tableD tableD
ON tableD.numero = tableB.numeropedido
WHERE (CASE
WHEN COALESCE(tableB.codigofilial, '') = '' THEN
tableD.codigofilial
ELSE
tableB.codigofilial
END) = '10'
AND from_unixtime(tableB.datacorte / 1000) BETWEEN from_iso8601_timestamp('2020-07-01T03:00:00.000Z') AND from_iso8601_timestamp('2020-08-01T02:59:59.999Z')) tableA
ORDER BY datacorte
我应该将其转换为(添加“SCHEMA”):
SELECT tableA.*
, (tableA.valorfaturado + tableA.valorcortado) valorpedido
FROM (SELECT from_unixtime(tableB.datacorte / 1000) datacorte
, COALESCE((tableB.quantidadecortada * tableC.preco), 0) valorcortado
, COALESCE((tableB.quantidade * tableC.preco), 0) valorfaturado
, tableB.quantidadecortada
FROM "SCHEMA".tableB tableB
LEFT JOIN "SCHEMA".tableC tableC
ON tableC.numeropedido = tableB.numeropedido
AND tableC.codigoproduto = tableB.codigoproduto
AND tableC.codigofilial = tableB.codigofilial
LEFT JOIN "SCHEMA".tableD tableD
ON tableD.numero = tableB.numeropedido
WHERE (CASE
WHEN COALESCE(tableB.codigofilial, '') = '' THEN
tableD.codigofilial
ELSE
tableB.codigofilial
END) = '10'
AND from_unixtime(tableB.datacorte / 1000) BETWEEN from_iso8601_timestamp('2020-07-01T03:00:00.000Z') AND from_iso8601_timestamp('2020-08-01T02:59:59.999Z')) tableA
ORDER BY datacorte
仍在尝试找到仅使用presto-parser
访客 + 仪器解决方案的解决方案。另外,我知道 JSQLParser 并且我尝试过,但我总是回来尝试找到一个“简单”的解决方案,害怕 JSQLParser 将无法支持所有与标准 SQL 有点不同的 Presto/Hive 查询;
我在 GitHub 上创建了一个带有测试用例的小项目来验证..
https://github.com/genyherrera/prestosqlerror
但是对于那些不想克隆存储库的人,这里是类和依赖项:
import java.util.Optional;
import com.facebook.presto.sql.SqlFormatter;
import com.facebook.presto.sql.parser.ParsingOptions;
import com.facebook.presto.sql.parser.SqlParser;
public class SchemaAwareQueryAdapter {
// Inspired from
// https://github.com/prestodb/presto/tree/master/presto-parser/src/test/java/com/facebook/presto/sql/parser
private static final SqlParser SQL_PARSER = new SqlParser();
public String rewriteSql(String sqlStatement, String schemaId) {
com.facebook.presto.sql.tree.Statement statement = SQL_PARSER.createStatement(sqlStatement, ParsingOptions.builder().build());
SchemaAwareQueryVisitor visitor = new SchemaAwareQueryVisitor(schemaId);
statement.accept(visitor, null);
return SqlFormatter.formatSql(statement, Optional.empty());
}
}
public class SchemaAwareQueryVisitor extends DefaultTraversalVisitor<Void, Void> {
private String schemaId;
public SchemaAwareQueryVisitor(String schemaId) {
super();
this.schemaId = schemaId;
}
/**
* The customer can type:
* [table name]
* [schema].[table name]
* [catalog].[schema].[table name]
*/
@Override
protected Void visitTable(Table node, Void context) {
List<String> parts = node.getName().getParts();
// [table name] -> is the only one we need to modify, so let's check by parts.size() ==1
if (parts.size() == 1) {
try {
Field privateStringField = Table.class.getDeclaredField("name");
privateStringField.setAccessible(true);
QualifiedName qualifiedName = QualifiedName.of("\""+schemaId+"\"",node.getName().getParts().get(0));
privateStringField.set(node, qualifiedName);
} catch (NoSuchFieldException | SecurityException | IllegalArgumentException | IllegalAccessException e) {
throw new SecurityException("Unable to execute query");
}
}
return null;
}
}
import static org.testng.Assert.assertEquals;
import org.gherrera.prestosqlparser.SchemaAwareQueryAdapter;
import org.testng.annotations.Test;
public class SchemaAwareTest {
private static final String schemaId = "SCHEMA";
private SchemaAwareQueryAdapter adapter = new SchemaAwareQueryAdapter();
@Test
public void testAppendSchemaA() {
String sql = "select * from tableA";
String bound = adapter.rewriteSql(sql, schemaId);
assertEqualsFormattingStripped(bound,
"select * from \"SCHEMA\".tableA");
}
private void assertEqualsFormattingStripped(String sql1, String sql2) {
assertEquals(sql1.replace("\n", " ").toLowerCase().replace("\r", " ").replaceAll(" +", " ").trim(),
sql2.replace("\n", " ").toLowerCase().replace("\r", " ").replaceAll(" +", " ").trim());
}
}
<dependencies>
<dependency>
<groupId>com.facebook.presto</groupId>
<artifactId>presto-parser</artifactId>
<version>0.229</version>
</dependency>
<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<version>6.10</version>
<scope>test</scope>
</dependency>
</dependencies>
PS:我能够在没有双引号的情况下添加架构,但我遇到了identifiers must not start with a digit; surround the identifier with double quotes
错误。基本上这个错误来自SqlParser$PostProcessor.exitDigitIdentifier(...)
方法..
谢谢