I am trying to connect from Azure SQL DW to Parquet files in Data Lake Gen 2 via PolyBase. This is my code:
CREATE DATABASE SCOPED CREDENTIAL DSC_ServicePrincipal
WITH IDENTITY = '1234567890@https://login.microsoftonline.com/1234567890/oauth2/token',
SECRET = '1234567890'
GO
CREATE EXTERNAL DATA SOURCE [DS_ADLS] WITH (TYPE = HADOOP,
LOCATION = N'abfss://filesystem@storageacc.dfs.core.windows.net',
CREDENTIAL = DSC_ServicePrincipal)
GO
CREATE EXTERNAL FILE FORMAT [ParquetFileFormatSnappy]
WITH (FORMAT_TYPE = PARQUET, DATA_COMPRESSION = N'org.apache.hadoop.io.compress.SnappyCodec')
GO
CREATE EXTERNAL TABLE [dbo].[DimDate]
(
[DateSKey] int not null,
[Date] date not null,
[Year] int not null,
[Month] int not null,
[Day] int not null,
[WeekOfYear] int not null,
[MonthNameShort] varchar(50) not null,
[MonthName] varchar(50) not null,
[DayNameShort] varchar(50) not null,
[DayName] varchar(50) not null
)
WITH (
DATA_SOURCE = [DS_ADLS],
LOCATION = N'/PRESENTED/dimDate',
FILE_FORMAT = [ParquetFileFormatSnappy],
REJECT_TYPE = VALUE,
REJECT_VALUE = 0
)
The CREATE EXTERNAL TABLE statement fails with the following error:
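Error occurred while accessing HDFS: Java exception raised on call to HdfsBridge_IsDirExist. Java exception message: HdfsBridge::isDirExist - Unexpected error encountered checking whether directoy exists or not: AbfsRestOperationException: HEAD https://xxxx.dfs.core.windows.net/xxxx?resource=filesystem&timeout=90 StatusCode=403 StatusDescription=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. ErrorCode= ErrorMessage=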
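The directory does exist, and my service principal has access to it. I have confirmed this by using the same service principal from Databricks and reading the files without errors.
I am at a loss as to what I am doing wrong.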