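I created an Azure Function named "transformerfunction", written in Python, that should upload data to and download data from an Azure Data Lake / Storage account. I also turned on the system-assigned managed identity and granted the Function the "Storage Blob Data Contributor" role on my storage account.
To authenticate and download a file, I use the following code, which basically follows these docs: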
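from azure.identity import ChainedTokenCredential, ManagedIdentityCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Authenticate as the Function's system-assigned managed identity
managed_identity = ManagedIdentityCredential()
credential_chain = ChainedTokenCredential(managed_identity)
client = DataLakeServiceClient(account_url, credential=credential_chain)
file_client = client.get_file_client(file_system_container, file_name)
# Download the file and write its bytes into f, a file object opened in binary write mode
downloaded_file = file_client.download_file()
downloaded_file.readinto(f)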
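If my understanding is correct, Azure should authenticate with the Function's identity, and since that identity has the Storage Blob Data Contributor role on the storage account, the download should work.
However, when I invoke the function and look at the logs, this is what I see: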
2020-11-23 20:04:11.396 Function called
2020-11-23 20:04:11.397 ManagedIdentityCredential will use App Service managed identity
2020-11-23 20:04:13.105
Result: Failure Exception: HttpResponseError: This request is not authorized to perform this operation.
RequestId:1f6a2a1c-b01e-0090-26d3-c1d0c0000000 Time:2020-11-23T20:04:13.0679405Z ErrorCode:AuthorizationFailure Error:None Stack:
File "/azure-functions-host/workers/python/3.6/LINUX/X64/azure_functions_worker/dispatcher.py", line 357, in _handle__invocation_request self.__run_sync_func, invocation_id, fi.func, args)
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run result = self.fn(*self.args, **self.kwargs)
File "/azure-functions-host/workers/python/3.6/LINUX/X64/azure_functions_worker/dispatcher.py", line 542, in __run_sync_func return func(**params)
File "/home/site/wwwroot/shared/datalake.py", line 65, in download downloaded_file = client.download_file()
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/filedatalake/_data_lake_file_client.py", line 593, in download_file downloader = self._blob_client.download_blob(offset=offset, length=length, **kwargs)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/core/tracing/decorator.py", line 83, in wrapper_use_tracer return func(*args, **kwargs)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_blob_client.py", line 674, in download_blob return StorageStreamDownloader(**options)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_download.py", line 316, in __init__ self._response = self._initial_request()
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_download.py", line 403, in _initial_request process_storage_error(error)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_shared/response_handlers.py", line 147, in process_storage_error raise error
This clearly shows that the function is not authorized to download the blob. But why? What do I need to do differently?
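One way to narrow down an error like this is to pull the `ErrorCode` value out of the message and compare it against the codes the storage service is known to return. The sketch below does that with a regular expression on the log text shown above. The `HINTS` table and the helper names `extract_error_code` and `hint_for` are illustrative assumptions, not part of any Azure SDK, and the hints are rough rules of thumb rather than an official mapping.

```python
import re

# Assumption: rough rules of thumb for two common 403 codes, not an official mapping.
HINTS = {
    "AuthorizationPermissionMismatch":
        "identity authenticated but lacks the required data role or ACL",
    "AuthorizationFailure":
        "request rejected outright; check firewall / network rules as well as roles",
}


def extract_error_code(log_text):
    """Return the token that follows 'ErrorCode:' in a storage error message, or None."""
    match = re.search(r"ErrorCode:(\w+)", log_text)
    return match.group(1) if match else None


def hint_for(log_text):
    """Look up a hint for the error code found in log_text (hypothetical helper)."""
    code = extract_error_code(log_text)
    return HINTS.get(code, "no hint for this error code")


# The error line from the log above.
sample = (
    "RequestId:1f6a2a1c-b01e-0090-26d3-c1d0c0000000 "
    "Time:2020-11-23T20:04:13.0679405Z ErrorCode:AuthorizationFailure Error:None"
)

print(extract_error_code(sample))
print(hint_for(sample))
```

In this case the code is `AuthorizationFailure`, so the role assignment is not the only thing worth checking; the storage account's network settings are a candidate too.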
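Edit:
I found the cause of the problem: I had restricted access to my Data Lake storage account in its networking settings.
My assumption was that "Allow trusted Microsoft services to access this storage account" would always let Functions running on Azure reach the storage account, regardless of whether any networks were selected or which ones. That is not the case.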
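To make the reasoning concrete, here is a small, simplified model of how a storage firewall with a "deny by default" policy might decide whether to let a request through. It is only a sketch: the real evaluation happens inside the storage service, and the class `StorageFirewall`, its fields, the service names, and the IP ranges are all hypothetical. The point it illustrates is that the trusted-services exception only helps callers that are on the trusted list, so a Function that is neither on that list nor coming from an allowed IP range or subnet is rejected even when its identity holds the right role.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class StorageFirewall:
    """Simplified, hypothetical model of a storage account's network rules."""
    default_action: str = "Deny"  # "Allow" would mean access from all networks
    allowed_ip_ranges: list = field(default_factory=list)  # CIDR strings
    allowed_subnets: set = field(default_factory=set)      # made-up subnet ids
    bypass_trusted_services: bool = True
    # Stand-in for the fixed list of trusted services (assumption for illustration).
    trusted_services: frozenset = frozenset({"example-trusted-service"})

    def allows(self, source_ip, subnet_id=None, service=None):
        """Return True if a request from this source would pass the network rules."""
        if self.default_action == "Allow":
            return True
        if self.bypass_trusted_services and service in self.trusted_services:
            return True
        if subnet_id is not None and subnet_id in self.allowed_subnets:
            return True
        addr = ipaddress.ip_address(source_ip)
        return any(addr in ipaddress.ip_network(r) for r in self.allowed_ip_ranges)


fw = StorageFirewall(allowed_ip_ranges=["203.0.113.0/24"])

# A Function calling from an address outside the allowed range: blocked,
# even though the trusted-services bypass is switched on.
print(fw.allows("198.51.100.7", service="azure-functions"))

# The same kind of request from an address inside the allowed range: let through.
print(fw.allows("203.0.113.10", service="azure-functions"))
```

Under this model, role assignments never enter the picture: the network check comes first, which matches the `AuthorizationFailure` seen in the log despite the correct role.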