I am using the AzureR family of packages to access Azure Data Lake Storage from RStudio. I set up the connection with the following script:
library(AzureRMR)
library(AzureStor)
# setup connections
az <- az_rm$new(tenant="my_tenant_id",
                app="my_app_id",
                password="my_password")
sub <- az$get_subscription("my_subscription_id")
rg <- sub$get_resource_group("my_resource_group_name")
stor <- rg$get_resource(type="Microsoft.Storage/storageAccounts",
                        name="my_datalake_account_name")
stor$do_operation("listKeys", http_verb="POST")
The connection works fine and I get the following result:
attr(,"status")
[1] 200
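For context, the key="my_key" value used further down comes from that listKeys call. A minimal sketch of how I pull it out, assuming the response is parsed into a list with the usual keys structure (a list of keyName/value pairs); the acct_key name is just for illustration:
res <- stor$do_operation("listKeys", http_verb="POST")  # parsed listKeys response
acct_key <- res$keys[[1]]$value                         # first account access key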
I then use the following script to upload a file to the ADLS filesystem and download it back:
fs <- adls_filesystem(
  "https://my_datalake_account_name.dfs.core.windows.net/my_file_system_name",
  key="my_key"
)
# create new directory
create_adls_dir(fs, "/newdir")
upload_adls_file(
  fs, src = "I:/lookup.csv",
  dest = "/newdir/lookup.csv"
)
download_adls_file(
  fs, src = "/newdir/lookup.csv",
  dest = "J:/lookup.csv"
)
The upload works fine, but the download fails with the following error:
Connection error, retrying (1 of 10)
Connection error, retrying (2 of 10)
Connection error, retrying (3 of 10)
Connection error, retrying (4 of 10)
Connection error, retrying (5 of 10)
Connection error, retrying (6 of 10)
Connection error, retrying (7 of 10)
Connection error, retrying (8 of 10)
Connection error, retrying (9 of 10)
Connection error, retrying (10 of 10)
Error in curl::curl_fetch_memory(url, handle = handle) :
Send failure: Connection was reset
I currently have two servers to work with; the goal is to switch to the new server and retire the old one. The script runs fine on the old server, and the CSV file is very small, so the upload and download finish within a few seconds. On the new server, however, the upload works while the download fails. Any ideas what could cause this? I wonder whether some system setting differs between the two servers, but I am quite new to Data Lake. Any help would be greatly appreciated!
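In case it helps, here is roughly what I plan to compare between the two servers, on the assumption that the difference is network-related (proxy, firewall, or libcurl configuration) or tied to the J: destination drive; the calls below use base R, the curl package, and the same AzureStor function as above:
# compare proxy-related environment variables on both servers
Sys.getenv(c("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"))
# compare libcurl version and SSL backend
curl::curl_version()
# check the Windows system proxy that would apply to the ADLS endpoint
curl::ie_get_proxy_for_url("https://my_datalake_account_name.dfs.core.windows.net")
# rule out the destination drive by downloading to a local temp file
download_adls_file(fs, src = "/newdir/lookup.csv", dest = tempfile(fileext = ".csv"))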