azure - Does Azure ML Studio support importing an Excel file as a dataset?

Question

I am using Azure ML Studio and trying to upload an Excel file as a dataset. However, I don't see an option for it. Am I missing something?

score 0 · Accepted Answer

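It sounds like you want to read an Excel file in an Execute Python Script module of an experiment in Azure Machine Learning Studio. According to the official document [Execute Python machine learning scripts in Azure Machine Learning Studio][1], there are two ways to do this: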

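1. Upload the Excel file to Azure Blob Storage, then follow the section Accessing Azure Storage Blobs to read it with the Azure Blob Storage SDK for Python.
2. Follow the section Importing existing Python script modules to package the Excel file, together with any other required Python packages, as a zip file. Azure ML Studio automatically extracts the zip file into a directory named Script Bundle, and you can read the Excel file from there.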
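For the first option, a minimal sketch might look like the following. It assumes the legacy azure-storage SDK (the `BlockBlobService` class) and xlrd are both importable in the script environment; the function name `read_excel_sheet_names` and its parameters are hypothetical and not part of the original answer.

```python
def read_excel_sheet_names(account_name, account_key, container, blob_name):
    """Download an Excel blob into memory and return its sheet names."""
    # Assumption: the legacy azure-storage SDK (BlockBlobService) is installed,
    # not the newer v12 SDK, which exposes a different client API.
    from azure.storage.blob import BlockBlobService
    import xlrd

    service = BlockBlobService(account_name=account_name, account_key=account_key)
    # get_blob_to_bytes returns a Blob object whose .content holds the raw bytes.
    blob = service.get_blob_to_bytes(container, blob_name)
    # xlrd can parse a workbook directly from bytes, so no temporary file is needed.
    workbook = xlrd.open_workbook(file_contents=blob.content)
    return workbook.sheet_names()
```

Inside `azureml_main`, this could be called with your own storage account name, key, container, and blob name.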

For reference, here are the detailed steps for the second solution.

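1. I prepared an Excel file named test.xlsx.
2. I downloaded the xlrd package file xlrd-1.2.0-py2.py3-none-any.whl from its PyPi.org page, extracted its contents into a directory named test, and then compressed the files in that directory together with test.xlsx into a zip file named test.zip.
3. I uploaded the zip file test.zip as a dataset to Azure ML Studio and attached it to an Execute Python Script module.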
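The packaging in step 2 can also be scripted. The sketch below is one possible shortcut, not the procedure used above: since a .whl file is itself a zip archive, it copies the wheel's members straight into the bundle instead of extracting them to an intermediate directory first. The function name `build_script_bundle` is hypothetical.

```python
import os
import zipfile

def build_script_bundle(wheel_path, excel_path, bundle_path):
    """Write a zip holding the wheel's contents plus the Excel file at its root."""
    with zipfile.ZipFile(bundle_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        # A .whl file is a zip archive, so its members can be copied directly.
        with zipfile.ZipFile(wheel_path) as wheel:
            for member in wheel.namelist():
                if member.endswith("/"):
                    continue  # skip explicit directory entries, if any
                bundle.writestr(member, wheel.read(member))
        # Store the workbook under its bare file name so it sits at the bundle root.
        bundle.write(excel_path, arcname=os.path.basename(excel_path))
    return bundle_path
```

For example, `build_script_bundle('xlrd-1.2.0-py2.py3-none-any.whl', 'test.xlsx', 'test.zip')` would produce a zip with the xlrd package and test.xlsx side by side at the top level.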

Here is my sample code. I used os.getcwd(), os.listdir(), os.listdir('Script Bundle') and the logs to find the correct path for reading the files from the zip file.

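import pandas as pd

def azureml_main(dataframe1 = None, dataframe2 = None):
    import os
    # Log the working directory and its contents to locate the extracted zip file.
    print(os.getcwd())
    print(os.listdir())
    print(os.listdir('Script Bundle'))

    # xlrd comes from the wheel contents packaged in test.zip.
    import xlrd
    file = 'Script Bundle/test.xlsx'
    data = xlrd.open_workbook(file)
    print([sheet.name for sheet in data.sheets()])

    print('Input pandas.DataFrame #1:\r\n\r\n{0}'.format(dataframe1))

    # The module expects a sequence of DataFrames, hence the trailing comma.
    return dataframe1,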

It works with Anaconda 4.0/Python 3.5.
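If you would rather not hard-code the path, or want the sheet contents returned as the module's output, the following sketch may help. Both helper names (`find_bundled_file`, `rows_to_records`) are hypothetical, and the code assumes the first row of the sheet is a header row.

```python
import os

def find_bundled_file(filename, root="."):
    """Return the first path under root whose file name equals filename, or None."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            return os.path.join(dirpath, filename)
    return None

def rows_to_records(rows):
    """Treat the first row as the header and map each remaining row to a dict."""
    if not rows:
        return []
    header = [str(cell) for cell in rows[0]]
    return [dict(zip(header, row)) for row in rows[1:]]

# Possible use inside azureml_main (assumes xlrd and pandas are importable):
#   path = find_bundled_file('test.xlsx')
#   sheet = xlrd.open_workbook(path).sheet_by_index(0)
#   rows = [sheet.row_values(i) for i in range(sheet.nrows)]
#   return pd.DataFrame(rows_to_records(rows)),
```

Building the DataFrame from plain row lists keeps the only Excel-specific dependency in xlrd, which is the package bundled in the zip file.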

Hope it helps.
