
I am trying to create an Azure DevOps ML pipeline. The following code runs without problems in Jupyter Notebooks, but when I run it in Azure DevOps I get this error:

Traceback (most recent call last):
  File "src/my_custom_package/data.py", line 26, in <module>
    ws = Workspace.from_config()
  File "/opt/hostedtoolcache/Python/3.8.7/x64/lib/python3.8/site-packages/azureml/core/workspace.py", line 258, in from_config
    raise UserErrorException('We could not find config.json in: {} or in its parent directories. '
azureml.exceptions._azureml_exception.UserErrorException: UserErrorException:
    Message: We could not find config.json in: /home/vsts/work/1/s or in its parent directories. Please provide the full path to the config file or ensure that config.json exists in the parent directories.
    InnerException None
    ErrorResponse 
{
    "error": {
        "code": "UserError",
        "message": "We could not find config.json in: /home/vsts/work/1/s or in its parent directories. Please provide the full path to the config file or ensure that config.json exists in the parent directories."
    }
}

The code is:

#import
from sklearn.model_selection import train_test_split
from azureml.core.workspace import Workspace
from azureml.train.automl import AutoMLConfig
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.core.experiment import Experiment
from datetime import date
from azureml.core import Workspace, Dataset



import pandas as pd
import numpy as np
import logging

#getdata
subscription_id = 'mysubid'
resource_group = 'myrg'
workspace_name = 'mlplayground'
workspace = Workspace(subscription_id, resource_group, workspace_name)
dataset = Dataset.get_by_name(workspace, name='correctData')


#auto ml
ws = Workspace.from_config()


automl_settings = {
    "iteration_timeout_minutes": 2880,
    "experiment_timeout_hours": 48,
    "enable_early_stopping": True,
    "primary_metric": 'spearman_correlation',
    "featurization": 'auto',
    "verbosity": logging.INFO,
    "n_cross_validations": 5,
    "max_concurrent_iterations": 4,
    "max_cores_per_iteration": -1,
}



cpu_cluster_name = "computecluster"
compute_target = ComputeTarget(workspace=ws, name=cpu_cluster_name)
print(compute_target)
automl_config = AutoMLConfig(task='regression',
                             compute_target = compute_target,
                             debug_log='automated_ml_errors.log',
                             training_data = dataset,
                             label_column_name="paidInDays",
                             **automl_settings)

today = date.today()
d4 = today.strftime("%b-%d-%Y")

experiment = Experiment(ws, "myexperiment"+d4)
remote_run = experiment.submit(automl_config, show_output = True)

from azureml.widgets import RunDetails
RunDetails(remote_run).show()

remote_run.wait_for_completion()

2 Answers


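You need to pass the path of the config file to Workspace.from_config(). The page https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.workspace.workspace?view=azure-ml-py gives the following instructions for creating the config file.

Create a workspace: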

from azureml.core import Workspace
ws = Workspace.create(name='myworkspace',
           subscription_id='<azure-subscription-id>',
           resource_group='myresourcegroup',
           create_resource_group=True,
           location='eastus2'
           )

Save the workspace configuration:

ws.write_config(path="./file-path", file_name="config.json")

Load the configuration from the default path:

ws = Workspace.from_config()
ws.get_details()

Or load the configuration from a specified path:

ws = Workspace.from_config(path="my/path/config.json")

More details on creating a workspace with from_config can be found here: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.workspace.workspace?view=azure-ml-py#from-config-path-none--auth-none---logger-none---file-name-none-
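To make the error message more concrete, here is a minimal sketch of the kind of lookup from_config() performs and of one way to make it succeed on a build agent. It approximates the SDK's behaviour and is not its actual code. The function names find_workspace_config and write_workspace_config are hypothetical, and the list of sub-directories checked is an assumption based on where the SDK usually keeps config.json.

```python
import json
from pathlib import Path
from typing import Optional

# Sub-directories checked at each level (assumption: approximates the SDK lookup).
CANDIDATE_DIRS = (".azureml", "aml_config", "")


def find_workspace_config(start: Path, file_name: str = "config.json") -> Optional[Path]:
    """Walk from `start` up to the filesystem root; return the first config file found."""
    start = start.resolve()
    for directory in (start, *start.parents):
        for sub in CANDIDATE_DIRS:
            candidate = directory / sub / file_name
            if candidate.is_file():
                return candidate
    return None


def write_workspace_config(target_dir: Path, subscription_id: str,
                           resource_group: str, workspace_name: str) -> Path:
    """Write <target_dir>/.azureml/config.json with the three workspace keys."""
    config_path = target_dir / ".azureml" / "config.json"
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps({
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }, indent=4))
    return config_path
```

On the agent, nothing under /home/vsts/work/1/s contains such a file, so the lookup comes back empty. Committing a config.json to the repository, or writing one in an earlier pipeline step as write_workspace_config does, should let Workspace.from_config() find it.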

answered 2021-02-05T18:57:20.480

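Something odd is going on in your code: you get the data from one workspace (workspace = Workspace(subscription_id, resource_group, workspace_name)) and then use the resources of a second workspace (ws = Workspace.from_config()). I suggest not making the code depend on two different workspaces, especially since an underlying data source can be registered (linked) to multiple workspaces (documentation).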

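In general, using a config.json file when instantiating a Workspace object results in interactive authentication. When your code runs, you will see a log message asking you to visit a specific URL and enter a code. This uses your Microsoft account to verify that you have access to the Azure resource (in this case, your Workspace('mysubid', 'myrg', 'mlplayground')). This approach has its limits once you start deploying your code to virtual machines or agents, where you will not always be there to check the logs, visit the URL, and authenticate.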

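For this reason, it is strongly recommended to set up a more advanced authentication method. I personally recommend a service principal because, when done correctly, it is simple, convenient, and secure. You can follow Azure's official documentation here.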
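A minimal sketch of what service principal authentication could look like in the pipeline script. The environment variable names (AML_TENANT_ID, AML_CLIENT_ID, AML_CLIENT_SECRET) and the helper names are hypothetical; the assumption is that the credentials are stored as secret pipeline variables and exposed to the script as environment variables.

```python
import os

# Hypothetical variable names; define them as secret variables in the pipeline.
REQUIRED_VARS = ("AML_TENANT_ID", "AML_CLIENT_ID", "AML_CLIENT_SECRET")


def read_service_principal_settings(env=os.environ):
    """Collect the service principal credentials, failing early if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "tenant_id": env["AML_TENANT_ID"],
        "service_principal_id": env["AML_CLIENT_ID"],
        "service_principal_password": env["AML_CLIENT_SECRET"],
    }


def get_workspace(subscription_id, resource_group, workspace_name, env=os.environ):
    """Return a Workspace authenticated non-interactively with a service principal."""
    # Imported lazily so the helper above works without azureml-core installed.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(**read_service_principal_settings(env))
    return Workspace(subscription_id=subscription_id,
                     resource_group=resource_group,
                     workspace_name=workspace_name,
                     auth=auth)
```

The single object returned by get_workspace('mysubid', 'myrg', 'mlplayground') can then be passed to both Dataset.get_by_name and ComputeTarget, so the script no longer depends on config.json or on two separate workspace objects.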

answered 2021-02-09T10:17:20.130