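I submitted an AutoML run on remote compute (Standard_D12_v2, a 4-node cluster with 28 GB RAM and 4 cores per node).

My input file is about 350 MB.

The status stayed at "Preparing" for more than 2 hours, and then the run failed.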

User error: Run timed out. No model completed training in the specified time. Possible solutions: 
1) Please check if there are enough compute resources to run the experiment. 
2) Increase experiment timeout when creating a run. 
3) Subsample your dataset to decrease featurization/training time. 
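
For reference, suggestion 3 could be tried locally before uploading. The following is only a minimal sketch using the Python standard library, not something from the Azure ML SDK: the helper name `subsample_csv` and the sample size are hypothetical, and it assumes the CSV has a single header row. It keeps a uniform random sample of `k` rows with reservoir sampling, so the whole 350 MB file never has to fit in memory.

```python
import csv
import io
import random


def subsample_csv(src, dst, k, seed=223):
    """Write the header plus a uniform random sample of up to k data rows.

    src and dst are open text file objects (hypothetical helper, not an
    Azure ML API). Returns the number of data rows written.
    """
    rng = random.Random(seed)
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)  # assumes the first row is the header

    reservoir = []
    for i, row in enumerate(reader):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row

    writer.writerow(header)
    writer.writerows(reservoir)
    return len(reservoir)


# Small in-memory demonstration; a real run would open the CSV files instead.
demo_src = io.StringIO("a,b\n" + "".join(f"{i},{i * 2}\n" for i in range(100)))
demo_dst = io.StringIO()
rows_written = subsample_csv(demo_src, demo_dst, k=10)
```

If the data is already registered as a tabular dataset, the SDK's `take_sample(probability=..., seed=...)` method on the dataset may achieve the same thing without a local copy, but I have not verified that it shortens the "Preparing" phase.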

Below is my Python notebook code. Please help.

import logging  # needed for logging.INFO in automl_settings

import azureml.core
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.core.dataset import Dataset
from azureml.core.compute import ComputeTarget
from azureml.train.automl import AutoMLConfig



ws = Workspace.from_config()
experiment=Experiment(ws, 'nyc-taxi')




cpu_cluster_name = "low-cluster"
compute_target = ComputeTarget(workspace=ws, name=cpu_cluster_name)


data = "https://betaml4543906917.blob.core.windows.net/betadata/2015_08.csv"
dataset = Dataset.Tabular.from_delimited_files(data)
training_data, validation_data = dataset.random_split(percentage=0.8, seed=223)
label_column_name = 'totalAmount'



automl_settings = {
    "n_cross_validations": 3,
    "primary_metric": 'normalized_root_mean_squared_error',
    "enable_early_stopping": True,
    "max_concurrent_iterations": 2, # This is a limit for testing purpose, please increase it as per cluster size
    "experiment_timeout_hours": 2, # This is a time limit for testing purposes, remove it for real use cases, this will drastically limit ability to find the best model possible
    "verbosity": logging.INFO,
}

automl_config = AutoMLConfig(task = 'regression',
                             debug_log = 'automl_errors.log',
                             compute_target = compute_target,
                             training_data = training_data,
                             label_column_name = label_column_name,
                             **automl_settings
                            )




remote_run = experiment.submit(automl_config, show_output = False)