I created a pipeline with several steps (azureml-defaults==1.23.0) that uses PipelineParameter values.
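When I run the published pipeline from the studio, the Databricks step always uses the default values, no matter which values I choose when submitting the pipeline.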
parser.add_argument('--import_date', type=str, default = "2021-04-23")
....
....
import_date = PipelineParameter(name="import_date" , default_value = params.import_date)
cluster_id = PipelineParameter(name="cluster_id" , default_value = params.cluster_id)
step_type = PipelineParameter(name="step_type" , default_value = params.step_type)
churn_months = PipelineParameter(name="churnMonths" , default_value = params.churnMonths)
data_import_step = DatabricksStep(name="Databricks Data Import Step",
                                  existing_cluster_id=str(cluster_id.default_value),
                                  notebook_path=import_notebook_path,
                                  notebook_params={'ChurnMonthsWidget': churn_months,
                                                   'startDateWidget': port_start_date,
                                                   'ImportDateWidget': import_date,
                                                   'StepTypeWidget': step_type},
                                  run_name='Job_Data_Import',
                                  compute_target=databricks_compute,
                                  allow_reuse=False)
.....
.....
.....
pipeline_steps = StepSequence(steps=[
    data_import_step,                  # Step 1
    data_manipulation_step,            # Step 2
    data_extraction_step,              # Step 3
    training_data_preparation_step,    # Step 4
    model_training_step,               # Step 5
    prediction_data_preparation_step,  # Step 6
    prediction_step,                   # Step 7
])
pipeline = Pipeline(workspace = ws, steps=pipeline_steps)
published_pipeline = pipeline.publish(name=params.pipeline_name,
                                      description=params.pipeline_description)
The default value of import_date is 2021-04-23. Even when I set the import_date parameter to "2021-04-22" at submission, the Databricks notebook still receives 2021-04-23 as import_date.
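To make my expectation explicit, here is a minimal sketch using a hypothetical stand-in class. `StandInParameter` and `resolve` are illustrative names, not part of the azureml SDK, and the cluster IDs are placeholders. It models only how I expect the substitution to behave, not how the SDK implements it: an argument that keeps the parameter object should pick up the submitted value, while `str(cluster_id.default_value)` is already a plain string when the pipeline is built.

```python
from dataclasses import dataclass


@dataclass
class StandInParameter:
    """Hypothetical stand-in for a pipeline parameter; not the azureml class."""
    name: str
    default_value: str


def resolve(value, overrides):
    """Resolve one step argument at submission time.

    Parameter objects are looked up in the submitted overrides and fall back
    to their default; plain strings were fixed at build time and pass through.
    """
    if isinstance(value, StandInParameter):
        return overrides.get(value.name, value.default_value)
    return value


import_date = StandInParameter("import_date", "2021-04-23")
cluster_id = StandInParameter("cluster_id", "my-cluster")  # placeholder value

# Mirrors the step definition: one argument keeps the parameter object,
# the other is converted to a plain string when the pipeline is built.
step_args = {
    "ImportDateWidget": import_date,
    "existing_cluster_id": str(cluster_id.default_value),
}

# Values chosen when submitting the published pipeline (placeholders).
submitted = {"import_date": "2021-04-22", "cluster_id": "other-cluster"}
resolved = {key: resolve(value, submitted) for key, value in step_args.items()}
print(resolved)
```

Under this model, ImportDateWidget should resolve to 2021-04-22, and only existing_cluster_id should stay at its default. In my published pipeline, ImportDateWidget stays at the default as well.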
The Databricks notebook defines the following widgets:
from datetime import date  # needed for date.today()

today = str(date.today())
dbutils.widgets.text("ImportDateWidget", today, label = "ImportDate")
import_date = dbutils.widgets.get("ImportDateWidget")
startDate = "2019-01-01"
dbutils.widgets.text("startDateWidget", startDate, label = "startDate")
start_date = dbutils.widgets.get("startDateWidget")
ChurnMonths = 3
dbutils.widgets.text("ChurnMonthsWidget", str(ChurnMonths), label = "ChurnMonths")
step_type = "Training"
dbutils.widgets.text("StepTypeWidget", step_type, label = "StepType")
N_MONTHS_CHURN = int(dbutils.widgets.get("ChurnMonthsWidget"))
step_type = dbutils.widgets.get("StepTypeWidget")