I am trying to use the built-in Databricks operators provided by Airflow, such as DatabricksSubmitRunOperator or DatabricksRunNowOperator, but I cannot get them to work. They fail with the following error:
API requests to Databricks failed 1 time with the error: HTTPSConnectionPool(host='usdev.databaricks.xyz.com', port=443): Max retries exceeded with url: /api/2.0/jobs/run-now (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1091)')))
If I were writing custom Python code, I could simply pass verify=False to the requests module. But since I am using these built-in operators, I don't see any option to disable SSL verification for the connection from Airflow to Databricks.
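For context, this is the kind of hand-written call I mean. It is a minimal sketch, not the operator's internals: the session and the commented-out endpoint are illustrative only, with the host and job_id taken from the error and DAG above.

```python
import requests

# With custom code, TLS certificate verification can be disabled by
# setting verify=False on the request or on a shared session.
session = requests.Session()
session.verify = False  # skip certificate verification (self-signed cert)

# The actual call would then look something like (illustrative only):
# session.post(
#     "https://usdev.databaricks.xyz.com/api/2.0/jobs/run-now",
#     headers={"Authorization": "Bearer <token>"},
#     json={"job_id": 12345},
# )
```

The built-in Databricks operators build their own HTTP session internally, so there is no obvious place to set this flag.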
Sample code being used:
from airflow import DAG
from airflow.operators.dummy import DummyOperator
from airflow.providers.databricks.operators.databricks import (
    DatabricksRunNowOperator,
)
from datetime import datetime, timedelta

"""
SUCCESS SCENARIO.
- It will run the job_id=14, that is already created in Databricks.
- Basically, it will trigger what that job is supposed to do (that is, run a notebook)
"""

# Define params for Run Now Operator
notebook_params = {"Variable": 5}

with DAG(
    "databricks_dag_run_now",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={
        "email_on_failure": False,
        "email_on_retry": False,
        "retry_delay": timedelta(minutes=2),
    },
) as dag:
    t0 = DummyOperator(task_id="start")

    opr_run_now = DatabricksRunNowOperator(
        task_id="run_now",
        databricks_conn_id="custom_databricks_conn",
        job_id=12345,
        notebook_params=notebook_params,
    )

    t0 >> opr_run_now