
I have configured the Databricks CLI locally and am able to connect to the Azure Databricks cluster. Reference link used to set up my workstation: git

  • The command below lists the jobs successfully, along with their IDs:
$ databricks jobs list --profile dev

Say I want to update only the schedule (cron expression) of a specific job that is already deployed in the workspace. I don't see any option to do this using the Databricks CLI.

Note: In my case, the jobs are created in the cluster from a job definition JSON. This JSON doesn't have the schedule info to start with.

Are there any options available to update only the schedule after the job is created or deployed in the workspace?

There is an option to run the job immediately: databricks jobs run-now.

The REST API reference: https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsCreate


1 Answer


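The Databricks jobs CLI supports calls to two versions of the Databricks Jobs REST API: versions 2.1 and 2.0. Version 2.1 adds support for orchestrating jobs with multiple tasks; see "Jobs with multiple tasks" and "Jobs API updates".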

Going through the MS docs, I see that you can use an update request to change an existing job.

databricks jobs update --job-id 246 --json-file update-job.json

Put the schedule block in the input JSON file.

"schedule": {
  "quartz_cron_expression": "0 0 0 * * ?",
  "timezone_id": "US/Pacific",
  "pause_status": "UNPAUSED"
}
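
If you would rather call the REST API directly, the same change can be sketched in Python. This is only a minimal sketch, assuming the Jobs API `jobs/update` endpoint, which takes a `job_id` plus a `new_settings` object and changes only the fields you pass in. The helper names `build_schedule_update` and `post_schedule_update` are made up for illustration, and the host and token are placeholders you would supply from your own workspace.

```python
import json
import urllib.request


def build_schedule_update(job_id, cron, timezone_id="UTC", pause_status="UNPAUSED"):
    """Build a jobs/update request body that touches only the schedule."""
    return {
        "job_id": job_id,
        "new_settings": {
            "schedule": {
                "quartz_cron_expression": cron,
                "timezone_id": timezone_id,
                "pause_status": pause_status,
            }
        },
    }


def post_schedule_update(host, token, payload):
    """POST the payload to the workspace (hypothetical helper, not called here)."""
    request = urllib.request.Request(
        url=host.rstrip("/") + "/api/2.1/jobs/update",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Same job id and schedule values as in the answer above.
payload = build_schedule_update(246, "0 0 0 * * ?", timezone_id="US/Pacific")
print(json.dumps(payload, indent=2))

# To actually send it, pass your own workspace URL and personal access token:
# post_schedule_update("https://<your-workspace-host>", "<your-token>", payload)
```

Since only `schedule` appears under `new_settings`, the rest of the job definition should be left alone, which fits the case where the original job JSON had no schedule.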

See: MS DOC - an example JSON document representing a single-task format job for API 2.0:

{
  "job_id": 27,
  "settings": {
    "name": "Example notebook",
    "existing_cluster_id": "1201-my-cluster",
    "libraries": [
      {
        "jar": "dbfs:/FileStore/jars/spark_examples.jar"
      }
    ],
    "email_notifications": {},
    "timeout_seconds": 0,
    "schedule": {
      "quartz_cron_expression": "0 0 0 * * ?",
      "timezone_id": "US/Pacific",
      "pause_status": "UNPAUSED"
    },
    "notebook_task": {
      "notebook_path": "/notebooks/example-notebook",
      "revision_timestamp": 0
    },
    "max_concurrent_runs": 1,
    "format": "SINGLE_TASK"
  },
  "created_time": 1504128821443,
  "creator_user_name": "user@databricks.com"
}
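
To check whether a job already has a schedule (for example after an update), you can read the `settings.schedule` block out of a job document shaped like the one above. A minimal sketch, assuming the document has already been saved or fetched as JSON; `get_schedule` is a hypothetical helper name.

```python
import json


def get_schedule(job_doc):
    """Return the schedule block of a job document, or None if it has none."""
    return job_doc.get("settings", {}).get("schedule")


# Trimmed-down version of the example document above.
job_json = """
{
  "job_id": 27,
  "settings": {
    "name": "Example notebook",
    "schedule": {
      "quartz_cron_expression": "0 0 0 * * ?",
      "timezone_id": "US/Pacific",
      "pause_status": "UNPAUSED"
    }
  }
}
"""

schedule = get_schedule(json.loads(job_json))
if schedule is None:
    print("job has no schedule")
else:
    print(schedule["quartz_cron_expression"], schedule["timezone_id"])
```

A job created from a definition without a schedule block simply has no `schedule` key under `settings`, so the helper returns None for it.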
Answered 2021-12-14T03:38:08.887