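Related question: Terraform Databricks AWS instance profile - "authentication is not configured for provider"

After resolving the error in that question and moving on, I started hitting the following error on several different operations (creating a Databricks instance profile, querying Terraform Databricks data sources such as databricks_current_user or databricks_spark_version, etc.):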

Error: cannot create instance profile: Databricks API (/api/2.0/instance-profiles/add) requires you to set `host` property (or DATABRICKS_HOST env variable) to result of `databricks_mws_workspaces.this.workspace_url`. This error may happen if you're using provider in both normal and multiworkspace mode. Please refactor your code into different modules. Runnable example that we use for integration testing can be found in this repository at https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/guides/aws-workspace

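I am able to create the instance profile manually in the Databricks workspace admin console, and I can create clusters and run notebooks in that workspace.

Relevant code: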


main.tf:
module "create-workspace" {
  source = "./modules/create-workspace"

  env                         = var.env
  region                      = var.region
  databricks_host             = var.databricks_host
  databricks_account_username = var.databricks_account_username
  databricks_account_password = var.databricks_account_password
  databricks_account_id       = var.databricks_account_id
}

providers-main.tf:
terraform {
  required_version = ">= 1.1.0"

  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.4.4"
    }
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.49.0"
    }
  }
}

provider "aws" {
  region = var.region
  profile = var.aws_profile
}

provider "databricks" {
  host  = var.databricks_host
  token = var.databricks_manually_created_workspace_token
}

modules/create-workspace/providers.tf:
terraform {
  required_version = ">= 1.1.0"

  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.4.4"
    }
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.49.0"
    }
  }
}

provider "aws" {
  region = var.region
  profile = var.aws_profile
}

provider "databricks" {
  host  = var.databricks_host
  # token = var.databricks_manually_created_workspace_token - doesn't make a difference switching from username/password to token
  username = var.databricks_account_username
  password = var.databricks_account_password
  account_id = var.databricks_account_id
}

provider "databricks" {
  alias    = "mws"
  # host     = 
  username = var.databricks_account_username
  password = var.databricks_account_password
  account_id = var.databricks_account_id
}

modules/create-workspace/databricks-workspace.tf:
resource "databricks_mws_credentials" "this" {
  provider         = databricks.mws
  account_id       = var.databricks_account_id
  role_arn         = aws_iam_role.cross_account_role.arn
  credentials_name = "${local.prefix}-creds"
  depends_on       = [aws_iam_role_policy.this]
}

resource "databricks_mws_workspaces" "this" {
  provider        = databricks.mws
  account_id      = var.databricks_account_id
  aws_region      = var.region
  workspace_name  = local.prefix
  deployment_name = local.prefix

  credentials_id           = databricks_mws_credentials.this.credentials_id
  storage_configuration_id = databricks_mws_storage_configurations.this.storage_configuration_id
  network_id               = databricks_mws_networks.this.network_id

}

modules/create-workspace/IAM.tf:
data "databricks_aws_assume_role_policy" "this" {
  external_id = var.databricks_account_id
}

resource "aws_iam_role" "cross_account_role" {
  name               = "${local.prefix}-crossaccount"
  assume_role_policy = data.databricks_aws_assume_role_policy.this.json
}

resource "time_sleep" "wait" {
  depends_on      = [aws_iam_role.cross_account_role]
  create_duration = "10s"
}

data "databricks_aws_crossaccount_policy" "this" {}

resource "aws_iam_role_policy" "this" {
  name   = "${local.prefix}-policy"
  role   = aws_iam_role.cross_account_role.id
  policy = data.databricks_aws_crossaccount_policy.this.json
}

data "aws_iam_policy_document" "pass_role_for_s3_access" {
  statement {
    effect    = "Allow"
    actions   = ["iam:PassRole"]
    resources = [aws_iam_role.cross_account_role.arn]
  }
}

resource "aws_iam_policy" "pass_role_for_s3_access" {
  name   = "databricks-shared-pass-role-for-s3-access"
  path   = "/"
  policy = data.aws_iam_policy_document.pass_role_for_s3_access.json
}

resource "aws_iam_role_policy_attachment" "cross_account" {
  policy_arn = aws_iam_policy.pass_role_for_s3_access.arn
  role       = aws_iam_role.cross_account_role.name
}

resource "aws_iam_instance_profile" "shared" {
  name = "databricks-shared-instance-profile"
  role = aws_iam_role.cross_account_role.name
}

resource "databricks_instance_profile" "shared" {
  instance_profile_arn = aws_iam_instance_profile.shared.arn
  depends_on = [databricks_mws_workspaces.this]
}


1 Answer


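In this case, the problem is that you need two Databricks providers:

  1. One to provision the Databricks workspace itself - it uses the account ID, username and password
  2. One to provision resources inside the Databricks workspace - it uses the host and token

One of these providers has to be declared with an alias so that Terraform can tell it apart from the other. The Databricks provider documentation shows how to do that. The catch is that Terraform tries to apply changes in parallel as much as it can, because it doesn't know about dependencies between resources until you state them explicitly with depends_on. As a result, it tries to create Databricks resources before it knows the host value of the Databricks workspace (even if the workspace has already been created).

Unfortunately, it is not possible to put depends_on into a provider block. So the current recommendation for avoiding this kind of problem is to split the code into several modules:

  1. A module that creates the Databricks workspace and returns the host and token
  2. A module that creates Databricks objects using a provider initialized with the host/token received from the first module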
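
A minimal sketch of that hand-off, assuming the first module exposes the workspace URL and a token as outputs. The output names `databricks_host` and `databricks_token` are hypothetical, and the `token` block on `databricks_mws_workspaces` follows the provider's AWS workspace guide, so check it against the provider version you are using:

```hcl
# workspace/outputs.tf (hypothetical file in the first module)
#
# Assumption: databricks_mws_workspaces.this contains a token block, e.g.
#
#   token {
#     comment = "Terraform"
#   }
#
# which asks the provider to create a personal access token for the new workspace.

output "databricks_host" {
  value = databricks_mws_workspaces.this.workspace_url
}

output "databricks_token" {
  value     = databricks_mws_workspaces.this.token[0].token_value
  sensitive = true
}

# Top-level template: one possible way to initialize the workspace-level
# provider from those outputs instead of from input variables.
provider "databricks" {
  host  = module.workspace.databricks_host
  token = module.workspace.databricks_token
}
```

Because these values are only known once the workspace exists, a first run may need the workspace module applied on its own (for example with `terraform apply -target=module.workspace`) before the workspace-level resources can be planned.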

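Also, the Terraform documentation recommends not initializing providers inside modules - it's better to declare all providers, with aliases, in the top-level template and pass them to modules explicitly (see the example below). In that case, a module should contain only the declaration of its required providers, not their configuration.

For example, the top-level template might look like this: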

terraform {
  required_version = ">= 1.1.0"

  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.4.5"
    }
  }
}

# Default provider: workspace-level resources (host + token)
provider "databricks" {
  host  = var.databricks_host
  token = var.token
}

# Aliased provider: account-level (multi-workspace) resources
provider "databricks" {
  alias      = "mws"
  host       = "https://accounts.cloud.databricks.com"
  username   = var.databricks_account_username
  password   = var.databricks_account_password
  account_id = var.databricks_account_id
}

module "workspace" {
  source = "./workspace"
  providers = {
    # Must reference the alias declared above ("mws")
    databricks = databricks.mws
  }
}

module "databricks" {
  depends_on = [module.workspace]
  source     = "./databricks"
  # No providers block required as we're using the default provider
}

And the module itself looks like this:

terraform {
  required_version = ">= 1.1.0"

  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = ">= 0.4.4"
    }
  }
}

# A resource block needs both a type and a name
resource "databricks_cluster" "this" {
...
}
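
The workspace module follows the same pattern. Here is a rough sketch, assuming the caller passes `databricks.mws` as the module's default `databricks` provider (as in the `providers` map of the top-level template). The variable names `prefix` and `cross_account_role_arn` are hypothetical placeholders for values the module would receive from its caller:

```hcl
# workspace/main.tf (hypothetical layout)
terraform {
  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = ">= 0.4.4"
    }
  }
}

# Hypothetical inputs supplied by the top-level template
variable "databricks_account_id" {
  type = string
}

variable "prefix" {
  type = string
}

variable "cross_account_role_arn" {
  type = string
}

resource "databricks_mws_credentials" "this" {
  # No `provider = databricks.mws` here: inside this module the account-level
  # provider is simply "databricks", because the caller mapped it that way.
  account_id       = var.databricks_account_id
  role_arn         = var.cross_account_role_arn
  credentials_name = "${var.prefix}-creds"
}
```

With this layout the module contains no provider blocks at all, so the same module can be reused with differently configured providers.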
answered 2022-01-21T07:36:10.653