
In a Python 3 notebook on Azure Databricks, when I run this:

%scala
import com.microsoft.azure.sqldb.spark.config.Config
import com.microsoft.azure.sqldb.spark.connect._

val config = Config(Map(
  "url"            -> "serverName.database.windows.net:1433",
  "databasename"   -> "dbName",
  "user"           -> "user@domain.com",
  "password"       -> "password",
  "encrypt"        -> "true",
  "trustServerCertificate" -> "false",
  "hostNameInCertificate"  -> "*.database.windows.net",
  "loginTimeout"   -> "30",
  "authentication" -> "ActiveDirectoryPassword",
  "dbTable"        -> "dbo.TableName"
))

val collection = sqlContext.read.sqlDB(config)
collection.show()

I get the error:

java.lang.NoClassDefFoundError: com/microsoft/aad/adal4j/AuthenticationException

This database requires ActiveDirectoryPassword. I can connect to it from my own machine with pyodbc using the credentials above, but I cannot get any connection to work from Databricks. This is an Azure Databricks Standard account (not Premium). Any ideas?

Update: Thanks to Mark for the answer. It turns out that importing jars the default way in Azure Databricks puts them on the application classpath rather than the system classpath, which is what causes this error (per: https://forums.databricks.com/questions/706/how-can-i-attach-a-jar-library-to-the-cluster-that.html). To work around it, I used the code below (replace "clusterName" with the actual name of the cluster):

%scala
// This code block only needs to be run once to create the init script for the cluster (file remains on restart)

// Create dbfs:/databricks/init/ if it doesn't exist.
dbutils.fs.mkdirs("dbfs:/databricks/init/")

// Display the list of existing global init scripts.
display(dbutils.fs.ls("dbfs:/databricks/init/"))

// Create a directory named (clusterName) using Databricks File System - DBFS.
dbutils.fs.mkdirs("dbfs:/databricks/init/clusterName/")

// Create the adal4j script.
dbutils.fs.put("/databricks/init/clusterName/adal4j-install.sh","""
#!/bin/bash
wget --quiet -O /mnt/driver-daemon/jars/adal4j-1.6.0.jar http://central.maven.org/maven2/com/microsoft/azure/adal4j/1.6.0/adal4j-1.6.0.jar
wget --quiet -O /mnt/jars/driver-daemon/adal4j-1.6.0.jar http://central.maven.org/maven2/com/microsoft/azure/adal4j/1.6.0/adal4j-1.6.0.jar""", true)


// Check that the cluster-specific init script exists.
display(dbutils.fs.ls("dbfs:/databricks/init/clusterName/"))
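After restarting the cluster so the init script runs, one quick way to confirm the jar actually landed on the system classpath (a hypothetical sanity check, not part of the original answer) is to try loading an adal4j class from a %scala cell; if the download worked, `Class.forName` succeeds instead of throwing:

```scala
// Sketch: check whether adal4j is visible to the driver JVM after restart.
// Run this in a %scala cell; the class name comes from the original
// NoClassDefFoundError message.
val adal4jLoaded: Boolean =
  try {
    Class.forName("com.microsoft.aad.adal4j.AuthenticationException")
    true
  } catch {
    case _: ClassNotFoundException => false
  }

println(s"adal4j on classpath: $adal4jLoaded")
```

On a cluster where the init script ran successfully this should print `true`; if it still prints `false`, the jar was not picked up and the `sqlDB` read will keep failing with the same error.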
