You can add packages to Spark with the --packages command-line option.
Based on the comments and the question, you should first try running the code on a single line; that resolves the "error: illegal start of definition" problem:
val df = spark.read.format("com.databricks.spark.xml").option("rowTag", "book").load("book.xml")
The next error is "Failed to find data source: com.databricks.spark.xml."
Try adding the library dependency/package "com.databricks:spark-xml_2.11:0.4.1":
spark-shell --packages com.databricks:spark-xml_2.11:0.4.1
val df = spark.read.format("com.databricks.spark.xml").option("rowTag", "book").load("book.xml")
df.show
+-----+--------------------+--------------------+---------------+-----+------------+--------------------+
| _id| author| description| genre|price|publish_date| title|
+-----+--------------------+--------------------+---------------+-----+------------+--------------------+
|bk101|Gambardella, Matthew|An in-depth look ...| Computer|44.95| 2000-10-01|XML Developer's G...|
|bk102| Ralls, Kim|A former architec...| Fantasy| 5.95| 2000-12-16| Midnight Rain|
|bk103| Corets, Eva|After the collaps...| Fantasy| 5.95| 2000-11-17| Maeve Ascendant|
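If you are building a standalone application instead of using spark-shell, the same package can be declared as an sbt dependency. A minimal sketch of a build.sbt, assuming Scala 2.11 and a Spark 2.x version compatible with spark-xml 0.4.1 (the Spark version shown here is an assumption, not from the question):

```scala
// build.sbt — hypothetical project settings; adjust versions to your cluster
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Spark itself is "provided" because the cluster supplies it at runtime
  "org.apache.spark" %% "spark-sql" % "2.3.0" % "provided",
  // same artifact as the --packages coordinate above
  "com.databricks" %% "spark-xml" % "0.4.1"
)
```

With this in place, `spark-submit` resolves spark-xml from the assembled jar instead of downloading it via --packages.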