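I have a Spark project that uses json4s. It runs fine when submitted normally, but I get errors when I try to parse JSON from the spark shell. The simplest example from the json4s README (used the same way in the project) fails: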

spark2-shell [options] --jars my-assembled.jar

scala> import org.json4s._
scala> import org.json4s.native.JsonMethods._

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """)
<console>:30: error: overloaded method value parse with alternatives:
  (in: org.json4s.JsonInput,useBigDecimalForDouble: Boolean,useBigIntForLong: Boolean)org.json4s.JValue <and>
  (in: org.json4s.JsonInput,useBigDecimalForDouble: Boolean)org.json4s.JValue
 cannot be applied to (String)

Strangely, supplying explicit values for the default parameters works:

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """, false, true)
res2: org.json4s.JValue = JObject(List((numbers,JArray(List(JInt(1), JInt(2), JInt(3), JInt(4))))))

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """, true, true)
res3: org.json4s.JValue = JObject(List((numbers,JArray(List(JInt(1), JInt(2), JInt(3), JInt(4))))))

But this does not:

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """, false, false)
java.lang.NoSuchMethodError: org.json4s.package$.JLong()Lorg/json4s/JsonAST$JLong$;
  at org.json4s.native.JsonParser$$anonfun$1.apply(JsonParser.scala:194)
  at org.json4s.native.JsonParser$$anonfun$1.apply(JsonParser.scala:145)
  at org.json4s.native.JsonParser$.parse(JsonParser.scala:133)
  at org.json4s.native.JsonParser$.parse(JsonParser.scala:71)
  at org.json4s.native.JsonMethods$class.parse(JsonMethods.scala:10)
  at org.json4s.native.JsonMethods$.parse(JsonMethods.scala:63)
  ... 53 elided

I also checked it without Spark, using the Ammonite REPL:

@ import $ivy.`org.json4s:json4s-native_2.12:3.6.10` 
@ import org.json4s._ 
@ import org.json4s.native.JsonMethods._ 
@ parse(""" { "numbers" : [1, 2, 3, 4] } """) 
res3: JValue = JObject(List(("numbers", JArray(List(JInt(1), JInt(2), JInt(3), JInt(4))))))

Could this be a Scala version problem (I use Spark 2.3 on Scala 2.11.2, whereas the Ammonite example ran on 2.12.8)? I tried several json4s versions between 3.3.0 and 3.6.10.

1 Answer

This is caused by a binary incompatibility.

https://github.com/json4s/json4s/issues/316

Spark 2.3.0 depends on json4s-jackson_2.11-3.2.11, but you are trying to use an incompatible json4s-native version.

So remove json4s from --jars, import org.json4s.jackson.JsonMethods._ instead of org.json4s.native.JsonMethods._, and drop the third argument of parse (json4s 3.2.11 has no useBigIntForLong parameter).
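One way to see which overloads of a method are actually visible on the classpath is plain Java reflection. The sketch below is illustrative only: `overloadArities` is a hypothetical helper name, not part of json4s or Spark, and it is demonstrated on `java.lang.String` so that it runs without any extra jars.

```scala
// Hypothetical helper (not part of json4s or Spark): lists the distinct
// parameter counts of all public methods called `name` on `target`'s class.
def overloadArities(target: AnyRef, name: String): Seq[Int] =
  target.getClass.getMethods.toSeq
    .filter(_.getName == name)
    .map(_.getParameterTypes.length)
    .distinct
    .sorted

// Demonstration on a JDK class, so no json4s jar is required.
val indexOfArities = overloadArities("abc", "indexOf")
println(indexOfArities.mkString(", "))
```

In the spark-shell, the analogous call would be `overloadArities(org.json4s.jackson.JsonMethods, "parse")`. If the explanation above holds, a session that resolves json4s 3.2.11 should report no three-parameter variant.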

Then this session

~/spark-2.3.0-bin-hadoop2.7/bin$ ./spark-shell --jars json4s-native_2.11-3.6.10.jar,json4s-ast_2.11-3.6.10.jar,json4s-core_2.11-3.6.10.jar,json4s-scalap_2.11-3.6.10.jar,paranamer-2.8.jar
2020-11-30 05:44:37 WARN  Utils:66 - Your hostname, dmitin-HP-Pavilion-Laptop resolves to a loopback address: 127.0.1.1; using 192.168.0.103 instead (on interface wlo1)
2020-11-30 05:44:37 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2020-11-30 05:44:37 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://192.168.0.103:4040
Spark context available as 'sc' (master = local[*], app id = local-1606707882568).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0
      /_/
         
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit GraalVM EE 19.3.0, Java 1.8.0_231)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import org.json4s._
import org.json4s._

scala> import org.json4s.native.JsonMethods._
import org.json4s.native.JsonMethods._

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """)
<console>:30: error: overloaded method value parse with alternatives:
  (in: org.json4s.JsonInput,useBigDecimalForDouble: Boolean,useBigIntForLong: Boolean)org.json4s.JValue <and>
  (in: org.json4s.JsonInput,useBigDecimalForDouble: Boolean)org.json4s.JValue
 cannot be applied to (String)
parse(""" { "numbers" : [1, 2, 3, 4] } """)
^

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """, false, true)
res1: org.json4s.JValue = JObject(List((numbers,JArray(List(JInt(1), JInt(2), JInt(3), JInt(4))))))

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """, true, true)
res2: org.json4s.JValue = JObject(List((numbers,JArray(List(JInt(1), JInt(2), JInt(3), JInt(4))))))

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """, false, false)
java.lang.NoSuchMethodError: org.json4s.package$.JLong()Lorg/json4s/JsonAST$JLong$;
  at org.json4s.native.JsonParser$$anonfun$1.apply(JsonParser.scala:194)
  at org.json4s.native.JsonParser$$anonfun$1.apply(JsonParser.scala:145)
  at org.json4s.native.JsonParser$.parse(JsonParser.scala:133)
  at org.json4s.native.JsonParser$.parse(JsonParser.scala:71)
  at org.json4s.native.JsonMethods$class.parse(JsonMethods.scala:10)
  at org.json4s.native.JsonMethods$.parse(JsonMethods.scala:63)
  ... 53 elided

becomes

~/spark-2.3.0-bin-hadoop2.7/bin$ ./spark-shell 
2020-11-30 06:27:59 WARN  Utils:66 - Your hostname, dmitin-HP-Pavilion-Laptop resolves to a loopback address: 127.0.1.1; using 192.168.0.103 instead (on interface wlo1)
2020-11-30 06:27:59 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2020-11-30 06:27:59 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://192.168.0.103:4040
Spark context available as 'sc' (master = local[*], app id = local-1606710484369).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0
      /_/
         
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit GraalVM EE 19.3.0, Java 1.8.0_231)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import org.json4s._
import org.json4s._

scala> import org.json4s.jackson.JsonMethods._
import org.json4s.jackson.JsonMethods._

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """)
res0: org.json4s.JValue = JObject(List((numbers,JArray(List(JInt(1), JInt(2), JInt(3), JInt(4))))))

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """, true)
res1: org.json4s.JValue = JObject(List((numbers,JArray(List(JInt(1), JInt(2), JInt(3), JInt(4))))))

scala> parse(""" { "numbers" : [1, 2, 3, 4] } """, false)
res2: org.json4s.JValue = JObject(List((numbers,JArray(List(JInt(1), JInt(2), JInt(3), JInt(4))))))
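
To confirm which jar a class was really loaded from, one can ask the JVM for the class's code source. This is a minimal sketch using only standard JDK calls. `jarOf` is a hypothetical helper name introduced here, and the example uses classes that exist in any Scala session.

```scala
// Hypothetical helper: returns the location (usually a jar) a class was loaded
// from, or None when the JVM reports no code source for it.
def jarOf(cls: Class[_]): Option[String] =
  for {
    domain   <- Option(cls.getProtectionDomain)
    source   <- Option(domain.getCodeSource)
    location <- Option(source.getLocation)
  } yield location.toString

// A JDK bootstrap class typically has no code source.
println(jarOf(classOf[String]))
// A Scala library class usually resolves to the scala-library jar.
println(jarOf(classOf[Option[_]]))
```

In the shell, `jarOf(classOf[org.json4s.JsonAST.JValue])` should show whether the json4s classes come from the jars bundled with Spark or from a jar passed via --jars.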
answered 2020-11-30T03:10:54.133