I'm using SparkSQL to load a batch of JSON files, but some of them are malformed.
I'd like to skip the bad files and keep processing the rest — how can I do that?
I tried wrapping the load in a try-catch, but it still fails. Example:
try {
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  import sqlContext._
  val jsonFiles = sqlContext.jsonFile("/requests.loading")
} catch {
  case _: Throwable => // catching all exceptions and not doing anything with them
}
It fails with:
14/11/20 01:20:44 INFO scheduler.TaskSetManager: Starting task 3065.0 in stage 1.0 (TID 6150, HDdata2, NODE_LOCAL, 1246 bytes)
14/11/20 01:20:44 WARN scheduler.TaskSetManager: Lost task 3027.1 in stage 1.0 (TID 6130, HDdata2): com.fasterxml.jackson.core.JsonParseException: Unexpected end-of-input: was expecting closing quote for a string value
at [Source: java.io.StringReader@753ab9f1; line: 1, column: 1805]