I am a beginner with spark-nlp, and I am learning it by working through the examples from johnsnowlabs. I am using Scala on Databricks.
When I follow the example below,
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val regexTokenizer = new Tokenizer()
  .setInputCols(Array("sentence"))
  .setOutputCol("token")

val sentenceDetector = new SentenceDetector()
  .setInputCols(Array("document"))
  .setOutputCol("sentence")

val finisher = new Finisher()
  .setInputCols("token")
  .setIncludeMetadata(true)
finisher.withColumn("newCol", explode(arrays_zip($"finished_token", $"finished_ner")))
running the last line gives the following error:
command-786892578143744:2: error: value withColumn is not a member of com.johnsnowlabs.nlp.Finisher
finisher.withColumn("newCol", explode(arrays_zip($"finished_token", $"finished_ner")))
What could be the reason for this?
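From what I can tell from the Spark docs, `withColumn` is defined on Spark's `DataFrame`/`Dataset`, not on spark-nlp annotators such as `Finisher`, so it would have to be called on the DataFrame produced by `transform()`. Here is a minimal sketch of that pattern, using a hand-built stand-in DataFrame in place of the real pipeline output (the `finished_token`/`finished_ner` columns and their contents are assumptions for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{arrays_zip, col, explode}

// Local session for the sketch; on Databricks a `spark` session already exists.
val spark = SparkSession.builder().master("local[1]").appName("sketch").getOrCreate()
import spark.implicits._

// Stand-in for the pipeline output; the real finished_token / finished_ner
// columns would come from a Finisher (and an NER stage) via transform().
val result = Seq(
  (Array("hello", "world"), Array("O", "O"))
).toDF("finished_token", "finished_ner")

// withColumn is called on the DataFrame, not on the Finisher itself.
val zipped = result.withColumn(
  "newCol",
  explode(arrays_zip(col("finished_token"), col("finished_ner")))
)
zipped.show(false)
```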
When I omit that line and continue with the example, I add the following extra lines of code:
val pipeline = new Pipeline()
  .setStages(Array(
    documentAssembler,
    sentenceDetector,
    regexTokenizer,
    finisher
  ))
val data1 = Seq("hello, this is an example sentence").toDF("text")
pipeline.fit(data1).transform(data1).toDF("text")
Running the last line gives a different error:
java.lang.IllegalArgumentException: requirement failed: The number of columns doesn't match.
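From the error message, `toDF(colNames)` seems to require exactly one name per column of the DataFrame it is called on, while the output of `transform()` carries several columns, so a single name cannot match. A small sketch reproducing the same mismatch without spark-nlp (only spark-sql is assumed; the column names are made up for illustration):

```scala
import org.apache.spark.sql.SparkSession

// Local session for the sketch; on Databricks a `spark` session already exists.
val spark = SparkSession.builder().master("local[1]").appName("sketch").getOrCreate()
import spark.implicits._

// Two columns, just as the pipeline output would have several.
val df = Seq((1, "hello")).toDF("id", "text")

// Renaming with fewer names than columns fails the same way:
// "requirement failed: The number of columns doesn't match."
val failed =
  try { df.toDF("text"); false }
  catch { case _: IllegalArgumentException => true }

// One name per column works.
val renamed = df.toDF("doc_id", "doc_text")
```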
Can anyone help me figure out these errors?
Thanks.