Input file
(userid,movie,rating)
1,250,3.0
1,20,3.4
1,90,2
2,30,3.5
2,500,2.3
2,20,3.3
I should get the movie with the highest rating for each user. I'm completely lost; I have the program running on Hadoop, but I'm totally new to Scala. The file is comma separated.
This is how far I've got, but I can't parse the line correctly.
val inputfile = sc.textFile("/home/input/input.txt")
val keyval = inputfile.map(x => (x(0), x(1)))
  .reduceByKey { case (x, y) => (x._1 + y._1, math.max(x._2, y._2)) }
keyval.maxBy { case (key, value) => value }
keyval.saveAsTextFile("/home/out/word")
I'm getting these errors -
<console>:26: error: value _1 is not a member of Char
       keyval.reduceByKey{case (x, y) => (x._1+y._1, math.max(x._2,y._2))}
                                            ^
<console>:26: error: value _1 is not a member of Char
       keyval.reduceByKey{case (x, y) => (x._1+y._1,math.max(x._2,y._2))}
                                                ^
<console>:26: error: value _2 is not a member of Char
       keyval.reduceByKey{case (x, y) => (x._1+y._1,math.max(x._2,y._2))}
                                                              ^
<console>:26: error: value _2 is not a member of Char
       keyval.reduceByKey{case (x, y) => (x._1+y._1,math.max(x._2,y._2))}
                                                                   ^
<console>:26: error: value maxBy is not a member of org.apache.spark.rdd.RDD[(Char, Char)]
       keyval.maxBy { case (key, value) => value }
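The errors suggest the root cause: `x(0)` and `x(1)` index *characters* of each line (a `String`), so the RDD ends up as `RDD[(Char, Char)]`. The line needs to be split on commas first. Below is a minimal plain-Scala sketch of that parsing fix (the object and method names are illustrative, not from the original code); the same `map`-then-reduce-per-key pattern carries over to the RDD, where the grouping step would be `reduceByKey`.

```scala
// Sketch of the parsing fix using plain Scala collections.
// Assumption: each line is "userid,movie,rating" with no header row.
object TopRatedMovie {
  // Split each CSV line into fields (instead of indexing characters),
  // then keep the highest-rated (movie, rating) pair per user.
  def bestPerUser(lines: Seq[String]): Map[String, (String, Double)] =
    lines
      .map { line =>
        val Array(user, movie, rating) = line.split(",")
        (user, (movie, rating.toDouble))
      }
      .groupBy(_._1)
      .map { case (user, rows) => (user, rows.map(_._2).maxBy(_._2)) }

  def main(args: Array[String]): Unit = {
    val lines = Seq("1,250,3.0", "1,20,3.4", "1,90,2",
                    "2,30,3.5", "2,500,2.3", "2,20,3.3")
    bestPerUser(lines).toSeq.sortBy(_._1).foreach(println)
  }
}
```

On the RDD, the equivalent reduce step would keep the higher-rated pair per key, e.g. `.reduceByKey((a, b) => if (a._2 >= b._2) a else b)`, and `maxBy` is not an RDD method (that part of the error is unrelated to the `Char` problem).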