
I followed the Databricks training. It runs on Azure and is built with the following configuration:

build.sbt

import AssemblyKeys._

assemblySettings

name := "movielens-als"

version := "0.1"

scalaVersion := "2.11.4"

libraryDependencies += "org.apache.spark" % "spark-mllib_2.10" % "1.2.0" % "provided"  

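It works and produces recommendations. However:
1) The console complains that some code is deprecated (see the left-pointing arrows in the log below). I could not find any information about this.
2) It also warns me several times about a missing partitioner, which I assume is a parameter I have not set: 15/03/21 14:49:51 WARN recommendation.MatrixFactorizationModel: User factor does not have a partitioner. Prediction on individual records could be slow.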

Console

C:\apps\dist\spark-1.2.0\bin>spark-submit --class MovieLensALS C:\user/app/movie
lens-als-assembly-0.1.jar /MySpark/user/data/ C:\user/personal/personalRatings.t
xt
15/03/21 14:49:19 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/03/21 14:49:19 INFO Remoting: Starting remoting
15/03/21 14:49:19 INFO Remoting: Remoting started; listening on addresses :[akka
.tcp://sparkDriver@headnode0.sparkcluster.a8.internal.cloudapp.net:60778]
15/03/21 14:49:23 INFO mapred.FileInputFormat: Total input paths to process : 1
15/03/21 14:49:24 INFO Configuration.deprecation: mapred.tip.id is deprecated. <======================= I
nstead, use mapreduce.task.id
15/03/21 14:49:24 INFO Configuration.deprecation: mapred.task.id is deprecated. <=======================
Instead, use mapreduce.task.attempt.id
15/03/21 14:49:24 INFO Configuration.deprecation: mapred.task.is.map is deprecat 
ed.  <======================= Instead, use mapreduce.task.ismap
15/03/21 14:49:24 INFO Configuration.deprecation: mapred.task.partition is depre
cated. Instead, use mapreduce.task.partition
15/03/21 14:49:24 INFO Configuration.deprecation: mapred.job.id is deprecated. I
nstead, use mapreduce.job.id
[Stage 0:>                                                          (0 + 2) / 2]
[Stage 0:=============================>                             (1 + 1) / 2]
15/03/21 14:49:24 INFO mapred.FileInputFormat: Total input paths to process : 1
[Stage 1:>                                                          (0 + 2) / 2]
[Stage 2:>                                                          (0 + 2) / 2]
[Stage 2:=============================>                             (1 + 1) / 2]
[Stage 3:>                                                          (0 + 2) / 2]
[Stage 4:>                                                          (0 + 2) / 2]
Got 1000209 ratings from 6040 users on 3706 movies.
[Stage 6:===================>                                       (1 + 2) / 3]
[Stage 7:>                                                          (0 + 4) / 4]
[Stage 8:>                                                          (0 + 0) / 2]
[Stage 8:>                                                          (0 + 2) / 2]
[Stage 10:>                                                         (0 + 2) / 2]
Training: 602252, validation: 198919, test: 199049
[Stage 12:>                                                         (0 + 4) / 4]
[Stage 12:===========================================>              (3 + 1) / 4]
[Stage 34:>                                                         (0 + 4) / 4]
[Stage 13:>                                                         (0 + 4) / 4]
[Stage 16:>                                                         (0 + 4) / 4]
[Stage 17:>                                                         (0 + 4) / 4]
15/03/21 14:49:51 WARN recommendation.MatrixFactorizationModel: User factor does
 not have a partitioner. Prediction on individual records could be slow.
15/03/21 14:49:51 WARN recommendation.MatrixFactorizationModel: Product factor d
oes not have a partitioner. Prediction on individual records could be slow.
[Stage 140:>                                                        (0 + 0) / 4]
[Stage 167:>                                                        (0 + 4) / 4]
[Stage 167:============================>                            (2 + 2) / 4]
[Stage 165:>                                                        (0 + 4) / 4]
[Stage 166:>                                                        (0 + 4) / 4]
[Stage 166:==========================================>              (3 + 1) / 4]
[Stage 168:>                                                        (0 + 4) / 4]
RMSE (validation) = 0.8694473524689862 for the model trained with rank = 8, lamb
da = 0.1, and numIter = 10.