When I run the LensKit demo program, I get this error:

[main] ERROR org.grouplens.lenskit.data.dao.DelimitedTextRatingCursor - C:\Users\sean\Desktop\ml-100k\u - Copy.data:4: invalid input, skipping line

I cut the ML 100k dataset down so that it contains only the lines below, though I don't see how that would affect it:

196 242 3   881250949
186 302 3   891717742
22  377 1   878887116
244

Here is the code I'm using:

public class HelloLenskit implements Runnable {
public static void main(String[] args) {
    HelloLenskit hello = new HelloLenskit(args);
    try {
        hello.run();
    } catch (RuntimeException e) {
        System.err.println(e.getMessage());
        System.exit(1);
    }
}

private String delimiter = "\t";
private File inputFile = new File("C:\\Users\\sean\\Desktop\\ml-100k\\u - Copy.data");
private List<Long> users;

public HelloLenskit(String[] args) {
    int nextArg = 0;
    boolean done = false;
    while (!done && nextArg < args.length) {
        String arg = args[nextArg];
        if (arg.equals("-e")) {
            delimiter = args[nextArg + 1];
            nextArg += 2;
        } else if (arg.startsWith("-")) {
            throw new RuntimeException("unknown option: " + arg);
        } else {
            inputFile = new File(arg);
            nextArg += 1;
            done = true;
        }
    }
    users = new ArrayList<Long>(args.length - nextArg);
    for (; nextArg < args.length; nextArg++) {
        users.add(Long.parseLong(args[nextArg]));
    }
}

public void run() {
    // We first need to configure the data access.
    // We will use a simple delimited file; you can use something else like
    // a database (see JDBCRatingDAO).
    EventDAO base = new SimpleFileRatingDAO(inputFile, delimiter);
    // Reading directly from CSV files is slow, so we'll cache it in memory.
    // You can use SoftFactory here to allow ratings to be expunged and re-read
    // as memory limits demand. If you're using a database, just use it directly.
    EventDAO dao = new EventCollectionDAO(Cursors.makeList(base.streamEvents()));

    // Second step is to create the LensKit configuration...
    LenskitConfiguration config = new LenskitConfiguration();
    // ... configure the data source
    config.bind(EventDAO.class).to(dao);
    // ... and configure the item scorer.  The bind and set methods
    // are what you use to do that. Here, we want an item-item scorer.
    config.bind(ItemScorer.class)
          .to(ItemItemScorer.class);

    // let's use personalized mean rating as the baseline/fallback predictor.
    // 2-step process:
    // First, use the user mean rating as the baseline scorer
    config.bind(BaselineScorer.class, ItemScorer.class)
           .to(UserMeanItemScorer.class);
    // Second, use the item mean rating as the base for user means
    config.bind(UserMeanBaseline.class, ItemScorer.class)
          .to(ItemMeanRatingItemScorer.class);
    // and normalize ratings by baseline prior to computing similarities
    config.bind(UserVectorNormalizer.class)
          .to(BaselineSubtractingUserVectorNormalizer.class);

    // There are more parameters, roles, and components that can be set. See the
    // JavaDoc for each recommender algorithm for more information.

    // Now that we have a factory, build a recommender from the configuration
    // and data source. This will compute the similarity matrix and return a recommender
    // that uses it.
    Recommender rec = null;
    try {
        rec = LenskitRecommender.build(config);
    } catch (RecommenderBuildException e) {
        throw new RuntimeException("recommender build failed", e);
    }

    // we want to recommend items
    ItemRecommender irec = rec.getItemRecommender();
    assert irec != null; // not null because we configured one
    // for users
    for (long user: users) {
        // get 10 recommendations for the user
        List<ScoredId> recs = irec.recommend(user, 10);
        System.out.format("Recommendations for %d:\n", user);
        for (ScoredId item: recs) {
            System.out.format("\t%d\n", item.getId());
        }
    }
}
}

I'm really lost on this and would appreciate any help. Thank you for your time.

1 Answer

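The last line of your input file contains only one field (244), which is why the error points at line 4. Every line of the input file needs to contain 3 or 4 fields.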
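If it helps, here is a minimal sketch of a standalone check you could run over a trimmed file before handing it to LensKit. It only applies the 3-or-4-field rule described above to a list of lines; it is not LensKit's own parser, and the class name RatingFileCheck and the method findInvalidLines are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of LensKit: flags lines that do not
// split into 3 or 4 delimiter-separated fields.
public class RatingFileCheck {

    /** Returns the 1-based numbers of the lines that have fewer than 3 or more than 4 fields. */
    public static List<Integer> findInvalidLines(List<String> lines, String delimiter) {
        List<Integer> bad = new ArrayList<>();
        // Quote the delimiter so it is matched literally, not as a regex.
        String regex = Pattern.quote(delimiter);
        for (int i = 0; i < lines.size(); i++) {
            // A limit of -1 keeps trailing empty fields instead of dropping them.
            String[] fields = lines.get(i).split(regex, -1);
            if (fields.length < 3 || fields.length > 4) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // The four lines from the question, with tabs as the delimiter.
        List<String> sample = Arrays.asList(
                "196\t242\t3\t881250949",
                "186\t302\t3\t891717742",
                "22\t377\t1\t878887116",
                "244");
        System.out.println(findInvalidLines(sample, "\t")); // prints [4]
    }
}
```

To check a real file you could pass in the result of java.nio.file.Files.readAllLines for your path. Deleting the flagged line (or completing it with an item id and a rating) should make the error go away.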

Answered 2014-08-12T14:38:54.113