
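I store system logs in MongoDB, with an index built on the "time" field. The collection currently holds about 190k logs. When I try to fetch all of them with the DBCollection.find() method in Java, it takes almost 10 seconds to iterate over every document in the collection. I suspect I may be missing something that causes the poor performance?

This is the code I use: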

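// Imports needed by the snippet below (legacy com.mongodb driver API).
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.Mongo;

Mongo mongo = new Mongo();
DB db = mongo.getDB("Log");
DBCollection coll = db.getCollection("SystemLog");
int count = 0;

long findStart = System.currentTimeMillis();

// Sort specification: descending by "time" (newest first).
BasicDBObject sortSpec = new BasicDBObject("time", -1);

DBCursor cursor = coll.find().sort(sortSpec);
try {
    while (cursor.hasNext()) {
        DBObject obj = cursor.next();
        // Do something
        ++count;
    }
} finally {
    // Release the server-side cursor even if the loop throws.
    cursor.close();
}
long findEnd = System.currentTimeMillis();
System.out.println("Time for traversing all system logs (" + count + "):\t" + (findEnd-findStart) + "ms.");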

The printed result is:

Time for traversing all system log (194309):    10496ms.

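I have tried this several times. Running it once or many times seems to make no difference. I also tried removing sort() and simply finding all logs in MongoDB; traversing all the documents then takes about 6 seconds. That is still too slow for my requirements. Are there any implementation tricks that can speed up the traversal?

Thanks a lot.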


1 Answer


Do you really need to traverse all documents? In the code above it looks like you are just bringing each object into memory, one by one.

  1. The index on the 'time' field should be built as 'descending', since that is how you are sorting. (For a single-field index MongoDB can walk the index in either direction, so the direction matters mostly once the index is compound.)
  2. If the index is compound (it covers more fields than just 'time'), make sure you also add an index on 'time' alone. Also, when you add a filter to that query, make sure the 'time' field comes last in the index and is descending.
  3. The performance is not that bad, considering you are reading 190k objects one by one.
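
To put point 3 in numbers, here is a small back-of-the-envelope sketch in plain Java (no MongoDB driver involved). It only divides the figures reported in the question: 194309 documents in 10496 ms with the sort, and roughly 6 seconds without it. The class and method names are hypothetical, and the 6000 ms value is an approximation of "about 6 seconds", not a measured number.

```java
// Hypothetical helper: turns the timings from the question into per-document costs.
final class TraversalCost {

    /** Average wall-clock milliseconds spent per document. */
    public static double millisPerDocument(long elapsedMillis, long documentCount) {
        if (documentCount <= 0) {
            throw new IllegalArgumentException("documentCount must be positive");
        }
        return (double) elapsedMillis / documentCount;
    }

    /** Average number of documents traversed per second. */
    public static double documentsPerSecond(long elapsedMillis, long documentCount) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return documentCount * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        long documents = 194309L;      // count printed by the question's code
        long sortedMillis = 10496L;    // reported time with sort()
        long unsortedMillis = 6000L;   // assumption: "about 6 seconds" without sort()

        System.out.printf("sorted:   %.4f ms/doc, %.0f docs/s%n",
                millisPerDocument(sortedMillis, documents),
                documentsPerSecond(sortedMillis, documents));
        System.out.printf("unsorted: %.4f ms/doc, %.0f docs/s%n",
                millisPerDocument(unsortedMillis, documents),
                documentsPerSecond(unsortedMillis, documents));
    }
}
```

With the sort, that works out to roughly 0.05 ms per document, so most of the 10 seconds is the sheer number of round trips and object decodes rather than one slow step. Reading fewer documents is likely to help more than tuning the loop.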

(Please note that my experience with MongoDB does not involve working with the Java driver.)

answered 2012-11-06T10:32:15.833