I have MongoDB (version 2) in production in a replica set configuration (the next step is to add sharding).

I need to implement the following:

  • Once a day I'll receive a file with millions of rows, and I have to load it into mongo.
  • I have a runtime application that always reads from this collection: a very large number of reads, and their performance is very important. The collection is indexed and all reads perform a readByIndex operation.

My current implementation of loading is:

  1. drop collection
  2. create collection
  3. insert the new documents into the collection

One of the things I see is that, because of the MongoDB lock, my overall performance gets worse during loading. I've tested the collection with up to 10 million entries. For sizes larger than that, I think I should start using sharding.

What is the best way to solve such an issue? Or should I maybe use a different strategy?

1 Answer

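You can use two collections :)

  • collectionA contains today's data
  • New data arrives
  • Create a new collection (collectionB) and insert the data into it
  • Now use collectionB as your data

Then, the next day, repeat the above, just swapping A and B :)

This lets collectionA keep serving requests while collectionB is being updated.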
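One possible way to sketch the A/B switch in Python is below. It is only an illustration under several assumptions: `db` is a pymongo-style database object, the application learns which collection is live from a small pointer document, and the names `active_pointer`, `daily_data`, `active_name` and `load_and_switch` are hypothetical and not part of the original setup.

```python
from itertools import islice

POINTER_COLL = "active_pointer"   # hypothetical metadata collection
POINTER_ID = "daily_data"         # hypothetical _id of the pointer document
NAMES = ("collectionA", "collectionB")


def active_name(db):
    """Return the name of the collection readers should use right now."""
    doc = db[POINTER_COLL].find_one({"_id": POINTER_ID})
    # Assumption: before the first switch, collectionA is the live one.
    return doc["name"] if doc else NAMES[0]


def batches(iterable, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_and_switch(db, documents, index_field, batch_size=1000):
    """Load `documents` into the idle collection, then point readers at it."""
    live = active_name(db)
    idle = NAMES[1] if live == NAMES[0] else NAMES[0]

    db[idle].drop()                       # discard the stale data set
    for chunk in batches(documents, batch_size):
        db[idle].insert_many(chunk, ordered=False)
    db[idle].create_index(index_field)    # build the index before going live

    # Single-document write: readers see either the old name or the new one.
    db[POINTER_COLL].replace_one(
        {"_id": POINTER_ID}, {"_id": POINTER_ID, "name": idle}, upsert=True
    )
    return idle
```

A reader would then query `db[active_name(db)]` instead of a hard-coded collection name, and the daily job would call `load_and_switch(db, rows, "some_indexed_field")`. The bulk insert still competes with reads for server resources, so this mainly avoids the window in which the live collection is empty or half loaded.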

PS: I just noticed I'm answering this question about a year late :)

Answered 2013-04-03T11:58:50.100