I have MongoDB (version 2) in production in a replica set configuration (the next step is to add sharding).
I need to implement the following:
- Once a day I'll receive a file with millions of rows, which I must load into MongoDB.
- I have a runtime application that constantly reads from this collection. The volume of reads is very large, and their performance is very important. The collection is indexed and all reads perform a readByIndex operation.
My current implementation of loading is:
- drop collection
- create collection
- insert the new documents into the collection
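
For concreteness, the three steps above look roughly like the sketch below. It is only an illustration, not my exact code: it assumes a PyMongo 3+ style `db` object, and the names `reload_collection`, `chunked`, `index_field` and `batch_size` are made up for this example. Recreating the index before inserting is also an assumption; dropping the collection removes its indexes, so they have to be rebuilt at some point.

```python
from itertools import islice


def chunked(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def reload_collection(db, name, rows, index_field, batch_size=1000):
    """Drop, recreate and refill a collection; return the number of rows loaded.

    `db` is assumed to behave like a pymongo Database (hypothetical setup).
    """
    db.drop_collection(name)        # step 1: drop collection
    db.create_collection(name)      # step 2: create collection
    coll = db[name]
    coll.create_index(index_field)  # assumption: rebuild the index used by readers
    total = 0
    for batch in chunked(rows, batch_size):
        coll.insert_many(batch)     # step 3: insert the new documents in batches
        total += len(batch)
    return total
```

With a real connection this would be called as something like `reload_collection(MongoClient()["mydb"], "items", rows, "key")`, where the database, collection and field names are placeholders.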
One of the things I see is that, because of MongoDB locking, my overall performance gets worse during the load. I've tested the collection with up to 10 million entries. For more than that, I think I should start using sharding.
What is the best way to solve this issue? Or should I use a different strategy altogether?