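I have to process 1M entities to build facts. There should be about the same number of resulting facts (1 million).
The first issue I had was that bulk inserts were slow with Entity Framework. So I used the pattern from "Fastest Way of Inserting in Entity Framework" (Slauma's answer). I can now insert entities quickly, about 100K in one minute.
Another issue I ran into was the lack of memory to process everything. So I "paged" the processing, to avoid the out-of-memory exception I would get if I built a single list out of my 1 million resulting facts.
The issue I have now is that memory keeps growing even with the paging, and I don't understand why. No memory is released after each batch. I find this odd, because on each iteration of the loop I fetch recons, build facts, and store them in the database. As soon as an iteration completes, those objects should be released from memory. But it looks like they are not, because memory usage does not drop after any iteration.
Can you tell me if you see something wrong before I dig further? More specifically, why is no memory released after an iteration of the while loop?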
using System;
using System.Collections.Generic;
using System.Data.Entity;       // lambda overload of Include
using System.Linq;
using System.Threading;         // Interlocked
using System.Threading.Tasks;   // Parallel
using System.Transactions;      // TransactionScope

static void Main(string[] args)
{
    ReceiptsItemCodeAnalysisContext db = new ReceiptsItemCodeAnalysisContext();
    var recon = db.Recons
        .Where(r => r.Transacs.Any(t => t.ItemCodeDetails.Any()))
        .OrderBy(r => r.ReconNum);

    // used for "paging" the processing
    var processed = 0;
    var total = recon.Count();
    var batchSize = 1000; //100000;
    var batch = 1;
    var skip = 0;
    var doBatch = true;

    while (doBatch)
    {
        // list to store facts processed during the batch
        List<ReconFact> facts = new List<ReconFact>();

        // get the Recon items to process in this batch and put them in a list
        List<Recon> toProcess = recon.Skip(skip).Take(batchSize)
            .Include(r => r.Transacs.Select(t => t.ItemCodeDetails))
            .ToList();

        // to process real fast
        Parallel.ForEach(toProcess, r =>
        {
            // processing a recon and adding the facts to the list
            var thisReconFacts = ReconFactGenerator.Generate(r);
            // List<T> is not thread-safe, so guard the shared list
            lock (facts)
            {
                facts.AddRange(thisReconFacts);
            }
            // atomic increment; a plain += would race across threads
            Console.WriteLine(Interlocked.Increment(ref processed));
        });

        // saving the facts using pattern provided by Slauma
        using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required, new System.TimeSpan(0, 15, 0)))
        {
            ReceiptsItemCodeAnalysisContext context = null;
            try
            {
                context = new ReceiptsItemCodeAnalysisContext();
                context.Configuration.AutoDetectChangesEnabled = false;
                int count = 0;
                foreach (var fact in facts.Where(f => f != null))
                {
                    count++;
                    Console.WriteLine(count);
                    context = ContextHelper.AddToContext(context, fact, count, 250, true);
                }
                context.SaveChanges();
            }
            finally
            {
                if (context != null)
                    context.Dispose();
            }
            scope.Complete();
        }
        Console.WriteLine("batch {0} finished continuing", batch);

        // continuing the batch
        batch++;
        skip = batchSize * (batch - 1);
        doBatch = skip < total;
        // AFTER THIS facts AND toProcess SHOULD BE RESET
        // BUT IT LOOKS LIKE THEY ARE NOT OR AT LEAST SOMETHING
        // IS GROWING IN MEMORY
    }
    Console.WriteLine("Processing is done, {0} recons processed", processed);
}
The method provided by Slauma to optimize bulk inserts with Entity Framework:
class ContextHelper
{
    // Adds one entity and, every commitCount entities, saves the pending changes.
    // Returns the context to keep using, which may be a fresh instance.
    public static ReceiptsItemCodeAnalysisContext AddToContext(ReceiptsItemCodeAnalysisContext context,
        ReconFact entity, int count, int commitCount, bool recreateContext)
    {
        context.Set<ReconFact>().Add(entity);
        if (count % commitCount == 0)
        {
            context.SaveChanges();
            if (recreateContext)
            {
                // drop the old context so its tracked entities can be collected
                context.Dispose();
                context = new ReceiptsItemCodeAnalysisContext();
                context.Configuration.AutoDetectChangesEnabled = false;
            }
        }
        return context;
    }
}
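To tell memory that is really retained between batches apart from memory that simply has not been collected yet, the managed heap size could be logged after a forced collection at the end of each iteration. Below is a minimal sketch of that measurement. `MemoryProbe` is a hypothetical helper that is not part of my program, and the loop body is only a stand-in for one batch of work.

```csharp
using System;

// Hypothetical helper, not part of the original program.
static class MemoryProbe
{
    // Forces a full collection and returns the managed heap size in megabytes.
    public static double ManagedMegabytes()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return GC.GetTotalMemory(forceFullCollection: true) / (1024.0 * 1024.0);
    }
}

static class Program
{
    static void Main()
    {
        for (int batch = 1; batch <= 3; batch++)
        {
            // Stand-in for one batch: allocate, use, then drop the reference.
            var buffer = new byte[10 * 1024 * 1024];
            buffer[0] = 1;
            buffer = null;

            Console.WriteLine("batch {0}: {1:F1} MB managed", batch, MemoryProbe.ManagedMegabytes());
        }
    }
}
```

If the reported number keeps climbing from batch to batch in the real loop, something is still holding references to the objects from earlier batches. If it stays flat, the growth seen in Task Manager is more likely memory that the runtime has not yet returned to the operating system.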