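I need to match email sends to email bounces so I can find out whether they were delivered. The catch is that I have to limit the bounce to within 4 days of the send, to avoid matching a bounce to the wrong send. The send records are spread over a period of 30 days.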

    LinkedList<event_data> sent = GetMyHugeListOfSends(); // for example 1M+ records
    List<event_data> bounced = GetMyListOfBounces(); // for example 150k records
    // Oldest bounces first, so Find returns the earliest (most accurate) match for a send
    bounced = bounced.OrderBy(o => o.event_date).ToList();
    List<event_data> delivered = new List<event_data>();
    event_data deliveredEmail = new event_data();
    foreach (event_data sentEmail in sent)
    {
        // Linear scan of the bounce list for every send
        event_data bounce = bounced.Find(item =>
            item.email.ToLower() == sentEmail.email.ToLower()
            && item.event_date > sentEmail.event_date
            && item.event_date < sentEmail.event_date.AddDays(deliveredCalcDelayDays));
        // create delivered records
        if (bounce != null)
        {
            // there was a bounce! don't add a delivered record!
            // remove bounce, it only applies to one send!
            bounced.Remove(bounce);
        }
        else
        {
            // if sent is not bounced, it's delivered
            deliveredEmail.sid = siteid;
            deliveredEmail.mlid = mlid;
            deliveredEmail.mid = mid;
            deliveredEmail.email = sentEmail.email;
            deliveredEmail.event_date = sentEmail.event_date;
            deliveredEmail.event_status = "Delivered";
            deliveredEmail.event_type = "Delivered";
            deliveredEmail.id = sentEmail.id;
            deliveredEmail.number = sentEmail.number;
            deliveredEmail.laststoretransaction = sentEmail.laststoretransaction;
            delivered.Add(deliveredEmail); // add the new delivered
            deliveredEmail = new event_data();
        }
        // no early exit when bounced is empty: the remaining sends still need delivered records
    }

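So I did some benchmarking, and it processes about 12 sent records per second. With 1M+ records, it will take 25+ hours to process!
Two questions:
- How can I find the exact line that is taking the most time?
- I am assuming it is the lambda expression that finds the bounce that takes the longest, since this was much faster before I put that in there. How can I speed this up?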
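For the first question, one low-tech way I could check my assumption is to wrap the suspect call in a `System.Diagnostics.Stopwatch` and compare its accumulated time with the total loop time. This is only a sketch: `TimedFind` is a hypothetical helper, and the list of integers is made-up stand-in data, not my real `event_data` records.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class FindTiming
{
    // Hypothetical helper: runs List<T>.Find and adds the elapsed time to a shared stopwatch.
    public static T TimedFind<T>(List<T> list, Predicate<T> match, Stopwatch watch)
    {
        watch.Start();
        try
        {
            return list.Find(match);
        }
        finally
        {
            watch.Stop(); // elapsed time accumulates across Start/Stop pairs
        }
    }

    public static void Main()
    {
        // Stand-in data, roughly the size of the bounce list.
        var numbers = new List<int>();
        for (int i = 0; i < 150000; i++) numbers.Add(i);

        var findWatch = new Stopwatch();
        var totalWatch = Stopwatch.StartNew();
        for (int i = 0; i < 200; i++)
        {
            // Worst case: nothing matches, so Find scans the whole list.
            TimedFind(numbers, n => n == -1, findWatch);
        }
        totalWatch.Stop();

        Console.WriteLine("Find: {0} ms of {1} ms total",
            findWatch.ElapsedMilliseconds, totalWatch.ElapsedMilliseconds);
    }
}
```

If nearly all of the total time lands in the `Find` stopwatch, that would support the guess that the per-send linear scan is the bottleneck. A real profiler would give a per-line breakdown, but this needs no extra tooling.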
Thanks!
Edit
---Ideas---
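- One idea I just had is to sort the sends by date, like I did the bounces, so that the search through the bounces is more efficient, since an early send is likely to hit an early bounce too.
- Another idea I just had is to run a couple of these processes in parallel, although I would hate to multi-thread this simple application.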
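Building on the sorting idea, here is a rough sketch of what I think it could look like if the bounces were also indexed by email, so each send only looks at the bounces for its own address instead of scanning the whole list. The assumptions are mine: the `event_data` class below is a minimal stand-in with only the two members the match uses, `BounceMatcher` and `FindDelivered` are hypothetical names, emails are non-null, and the comparison uses `StringComparer.OrdinalIgnoreCase` rather than `ToLower()`, which may differ for some non-ASCII addresses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the real event_data class (assumption: only these members matter here).
public class event_data
{
    public string email;
    public DateTime event_date;
}

public static class BounceMatcher
{
    // Returns the sends that have no bounce within windowDays after them.
    public static List<event_data> FindDelivered(
        IEnumerable<event_data> sent, IEnumerable<event_data> bounced, double windowDays)
    {
        // Index bounces by email (case-insensitive), oldest first within each address.
        Dictionary<string, Queue<event_data>> bouncesByEmail = bounced
            .GroupBy(b => b.email, StringComparer.OrdinalIgnoreCase)
            .ToDictionary(
                g => g.Key,
                g => new Queue<event_data>(g.OrderBy(b => b.event_date)),
                StringComparer.OrdinalIgnoreCase);

        var delivered = new List<event_data>();
        foreach (event_data send in sent.OrderBy(s => s.event_date))
        {
            bool isBounced = false;
            Queue<event_data> queue;
            if (bouncesByEmail.TryGetValue(send.email, out queue))
            {
                // Bounces at or before this send cannot belong to it or to any later send.
                while (queue.Count > 0 && queue.Peek().event_date <= send.event_date)
                    queue.Dequeue();

                // The earliest remaining bounce matches only if it falls inside the window.
                if (queue.Count > 0
                    && queue.Peek().event_date < send.event_date.AddDays(windowDays))
                {
                    queue.Dequeue(); // a bounce applies to one send only
                    isBounced = true;
                }
            }
            if (!isBounced) delivered.Add(send);
        }
        return delivered;
    }

    public static void Main()
    {
        var day = new DateTime(2024, 1, 1);
        var sent = new List<event_data>
        {
            new event_data { email = "a@example.com", event_date = day },
            new event_data { email = "b@example.com", event_date = day },
            new event_data { email = "A@example.com", event_date = day.AddDays(10) }
        };
        var bounced = new List<event_data>
        {
            new event_data { email = "A@Example.com", event_date = day.AddDays(1) }
        };

        foreach (event_data d in FindDelivered(sent, bounced, 4))
            Console.WriteLine("{0} {1:yyyy-MM-dd}", d.email, d.event_date);
    }
}
```

In the sample data, the first send to a@example.com is consumed by the bounce one day later, so only the send to b@example.com and the later send to A@example.com come back as delivered. Because sends are processed in date order, each bounce is looked at a bounded number of times, so the work should be dominated by the sorting and grouping rather than by a scan per send. If that holds on the real data, the multi-threading idea may not be needed at all.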