
I have 5 entities:

public class Album
{
    public int Id { get; set; }

    public string Title { get; set; }

    public virtual List<AlbumArtist> AlbumArtists { get; set; }
    public virtual List<Artist> Artists { get; set; }
    public virtual List<Genre> Genres { get; set; }
    public virtual List<Song> Songs { get; set; }

}

public class AlbumArtist
{
    public int Id { get; set; }

    public string Title { get; set; }

    public virtual List<Album> Albums { get; set; }
    public virtual List<Artist> Artists { get; set; }
    public virtual List<Genre> Genres { get; set; }
    public virtual List<Song> Songs { get; set; }
}

public class Artist
{
    public int Id { get; set; }

    public string Title { get; set; }

    public virtual List<AlbumArtist> AlbumArtists { get; set; }
    public virtual List<Album> Albums { get; set; }
    public virtual List<Genre> Genres { get; set; }
    public virtual List<Song> Songs { get; set; }
}

public class Genre
{
    public int Id { get; set; }

    public string Title { get; set; }

    public virtual List<AlbumArtist> AlbumArtists { get; set; }
    public virtual List<Album> Albums { get; set; }
    public virtual List<Artist> Artists { get; set; }
    public virtual List<Song> Songs { get; set; }
}

public class Song
{
    public int Id { get; set; }

    public string Title { get; set; }

    public virtual List<AlbumArtist> AlbumArtists { get; set; }
    public virtual List<Album> Albums { get; set; }
    public virtual List<Artist> Artists { get; set; }
    public virtual List<Genre> Genres { get; set; }
}

As you can see, there are a lot of many-to-many relationships. I populate my entities and then try to save them to the DbContext like this:

_albumArtists.ForEach(delegate(AlbumArtist albumArtist)
{
    if (albumArtist.Id == 0)
    {
        _dbContext.Entry(albumArtist).State = EntityState.Added;
        _dbContext.SaveChanges();
    }
    else
    {
        _dbContext.Entry(albumArtist).State = EntityState.Modified;
        _dbContext.SaveChanges();
    }
});
...

or like this:

_albumArtists.ForEach(delegate(AlbumArtist albumArtist)
{
    if (albumArtist.Id == 0)
    {
        _dbContext.Entry(albumArtist).State = EntityState.Added;
    }
    else
    {
        _dbContext.Entry(albumArtist).State = EntityState.Modified;
    }
});
_dbContext.SaveChanges();
...

It takes forever to save my entities to DbContext. I even tried to do the following:

Configuration.AutoDetectChangesEnabled = false;

But it didn't help. By the way, there are about 17,000 Songs and 1,700 Albums.

What is wrong???

Please help!

PS

Here is my full code: https://github.com/vjacheslavravdin/PsyTrance/blob/master/PsyTrance/Program.cs Maybe you can suggest how to simplify it.

Thanks!


1 Answer


First, a couple of clarifications:

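EF is not significantly slower than other methods for batch-based operations. In my tests you can get perhaps a 50% improvement with raw SQL commands, and maybe up to 10 times faster with SQL bulk copy, but as a general rule EF is not much slower than comparable methods (though it is often perceived as very slow). For most applications EF will give suitable performance numbers even in batch scenarios, given the correct tuning. (See my articles: http://blog.staticvoid.co.nz/2012/3/24/entity_framework_comparative_performance and http://blog.staticvoid.co.nz/2012/8/17/mssql_and_large_insert_statements)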

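Because of the way EF does change tracking, it has the potential to far exceed the performance of how most people would write SqlCommand-based insert statements (there are a whole lot of caveats to do with query planning, round trips and transactions that make it pretty hard to write optimally performing bulk insert statements). I have proposed these additions to EF here (http://entityframework.codeplex.com/discussions/377636), but they haven't been implemented yet.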

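You are exactly right with your decision to turn off auto detect changes. With detect changes on, each .Add or .Attach call enumerates the tracking graph, so if you make 17k additions on the same context you enumerate the graph 17,000 times, visiting a total of 17000 + 16999 + ... + 2 + 1 = 144,508,500 entities (roughly 144.5 million). No wonder it's taking so long, right? (See my article: http://blog.staticvoid.co.nz/2012/5/7/entityframework_performance_and_autodetectchanges)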
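As a quick check of that arithmetic, here is a minimal sketch. The class and method names are hypothetical, and the model is deliberately simplified: it assumes the k-th Add scans exactly k tracked entities and ignores related entities pulled in through navigation properties.

```csharp
using System;

public static class DetectChangesCost
{
    // Closed form of 1 + 2 + ... + n: total entities visited when
    // each of n Add calls scans everything tracked so far.
    public static long TotalEntityVisits(long n)
    {
        return n * (n + 1) / 2;
    }

    // Same total computed step by step, mirroring one scan per Add call.
    public static long TotalEntityVisitsByLoop(int n)
    {
        long visits = 0;
        for (int tracked = 1; tracked <= n; tracked++)
        {
            visits += tracked; // the Add that brings the context to `tracked` entities scans all of them
        }
        return visits;
    }
}
```

For n = 17000 both methods give 144,508,500, which is where the "about 144.5 million" figure comes from.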

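SaveChanges always needs to enumerate the tracking graph (it calls detect changes internally), so your first way is going to be slow: it will actually make the same number of tracking calls as above.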

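The second way is much better, but it still has a pretty major flaw, which I imagine is twofold. Firstly, the graph is really big when you go to save changes (bigger graphs have exponentially higher tracking times). Secondly, it is going to take up a lot of memory to persist the whole graph at once, especially given that EF stores two copies of each of your entities.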

A better way is to persist your graph in chunks. Something like:

// With auto detect changes off.
foreach (var batch in batches) // keep batch size below 1000 items, play around with the numbers a little
{
    using (var ctx = new MyContext()) // make sure you create a new context per batch.
    {
        ctx.Configuration.AutoDetectChangesEnabled = false; // avoid a graph scan on every Add
        foreach (var entity in batch)
        {
            ctx.Entities.Add(entity);
        }
        ctx.SaveChanges(); // one save per batch
    }
}
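The loop above takes `batches` as given. One way to produce it is a small splitting helper; the sketch below is an assumption for illustration (the `InBatches` name and `BatchExtensions` class are hypothetical and not part of EF).

```csharp
using System;
using System.Collections.Generic;

public static class BatchExtensions
{
    // Hypothetical helper: lazily splits a sequence into lists of at most `size` items.
    // Because this is an iterator, the argument checks run when enumeration starts.
    public static IEnumerable<List<T>> InBatches<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (size <= 0) throw new ArgumentOutOfRangeException("size");

        var batch = new List<T>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;          // hand out a full batch
                batch = new List<T>(size);   // start collecting the next one
            }
        }

        if (batch.Count > 0)
        {
            yield return batch;              // final, partially filled batch
        }
    }
}
```

With this in place, `foreach (var batch in _albumArtists.InBatches(500))` would feed the loop. Keep in mind that the entities in the question reference each other, so adding one entity may pull related objects into the same context; this sketch only covers the splitting.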

I would expect you to be able to do all 17k rows in around 17-30 seconds.

By doing this with raw SQL commands you may be able to get this down to around 12-20 seconds;

With a reimplementation using bulk copy you could probably get this down to 2-5 seconds.
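For reference, a bulk copy version might look roughly like the sketch below. It uses `SqlBulkCopy` from `System.Data.SqlClient`; the `dbo.Songs` table name, the single `Title` column, and the `SongBulkInsert` class are assumptions for illustration, and the many-to-many join tables from the question are not handled here.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class SongBulkInsert
{
    // Builds an in-memory table with one Title column (assumed schema).
    public static DataTable ToDataTable(IEnumerable<string> titles)
    {
        var table = new DataTable();
        table.Columns.Add("Title", typeof(string));
        foreach (var title in titles)
        {
            table.Rows.Add(title);
        }
        return table;
    }

    // Sends all rows to SQL Server through the bulk copy API.
    public static void Write(string connectionString, DataTable table)
    {
        using (var bulkCopy = new SqlBulkCopy(connectionString))
        {
            bulkCopy.DestinationTableName = "dbo.Songs";   // assumed table name
            bulkCopy.BatchSize = 1000;                      // rows sent per batch
            bulkCopy.ColumnMappings.Add("Title", "Title");  // source column -> destination column
            bulkCopy.WriteToServer(table);
        }
    }
}
```

Since bulk copy bypasses EF entirely, generated keys and relationship rows would have to be written separately, which is part of why this counts as a reimplementation rather than a tweak.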

answered 2013-09-06T00:08:11.243