c# - Optimizing the query and/or data model for a calendar application

Question

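Our calendar application represents the appointment domain as:

Appointment

ID (PK)
StartDateTime
EndDateTime
...

AppointmentRole

AppointmentID (FK)
PersonOrGroupID (FK) /* joins to a person/group, outside the scope of this question */
Role
...

Appointment has a one-to-many relationship with AppointmentRoles. Each AppointmentRole represents a person or group in a particular role (e.g., drop-off, pick-up, attend, ...).

The relationship serves two purposes:

It defines an access control list: an authenticated principal can view an appointment only if their access control list matches an associated person or group.
It documents who is attending the appointment and in what roles.

There is also a third table that keeps track of notes/comments associated with the appointment. It is on the many side of a one-to-many relationship with Appointment:

AppointmentNote

AppointmentID (FK)
...

To display a calendar of appointments, we currently use something like...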

List<IAppointment> GetAppointments(IAccess acl, DateTime start, DateTime end, ...
{
  // Retrieve distinct appointments that are visible to the acl

  var visible = (from appt in dc.Appointments
                 where !(appt.StartDateTime >= end || appt.EndDateTime <= start)
                 join role in
                   (from r in dc.Roles
                    where acl.ToIds().Contains(r.PersonOrGroupID)
                    select new { r.AppointmentID })
                 on appt.ID equals role.AppointmentID
                 select new
                 {
                   ...
                 }).Distinct();

  ...

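The visible Linq expression selects the distinct appointments that can be seen by the given access control list.

Below, we take visible and join/into roles and notes to pick up all the people and groups involved with an appointment, along with the appointment's notes.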

  ...

  // Join/into to get all appointment roles and notes

  var q = from appt in visible
          orderby appt.StartDateTime, ...
          join r in dc.Roles
          on appt.ID equals r.AppointmentID
          into roles
          join note in dc.AppointmentNotes
          on appt.ID equals note.AppointmentID
          into notes
          select new { Appointment = appt, Roles = roles, Notes = notes };

Finally, we enumerate the query, hoping that Linq-To-Sql will generate one fantastically optimized query (no such luck, as discussed below)...

  // Marshal the anonymous type into an IAppointment
  // IAppointment has a Roles and Notes collection

  var result = new List<IAppointment>();
  foreach (var record in q)
  {
    IAppointment a = new Appointment();
    a.StartDateTime = record.Appointment.StartDateTime;
    ...
    a.Roles = Marshal(record.Roles);
    a.Notes = Marshal(record.Notes);

    result.Add(a);
  }

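The query produced by Linq-to-Sql is very chatty. It generates a single query to determine the visible appointments. But then it generates three queries on each iteration: one to pick up the appointment fields, a second to pick up the roles, and a third to pick up the notes. The where clause is always the visible appointment ID.

So, we are refactoring GetAppointments and thought we could benefit from the SO community's expertise.

We expect to move everything into a T-SQL stored procedure so that we have more control. Can you share your thoughts on how you would tackle this problem? Changes to the data model, T-SQL, and Linq-to-SQL modifications are all fair game. We would also like advice on indexes. We are using MS-SqlServer 2008 and .NET 4.0.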

score 3 · Accepted Answer

I'd say the root of all evil starts here:

where acl.ToIds().Contains(r.PersonOrGroupID)

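This acl.ToIds().Contains(...) is an expression that cannot be resolved on the server side, so the visible query has to be resolved on the client side (very inefficiently). Worse, the result has to be retained on the client, and then, as it is iterated, three distinct queries have to be sent to the server for each visible appointment (appointment fields, roles and notes). If I had my way, I would create a stored procedure that accepts the ACL list as a table valued parameter and does all the joining/filtering on the server side.

I would start with this schema: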

create table Appointments (
    AppointmentID int not null identity(1,1),
    Start DateTime not null,
    [End] DateTime not null,
    Location varchar(100),
    constraint PKAppointments
        primary key nonclustered (AppointmentID));

create table AppointmentRoles (
    AppointmentID int not null,
    PersonOrGroupID int not null,
    Role int not null,
    constraint PKAppointmentRoles
        primary key (PersonOrGroupID, AppointmentID), 
    constraint FKAppointmentRolesAppointmentID
        foreign key (AppointmentID)
        references Appointments(AppointmentID));

create table AppointmentNotes (
    AppointmentID int not null,
    NoteId int not null,
    Note varchar(max),

    constraint PKAppointmentNotes
        primary key (AppointmentID, NoteId),
    constraint FKAppointmentNotesAppointmentID
        foreign key (AppointmentID)
        references Appointments(AppointmentID));
go

create clustered index cdxAppointmentStart on Appointments (Start, [End]);
go

And retrieve the appointments for an arbitrary ACL like this:

create type AccessControlList as table 
    (PersonOrGroupID int not null);
go

create procedure usp_getAppointmentsForACL
 @acl AccessControlList readonly,
 @start datetime,
 @end datetime
as
begin
    set nocount on;
    select a.AppointmentID
        , a.Location
        , r.Role
        , n.NoteID
        , n.Note
    from @acl l 
    join AppointmentRoles r on l.PersonOrGroupID = r.PersonOrGroupID
    join Appointments a on r.AppointmentID = a.AppointmentID
    join AppointmentNotes n on n.AppointmentID = a.AppointmentID
    where a.Start >= @start
    and a.[End] <= @end;    
end
go

Let's try this on 1M appointments. First, populate the tables (takes about 4-5 minutes):

set nocount on;
declare @i int = 0;
begin transaction;
while @i < 1000000
begin
    declare @start datetime, @end datetime;
    set @start = dateadd(hour, rand()*10000-5000, getdate());
    set @end = dateadd(hour, rand()*100, @start)
    insert into Appointments (Start, [End], Location)
    values (@start, @end, replicate('X', rand()*100));

    declare @appointmentID int = scope_identity();
    declare @atendees int = rand() * 10.00 + 1.00;
    while @atendees > 0
    begin
        insert into AppointmentRoles (AppointmentID, PersonOrGroupID, Role)
        values (@appointmentID, @atendees*100 + rand()*100, rand()*10);
        set @atendees -= 1;
    end

    declare @notes int = rand()*3.00;
    while @notes > 0
    begin
        insert into AppointmentNotes (AppointmentID, NoteID, Note)
        values (@appointmentID, @notes, replicate ('Y', rand()*1000));
        set @notes -= 1;
    end

    set @i += 1;
    if @i % 10000 = 0
    begin
        commit;
        raiserror (N'Added %i appointments...', 0, 1, @i);
        begin transaction;
    end
end
commit;
go

So let's see today's appointments for a few persons:

set statistics time on;
set statistics io on;

declare @acl AccessControlList;
insert into @acl (PersonOrGroupID) values (102),(111),(131);
exec usp_getAppointmentsForACL @acl, '20100730', '20100731';

Table 'AppointmentNotes'. Scan count 8, logical reads 39, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Appointments'. Scan count 1, logical reads 9829, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'AppointmentRoles'. Scan count 3, logical reads 96, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#25869641'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 63 ms,  elapsed time = 1294 ms.

 SQL Server Execution Times:
   CPU time = 63 ms,  elapsed time = 1294 ms.

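1.2 seconds (on a cold cache; it goes down to 224 ms on a warm cache). Hmm, that's not very good. The problem is the 9829 pages hit in the Appointments table. To improve this, we would like to have both filtering criteria (acl and date) applied at once. Perhaps an indexed view?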

create view vwAppointmentAndRoles 
with schemabinding
as
select r.PersonOrGroupID, a.AppointmentID, a.Start, a.[End]
from dbo.AppointmentRoles r
join dbo.Appointments a on r.AppointmentID = a.AppointmentID;
go

create unique clustered index cdxVwAppointmentAndRoles on vwAppointmentAndRoles (PersonOrGroupID, Start, [End]);
go

alter procedure usp_getAppointmentsForACL
 @acl AccessControlList readonly,
 @start datetime,
 @end datetime
as
begin
    set nocount on;
    select ar.AppointmentID
        , a.Location
        , r.Role
        , n.NoteID
        , n.Note
    from @acl l 
    join vwAppointmentAndRoles ar with (noexpand) on l.PersonOrGroupID = ar.PersonOrGroupID
    join AppointmentNotes n on n.AppointmentID = ar.AppointmentID
    join Appointments a on ar.AppointmentID = a.AppointmentID
    join AppointmentRoles r 
        on ar.AppointmentID = r.AppointmentID
        and ar.PersonOrGroupID = r.PersonOrGroupID
    where ar.Start >= @start
     and ar.Start <= @end
    and ar.[End] <= @end;   
end
go

We can also change the clustered index on Appointments to the likely more useful AppointmentID:

drop index cdxAppointmentStart on Appointments;
create clustered index cdxAppointmentAppointmentID on Appointments (AppointmentID);
go

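This returns the appointments for the same @acl list in the same date range in 77 ms (on a warm cache).

Now, of course, the actual schema you should use depends on a lot more factors that were not taken into consideration here. But I hope this has given you an idea of the proper action to take now to get decent performance. Adding the table valued parameter to the client execution context and passing it to the procedure, as well as the LINQ integration, are left as an exercise for the reader.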
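For the client side of the table valued parameter, something along these lines might work with plain ADO.NET. This is only a sketch: the AclParameter class and its method names are made up for illustration, and the dbo schema for the AccessControlList type is an assumption. It prepares the command but leaves opening the connection and reading the rows to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// Hypothetical helper; none of these names come from the original post.
static class AclParameter
{
    // Builds a DataTable whose single column matches the AccessControlList type.
    public static DataTable ToAclTable(IEnumerable<int> personOrGroupIds)
    {
        var table = new DataTable();
        table.Columns.Add("PersonOrGroupID", typeof(int));
        foreach (int id in personOrGroupIds)
            table.Rows.Add(id);
        return table;
    }

    // Prepares (but does not execute) a call to usp_getAppointmentsForACL.
    public static SqlCommand BuildCommand(SqlConnection connection,
        IEnumerable<int> personOrGroupIds, DateTime start, DateTime end)
    {
        var command = new SqlCommand("usp_getAppointmentsForACL", connection);
        command.CommandType = CommandType.StoredProcedure;

        // A table valued parameter is sent as SqlDbType.Structured.
        SqlParameter acl = command.Parameters.Add("@acl", SqlDbType.Structured);
        acl.TypeName = "dbo.AccessControlList"; // assumption: the type lives in dbo
        acl.Value = ToAclTable(personOrGroupIds);

        command.Parameters.Add("@start", SqlDbType.DateTime).Value = start;
        command.Parameters.Add("@end", SqlDbType.DateTime).Value = end;
        return command;
    }

    static void Main()
    {
        // No connection is needed just to build and inspect the command.
        SqlCommand command = BuildCommand(null, new[] { 102, 111, 131 },
            new DateTime(2010, 7, 30), new DateTime(2010, 7, 31));
        Console.WriteLine(command.Parameters.Count);
    }
}
```

With a real, open SqlConnection passed in, the returned command could be run with ExecuteReader and the rows grouped by AppointmentID on the client.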

score 2 · Accepted Answer

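If I understand correctly, an Appointment has a collection of Roles and a collection of Notes. If that is the case (and you modeled it correctly in the designer), you have these Roles and Notes properties on the Appointment class. When you change the projection (the select) of query q so that it selects the Appointment itself, you can help LINQ to SQL fetch those collections for you. In that case, you should write your query as follows: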

var q =
    from appt in visible
    ...
    select appt;

After this, you can use the LoadOptions property of the DataContext to prefetch the child collections for you, as follows:

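using (var db = new AppointmentContext())
{
    // LoadOptions is null until one is assigned, so build the options first.
    // DataLoadOptions lives in System.Data.Linq.
    var options = new DataLoadOptions();
    options.LoadWith<Appointment>(a => a.Roles);
    db.LoadOptions = options;

    // Do the rest here
}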

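One problem here, however, is that I think LoadWith is limited to loading a single child collection, not two.

You can work around this by writing it out as two queries. In the first query you fetch the appointments and use LoadWith to fetch all the Roles. Then you run a second query (in a new DataContext) and use LoadWith to fetch all the Notes.

Good luck.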

score 1 · Accepted Answer

where !(appt.StartDateTime >= end || appt.EndDateTime <= start)

This could be a perfectly good AND criterion.

where appt.StartDateTime < end && start < appt.EndDateTime
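As a quick sanity check, the two forms are equivalent by De Morgan's law. The following is a small in-memory sketch rather than LINQ to SQL; the OverlapCheck class and its method names are made up for illustration.

```csharp
using System;

static class OverlapCheck
{
    // Original form: NOT (starts at or after the window end
    // OR ends at or before the window start).
    public static bool Negated(DateTime apptStart, DateTime apptEnd,
        DateTime start, DateTime end)
    {
        return !(apptStart >= end || apptEnd <= start);
    }

    // Rewritten form: starts before the window ends
    // AND ends after the window starts.
    public static bool Conjunction(DateTime apptStart, DateTime apptEnd,
        DateTime start, DateTime end)
    {
        return apptStart < end && start < apptEnd;
    }

    static void Main()
    {
        var start = new DateTime(2010, 7, 30);
        var end = new DateTime(2010, 7, 31);

        // Slide a two-hour appointment across the window, hour by hour.
        for (int offset = -30; offset <= 30; offset++)
        {
            DateTime s = start.AddHours(offset);
            DateTime e = s.AddHours(2);
            if (Negated(s, e, start, end) != Conjunction(s, e, start, end))
                throw new InvalidOperationException("Predicates disagree at offset " + offset);
        }
        Console.WriteLine("Both predicates agree on every sample.");
    }
}
```

The AND form is usually easier to read, and each half is a plain range comparison on a single column.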

acl.ToIds().

Pull this out of the query; there is no sense in asking the database to perform the operation.

List<int> POGIDs = acl.ToIds();

join role in

You want to use roles as a filter. If you filter with where instead of joining, you don't have to call Distinct later.
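To see why the join needs Distinct and the filter does not, here is a small in-memory sketch using LINQ to Objects. The Appt and Role classes are hypothetical stand-ins for the mapped entities, and the sample data is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the mapped entities.
class Appt { public int Id; }
class Role { public int AppointmentId; public int PersonOrGroupId; }

static class FilterVersusJoin
{
    // Join: one output row per matching role, so an appointment can repeat.
    public static List<int> ViaJoin(IEnumerable<Appt> appts,
        IEnumerable<Role> roles, ICollection<int> ids)
    {
        return (from a in appts
                join r in roles.Where(x => ids.Contains(x.PersonOrGroupId))
                    on a.Id equals r.AppointmentId
                select a.Id).ToList();
    }

    // Filter: each appointment is tested once, so it appears at most once.
    public static List<int> ViaFilter(IEnumerable<Appt> appts,
        IEnumerable<Role> roles, ICollection<int> ids)
    {
        return appts
            .Where(a => roles.Any(r => r.AppointmentId == a.Id
                                       && ids.Contains(r.PersonOrGroupId)))
            .Select(a => a.Id)
            .ToList();
    }

    static void Main()
    {
        var appts = new List<Appt> { new Appt { Id = 1 }, new Appt { Id = 2 } };
        var roles = new List<Role>
        {
            new Role { AppointmentId = 1, PersonOrGroupId = 102 },
            new Role { AppointmentId = 1, PersonOrGroupId = 111 },
            new Role { AppointmentId = 2, PersonOrGroupId = 500 }
        };
        var acl = new List<int> { 102, 111 };

        Console.WriteLine(string.Join(",", ViaJoin(appts, roles, acl)));   // 1,1
        Console.WriteLine(string.Join(",", ViaFilter(appts, roles, acl))); // 1
    }
}
```

Appointment 1 matches two ACL entries, so the join returns it twice while the filter returns it once.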

Try this, with and without the DataLoadOptions. If the query is good without the DataLoadOptions, there is another (more manual) way to load the related rows.

DataLoadOptions myOptions = new DataLoadOptions();
myOptions.LoadWith<Appointment>(appt => appt.Roles);
myOptions.LoadWith<Appointment>(appt => appt.Notes);
dc.LoadOptions = myOptions;


List<int> POGIDs = acl.ToIds();

IQueryable<Roles> roleQuery = dc.Roles
  .Where(r => POGIDs.Contains(r.PersonOrGroupId));

IQueryable<Appointment> visible =
  dc.Appointments
    .Where(appt => appt.StartDateTime < end && start < appt.EndDateTime)
    .Where(appt => appt.Roles.Any(r => roleQuery.Contains(r)));

IQueryable<Appointment> q =
  visible.OrderBy(appt => appt.StartDateTime);

List<Appointment> rows = q.ToList();

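Here is the "more manual" way to fetch the related data. Note: this technique breaks when apptIds or POGIDs contains more than ~2100 ints. There are ways to work around that, too...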

List<int> POGIDs = acl.ToIds();

List<Role> visibleRoles = dc.Roles
  .Where(r => POGIDs.Contains(r.PersonOrGroupId))
  .ToList();

List<int> apptIds = visibleRoles.Select(r => r.AppointmentId).ToList();

List<Appointment> appointments = dc.Appointments
  .Where(appt => appt.StartDateTime < end && start < appt.EndDateTime)
  .Where(appt => apptIds.Contains(appt.Id))
  .OrderBy(appt => appt.StartDateTime)
  .ToList();

ILookup<int, Roles> appointmentRoles = dc.Roles
  .Where(r => apptIds.Contains(r.AppointmentId))
  .ToLookup(r => r.AppointmentId);

ILookup<int, Notes> appointmentNotes = dc.AppointmentNotes
  .Where(n => apptIds.Contains(n.AppointmentId))
  .ToLookup(n => n.AppointmentId);

foreach(Appointment record in appointments)
{
  int key = record.AppointmentId;
  List<Roles> theRoles = appointmentRoles[key].ToList();
  List<Notes> theNotes = appointmentNotes[key].ToList();
}
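One possible way around the ~2100 limit mentioned above is to split the ID list into batches and run the Contains query once per batch. This is a sketch under assumptions: the IdBatching class and its method names are invented, the batch size of 2000 is an arbitrary value below the limit, and the fetch delegate in Main is a stand-in for the database call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; not part of the original answer.
static class IdBatching
{
    // Splits ids into consecutive chunks of at most batchSize elements.
    public static IEnumerable<List<int>> InBatches(IList<int> ids, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");
        for (int i = 0; i < ids.Count; i += batchSize)
            yield return ids.Skip(i).Take(batchSize).ToList();
    }

    // Runs fetch once per chunk and concatenates the results.
    // In real code, fetch might wrap a query such as
    //   batch => dc.Roles.Where(r => batch.Contains(r.AppointmentId)).ToList()
    public static List<T> FetchInBatches<T>(IList<int> ids, int batchSize,
        Func<List<int>, IEnumerable<T>> fetch)
    {
        return InBatches(ids, batchSize).SelectMany(fetch).ToList();
    }

    static void Main()
    {
        List<int> apptIds = Enumerable.Range(1, 5000).ToList();

        // Stand-in for a database call: just echoes the ids it was asked for.
        List<int> fetched = FetchInBatches<int>(apptIds, 2000, batch => batch);

        Console.WriteLine(InBatches(apptIds, 2000).Count()); // 3
        Console.WriteLine(fetched.Count);                    // 5000
    }
}
```

Each batch costs one round trip, so for very large lists the table valued parameter approach from the other answer may be the better fit.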

This style highlights where indexes are needed:

Roles.PersonOrGroupId
Appointments.AppointmentId (should be PK already)
Roles.AppointmentId
Notes.AppointmentId
