c# - Is DocumentDB slower than SQL when pulling a large number of records?

Question

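I am doing some benchmarking, so I have a SQL database with 2500 records. I inserted those records into DocumentDB.

I wrote two lines of code: one uses Entity Framework to pull all 2500 records into an array in C#, and the other pulls all 2500 records from DocumentDB into an array.

Code used: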

var test= await Task<Test>.Run(() =>
              client.CreateDocumentQuery<Test>(collection.DocumentsLink)
              .ToList());

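The DocumentDB example took over 20 seconds. The SQL Server line was nearly instant. The objects are simple DTOs with 5 properties, and I ran the SQL query over the Internet.

Am I misusing DocumentDB? I thought it was designed for pulling all your records into memory and then joining them with LINQ.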

score 15 · Accepted Answer

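@bladefist, you should be able to get much better performance with DocumentDB. For example, take a look at this code stub and its output, run from an Azure VM against a DocumentDB account, both in West Europe.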

Stopwatch watch = new Stopwatch();
for (int i = 0; i < 10; i++)
{
    watch.Start();
    int numDocumentsRead = 0;
    foreach (Document d in client.CreateDocumentQuery(collection.SelfLink, 
        new FeedOptions { MaxItemCount = 1000 }))
    {
        numDocumentsRead++;
    }

    Console.WriteLine("Run {0} - read {1} documents in {2} ms", i, numDocumentsRead, 
        watch.Elapsed.TotalMilliseconds);
    watch.Reset();
}

//Output
Run 0 - read 2500 documents in 426.1359 ms
Run 1 - read 2500 documents in 286.506 ms
Run 2 - read 2500 documents in 227.4451 ms
Run 3 - read 2500 documents in 270.4497 ms
Run 4 - read 2500 documents in 275.7205 ms
Run 5 - read 2500 documents in 281.571 ms
Run 6 - read 2500 documents in 268.9624 ms
Run 7 - read 2500 documents in 275.1513 ms
Run 8 - read 2500 documents in 301.0263 ms
Run 9 - read 2500 documents in 288.1455 ms
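
To illustrate why MaxItemCount matters in the loop above, here is a minimal sketch of the page arithmetic. PageMath and MinRoundTrips are hypothetical names, not part of the DocumentDB SDK, and the page size of 100 mentioned below is an assumption about the service default when MaxItemCount is not set. The result is a lower bound, since a page can also be cut short by a response size limit.

```csharp
using System;

public static class PageMath
{
    // Lower bound on the number of result pages (round trips) needed to read
    // totalDocuments when each page holds at most pageSize documents.
    public static int MinRoundTrips(int totalDocuments, int pageSize)
    {
        if (totalDocuments < 0)
            throw new ArgumentOutOfRangeException(nameof(totalDocuments));
        if (pageSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(pageSize));

        // Integer ceiling division: a partially filled last page still costs a round trip.
        return (totalDocuments + pageSize - 1) / pageSize;
    }
}
```

Under these assumptions, PageMath.MinRoundTrips(2500, 100) gives 25 round trips, while PageMath.MinRoundTrips(2500, 1000) gives 3, so the larger page size removes most of the network latency from the read.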

Some performance best practices:

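Use direct connectivity and the TCP protocol
If you are reading in large batches, use a larger page size (max: 1000) to minimize the number of round trips
To reduce latency, run your client in the same region as your DocumentDB account
The provisioned throughput (and storage) of the capacity units you purchase is distributed across collections. So if you want to measure throughput, make sure your application spreads its workload across all collections. For example, if you have purchased 1 CU, you can choose to allocate all of the throughput to a single collection or spread it across three collections.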
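
The first two practices above could be applied roughly as follows. This is a sketch, assuming the Microsoft.Azure.DocumentDB .NET SDK; BulkReader, CreateClient, and ReadAllAsync are hypothetical helper names, and the endpoint, key, and collection link are placeholders supplied by the caller. It has not been run against a live account.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

public static class BulkReader
{
    // Builds a client that uses direct connectivity over TCP
    // instead of the default gateway mode.
    public static DocumentClient CreateClient(Uri endpoint, string authKey)
    {
        var policy = new ConnectionPolicy
        {
            ConnectionMode = ConnectionMode.Direct,
            ConnectionProtocol = Protocol.Tcp
        };
        return new DocumentClient(endpoint, authKey, policy);
    }

    // Reads every document in a collection, asking for up to 1000 items per page
    // so that a large read needs as few round trips as possible.
    public static async Task<List<T>> ReadAllAsync<T>(DocumentClient client, string collectionLink)
    {
        var options = new FeedOptions { MaxItemCount = 1000 };
        IDocumentQuery<T> query = client
            .CreateDocumentQuery<T>(collectionLink, options)
            .AsDocumentQuery();

        var results = new List<T>();
        while (query.HasMoreResults)
        {
            // Each call fetches one page asynchronously.
            FeedResponse<T> page = await query.ExecuteNextAsync<T>();
            results.AddRange(page);
        }
        return results;
    }
}
```

Paging with ExecuteNextAsync keeps the read asynchronous end to end, which avoids wrapping a blocking ToList() call in Task.Run as the question does.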
