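UPD: I now think the root of my problem is not "threads", because I observe the slowdown at every point in my program. I think that somehow my program runs slower when two processors are used, perhaps because the two processors need to "communicate" with each other. I need to run some tests. I will try disabling one of the processors and see what happens.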

=====================================

I'm not sure whether this is a C# question; it is probably more about hardware, but I think C# is the most suitable tag.

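I was using a cheap DL120 server and decided to upgrade to a more expensive 2-processor DL360p server. Unexpectedly, my C# program runs about 2 times slower on the new server, which should be several times faster.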

I process FAST data for about 60 instruments. I create a separate task for each instrument, like this:

        BlockingCollection<OrderUpdate> updatesQuery;
        if (instrument2OrderUpdates.ContainsKey(instrument))
        {
            updatesQuery = instrument2OrderUpdates[instrument];
        } else
        {
            updatesQuery = new BlockingCollection<OrderUpdate>();
            instrument2OrderUpdates[instrument] = updatesQuery;
            ScheduleFastOrdersProcessing(updatesQuery);
        }
        orderUpdate.Checkpoint("updatesQuery.Add");
        updatesQuery.Add(orderUpdate);
    }

    private void ScheduleFastOrdersProcessing(BlockingCollection<OrderUpdate> updatesQuery)
    {
        Task.Factory.StartNew(() =>
        {
            Instrument instrument = null;
            OrderBook orderBook = null;
            int lastRptSeqNum = -1;
            while (!updatesQuery.IsCompleted)
            {
                OrderUpdate orderUpdate;
                try
                {
                    orderUpdate = updatesQuery.Take();
                } catch(InvalidOperationException e)
                {
                    Log.Push(LogItemType.Error, e.Message);
                    continue;
                }
                orderUpdate.Checkpoint("received from updatesQuery.Take()");
                // ... long, uninteresting processing code omitted ...
            }
        }, TaskCreationOptions.LongRunning);
    }

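Since I have about 60 tasks that can run in parallel, I expected 2 * E5-2640 (24 virtual threads, 12 real threads) to execute them much faster than 1 * E3-1220 (4 real threads). On the DL360p I see 95 threads in Task Manager; on the DL120 I see only 55.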

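But the DL120 G7 is 2 (!!) times faster! The E3-1220 has a somewhat higher clock rate than the E5-2640 (3.1 GHz vs. 2.5 GHz), but I still expected my code to run faster on 2 * E5-2640 because it can be parallelized better, and I definitely did not expect it to run 2 times slower!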

HP DL120 G7, E3-1220

~50 threads in Task Manager; best = 24, average ~ 80 microseconds

 calling market.UpdateFastOrder = 23 updatesQuery.Add = 25 received from updatesQuery.Take() = 67 in orderbook = 80
 calling market.UpdateFastOrder = 30 updatesQuery.Add = 32 received from updatesQuery.Take() = 64 in orderbook = 73
 calling market.UpdateFastOrder = 31 updatesQuery.Add = 32 received from updatesQuery.Take() = 195 in orderbook = 204
 calling market.UpdateFastOrder = 31 updatesQuery.Add = 32 received from updatesQuery.Take() = 74 in orderbook = 86
 calling market.UpdateFastOrder = 18 updatesQuery.Add = 21 received from updatesQuery.Take() = 65 in orderbook = 78
 calling market.UpdateFastOrder = 29 updatesQuery.Add = 32 received from updatesQuery.Take() = 76 in orderbook = 88
 calling market.UpdateFastOrder = 30 updatesQuery.Add = 32 received from updatesQuery.Take() = 80 in orderbook = 92
 calling market.UpdateFastOrder = 20 updatesQuery.Add = 21 received from updatesQuery.Take() = 65 in orderbook = 78
 calling market.UpdateFastOrder = 21 updatesQuery.Add = 24 received from updatesQuery.Take() = 68 in orderbook = 81
 calling market.UpdateFastOrder = 12 updatesQuery.Add = 13 received from updatesQuery.Take() = 58 in orderbook = 72
 calling market.UpdateFastOrder = 22 updatesQuery.Add = 23 received from updatesQuery.Take() = 51 in orderbook = 59
 calling market.UpdateFastOrder = 16 updatesQuery.Add = 16 received from updatesQuery.Take() = 20 in orderbook = 24
 calling market.UpdateFastOrder = 28 updatesQuery.Add = 31 received from updatesQuery.Take() = 82 in orderbook = 94
 calling market.UpdateFastOrder = 18 updatesQuery.Add = 21 received from updatesQuery.Take() = 65 in orderbook = 77
 calling market.UpdateFastOrder = 29 updatesQuery.Add = 29 received from updatesQuery.Take() = 259 in orderbook = 264
 calling market.UpdateFastOrder = 49 updatesQuery.Add = 52 received from updatesQuery.Take() = 99 in orderbook = 113
 calling market.UpdateFastOrder = 22 updatesQuery.Add = 23 received from updatesQuery.Take() = 50 in orderbook = 60
 calling market.UpdateFastOrder = 29 updatesQuery.Add = 32 received from updatesQuery.Take() = 76 in orderbook = 88
 calling market.UpdateFastOrder = 16 updatesQuery.Add = 19 received from updatesQuery.Take() = 63 in orderbook = 75
 calling market.UpdateFastOrder = 27 updatesQuery.Add = 27 received from updatesQuery.Take() = 226 in orderbook = 231
 calling market.UpdateFastOrder = 15 updatesQuery.Add = 16 received from updatesQuery.Take() = 35 in orderbook = 42
 calling market.UpdateFastOrder = 18 updatesQuery.Add = 21 received from updatesQuery.Take() = 66 in orderbook = 78

HP DL360p G8, 2 * E5-2640

~95 threads in Task Manager; best = 40, average ~ 150 microseconds

 calling market.UpdateFastOrder = 62 updatesQuery.Add = 64 received from updatesQuery.Take() = 144 in orderbook = 205
 calling market.UpdateFastOrder = 27 updatesQuery.Add = 32 received from updatesQuery.Take() = 101 in orderbook = 154
 calling market.UpdateFastOrder = 45 updatesQuery.Add = 50 received from updatesQuery.Take() = 124 in orderbook = 187
 calling market.UpdateFastOrder = 46 updatesQuery.Add = 51 received from updatesQuery.Take() = 127 in orderbook = 162
 calling market.UpdateFastOrder = 63 updatesQuery.Add = 68 received from updatesQuery.Take() = 137 in orderbook = 174
 calling market.UpdateFastOrder = 53 updatesQuery.Add = 55 received from updatesQuery.Take() = 133 in orderbook = 171
 calling market.UpdateFastOrder = 44 updatesQuery.Add = 46 received from updatesQuery.Take() = 131 in orderbook = 158
 calling market.UpdateFastOrder = 37 updatesQuery.Add = 39 received from updatesQuery.Take() = 102 in orderbook = 140
 calling market.UpdateFastOrder = 45 updatesQuery.Add = 50 received from updatesQuery.Take() = 115 in orderbook = 154
 calling market.UpdateFastOrder = 50 updatesQuery.Add = 55 received from updatesQuery.Take() = 133 in orderbook = 160
 calling market.UpdateFastOrder = 26 updatesQuery.Add = 50 received from updatesQuery.Take() = 99 in orderbook = 111
 calling market.UpdateFastOrder = 14 updatesQuery.Add = 30 received from updatesQuery.Take() = 36 in orderbook = 40   <-- best one I can find among thousands

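Can you see why my program runs 2 times slower on a server that should be several times faster? Maybe I should not create ~60 tasks? Maybe I should tell .NET not to use 95 threads but to limit itself to 50 or even 24? Could this be a 2-processor vs. 1-processor configuration issue? Would simply disabling one processor on my DL360p Gen8 speed the program up significantly?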
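One way to try the "single processor" experiment without touching the BIOS might be to pin the process to half of the logical CPUs. This is only a rough sketch: `AffinityExperiment` and `LowestCpusMask` are made-up names, and the idea that the lower half of the logical CPUs belongs to the first socket is an assumption that has to be checked against the machine's real topology.

```csharp
using System;
using System.Diagnostics;

static class AffinityExperiment
{
    // Builds a bitmask with the lowest `count` bits set (logical CPUs 0..count-1).
    public static long LowestCpusMask(int count)
    {
        if (count < 1 || count > 62)
            throw new ArgumentOutOfRangeException(nameof(count));
        return (1L << count) - 1;
    }

    static void Main()
    {
        int logicalCpus = Environment.ProcessorCount;

        // ASSUMPTION: logical CPUs 0..half-1 all sit on the first socket.
        // Verify this for the actual server before trusting any measurement.
        int half = Math.Min(62, Math.Max(1, logicalCpus / 2));
        long mask = LowestCpusMask(half);

        // Process.ProcessorAffinity is a real .NET property; setting it is
        // supported on Windows and Linux only.
        Process.GetCurrentProcess().ProcessorAffinity = new IntPtr(mask);

        Console.WriteLine($"Logical CPUs: {logicalCpus}, affinity mask: 0x{mask:X}");
    }
}
```

If the checkpoint timings improve noticeably with the mask applied, that would point toward cross-socket traffic rather than the number of tasks.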

Added:

  • calling market.UpdateFastOrder - the orderUpdate object is created
  • updatesQuery.Add - the orderUpdate is put into the BlockingCollection
  • received from updatesQuery.Take() - the orderUpdate is taken out of the BlockingCollection
  • in orderbook - the orderUpdate is parsed and applied to the orderBook
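For readers who want to reproduce numbers like the ones above, the checkpoints could be recorded roughly as follows. This is a guess at the mechanism, not my actual `OrderUpdate` code: `CheckpointTrace` is a hypothetical stand-in, and it assumes each value is the number of microseconds elapsed since the object was created.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text;

// Hypothetical stand-in for the timing part of OrderUpdate.
sealed class CheckpointTrace
{
    private readonly long _startTicks = Stopwatch.GetTimestamp();
    private readonly List<KeyValuePair<string, long>> _marks =
        new List<KeyValuePair<string, long>>();

    // Converts Stopwatch ticks to whole microseconds.
    public static long TicksToMicroseconds(long ticks)
    {
        return ticks * 1000000L / Stopwatch.Frequency;
    }

    // Records the time elapsed since construction under the given name.
    public void Checkpoint(string name)
    {
        long elapsed = Stopwatch.GetTimestamp() - _startTicks;
        _marks.Add(new KeyValuePair<string, long>(name, TicksToMicroseconds(elapsed)));
    }

    // Formats the marks in the same "name = value" style as the logs above.
    public override string ToString()
    {
        var sb = new StringBuilder();
        foreach (var mark in _marks)
            sb.Append(' ').Append(mark.Key).Append(" = ").Append(mark.Value);
        return sb.ToString();
    }
}

static class CheckpointDemo
{
    static void Main()
    {
        var trace = new CheckpointTrace();
        trace.Checkpoint("updatesQuery.Add");
        trace.Checkpoint("received from updatesQuery.Take()");
        Console.WriteLine(trace);
    }
}
```

Because the values are cumulative, the difference between "updatesQuery.Add" and "received from updatesQuery.Take()" is the hand-off time through the BlockingCollection: roughly 40 microseconds in most DL120 lines versus roughly 70 to 80 in most DL360p lines.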

1 Answer

Just because you have a system that can handle more threads does not mean that they can all be processed fully in parallel.

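When I upgraded from a quad-core CPU to an i7 (8 virtual cores), I noticed that a setup using more threads than cores caused the threads to block each other for some time, which made the system slower overall.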

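The problem was simply that my algorithm was already able to use the full processing time of the core its thread was running on, while the waiting threads were working at only about 5% to 10%. As a result, the main threads finished, but some individual threads still had all of their work left to do (which took the same amount of time again).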

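The thread pool continues only after all worker threads have finished, so the total time to completion is dictated by the threads that were starved of processor time by the others.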

Maybe you just need to find the optimal number of threads.
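To experiment with the thread count, one option might be to stop creating one task per instrument and instead spread the instruments over a fixed, tunable number of workers. The following is only a sketch under my own assumptions: `Update`, `ShardedProcessor`, and `ShardOf` are hypothetical names, not anything from the question's code, and whether this helps on the DL360p has to be measured.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical message type; stands in for the question's OrderUpdate.
sealed class Update
{
    public int InstrumentId;
    public int Seq;
}

sealed class ShardedProcessor
{
    private readonly BlockingCollection<Update>[] _queues;
    private readonly Task[] _workers;

    public ShardedProcessor(int workerCount, Action<Update> handler)
    {
        _queues = new BlockingCollection<Update>[workerCount];
        _workers = new Task[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            var queue = _queues[i] = new BlockingCollection<Update>();
            // One dedicated thread per queue; the loop ends after CompleteAdding().
            _workers[i] = Task.Factory.StartNew(() =>
            {
                foreach (var update in queue.GetConsumingEnumerable())
                    handler(update);
            }, TaskCreationOptions.LongRunning);
        }
    }

    // The same instrument always maps to the same queue,
    // so updates for one instrument are processed in order.
    public int ShardOf(int instrumentId)
    {
        return (instrumentId & int.MaxValue) % _queues.Length;
    }

    public void Post(Update update)
    {
        _queues[ShardOf(update.InstrumentId)].Add(update);
    }

    // Signals that no more updates will arrive and waits for the workers.
    public void Complete()
    {
        foreach (var queue in _queues) queue.CompleteAdding();
        Task.WaitAll(_workers);
    }
}

static class ShardedDemo
{
    static void Main()
    {
        int processed = 0;
        var processor = new ShardedProcessor(4, u => Interlocked.Increment(ref processed));
        for (int i = 0; i < 600; i++)
            processor.Post(new Update { InstrumentId = i % 60, Seq = i });
        processor.Complete();
        Console.WriteLine(processed); // prints 600
    }
}
```

Running the same workload with `workerCount` set to, say, 4, 12, and 24 and comparing the checkpoint timings would show whether fewer threads than instruments is actually faster on a given machine.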

answered 2012-06-19T08:43:57.817