
I am using the ThreadPool in a C# application that needs to do some CPU-intensive work. However, it seems too slow (EDIT: it prints the debug string "Calculating on " + lSubArea.X + ":" + lSubArea.Y + " " + lSubArea.Width + ":" + lSubArea.Height only a few times every 10 seconds, whereas I expect to see it at least NUM_ROWS_GRID^2 = 16 times every few seconds), even after changing MinThreads via the SetMinThreads method. I don't know whether I should switch to custom threads or whether there is a way to speed it up. Searching Google returned some results, but nothing worked; same situation with MSDN.

Old Code follows:

private void StreamerRoutine()
{
   if (this._state.Area.Width == 0 && this._state.Area.Height == 0)
      this._state.Area = new Rectangle(0, 0, Screen.PrimaryScreen.Bounds.Width, Screen.PrimaryScreen.Bounds.Height);

   while (this._state.WorkEnd == false)
   {
      // Ends time slice if video is off
      if (this._state.VideoOn == false)
         Thread.Sleep(0);
      else
      {
         lock(this._state.AreaSync)
         {
             Int32 lWidth = this._state.Area.Width / Constants.NUM_ROWS_GRID;
             Int32 lHeight = this._state.Area.Height / Constants.NUM_ROWS_GRID;
             for (Int32 lX = 0; lX + lWidth <= this._state.Area.Width; lX += lWidth)
                for (Int32 lY = 0; lY + lHeight <= this._state.Area.Height; lY += lHeight)
                   ThreadPool.QueueUserWorkItem(CreateDiffFrame, (Object)new Rectangle(lX, lY, lWidth, lHeight));
         }
      }
   }
}

private void CreateDiffFrame(Object pState)
{
   Rectangle lSubArea = (Rectangle)pState;

   SmartDebug.DWL("Calculating on " 
          + lSubArea.X + ":" + lSubArea.Y + " " 
          + lSubArea.Width + ":" + lSubArea.Height);
   // TODO : calculate frame
   Thread.Sleep(0);
}

EDIT: The CreateDiffFrame function is only a stub I used to find out how many times it is called per second. It will be replaced with CPU-intensive work once I determine the best way to use threads in this case.

EDIT: I removed every Thread.Sleep(0); I thought it could be a way to speed up the routine, but it seems it could be a bottleneck. The new code follows:

EDIT: I made WorkEnd and VideoOn volatile in order to avoid cached values and therefore an endless loop. I also added a semaphore so that each batch of work items starts only after the previous batch is done. Now it is working quite well.

private void StreamerRoutine()
{
    if (this._state.Area.Width == 0 && this._state.Area.Height == 0)
        this._state.Area = new Rectangle(0, 0, Screen.PrimaryScreen.Bounds.Width, Screen.PrimaryScreen.Bounds.Height);

    this._state.StreamingSem = new Semaphore(Constants.NUM_ROWS_GRID * Constants.NUM_ROWS_GRID, Constants.NUM_ROWS_GRID * Constants.NUM_ROWS_GRID);

    while (this._state.WorkEnd == false)
    {
        if (this._state.VideoOn == true)
        {
            for (int i = 0; i < Constants.NUM_ROWS_GRID * Constants.NUM_ROWS_GRID; i++)
                this._state.StreamingSem.WaitOne();

            lock(this._state.AreaSync)
            {
                Int32 lWidth = this._state.Area.Width / Constants.NUM_ROWS_GRID;
                Int32 lHeight = this._state.Area.Height / Constants.NUM_ROWS_GRID;
                for (Int32 lX = 0; lX + lWidth <= this._state.Area.Width; lX += lWidth)
                    for (Int32 lY = 0; lY + lHeight <= this._state.Area.Height; lY += lHeight)
                        ThreadPool.QueueUserWorkItem(CreateDiffFrame, (Object)new Rectangle(lX, lY, lWidth, lHeight));
            }
        }
    }
}

private void CreateDiffFrame(Object pState)
{
    Rectangle lSubArea = (Rectangle)pState;

    SmartDebug.DWL("Calculating on " + lSubArea.X + ":" + lSubArea.Y + " " + lSubArea.Width + ":" + lSubArea.Height);
    // TODO : calculate frame
    this._state.StreamingSem.Release(1);
}

3 Answers


From what I can see, there is no good way to tell you exactly what is slowing your code down, but a few things stand out:

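  1. Thread.Sleep(0). When you do this, you give up the rest of your time slice from the OS and slow everything down, because CreateDiffFrame() cannot actually return until the OS scheduler gets back to it.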

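  2. The cast of Rectangle, which is a struct, to Object. When this happens you incur the overhead of boxing, which is not something you want in truly compute-intensive operations.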

  3. Your calls to lock(this._state.AreaSync). It could be that AreaSync is also being locked somewhere else, and that could be slowing things down.

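  4. You may be queueing items at too fine a granularity. If you queue very small work items, the overhead of putting them in the queue one at a time can be significant compared to the amount of actual work done. You could also consider moving the contents of the inner loop into the queued work item to cut down on this overhead.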
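Points 2 and 4 can be illustrated together. The following is only a minimal sketch, assuming .NET 4 or later (for CountdownEvent) and a 4x4 grid; ProcessTile and NumRowsGrid are hypothetical stand-ins for the real diff computation and Constants.NUM_ROWS_GRID. It queues one work item per column of tiles, with the inner loop running inside the work item, and passes the coordinates through a captured lambda instead of a Rectangle cast to Object:

```csharp
using System;
using System.Drawing;
using System.Threading;

static class CoarseQueueSketch
{
    // Stands in for Constants.NUM_ROWS_GRID from the question (assumed to be 4).
    const int NumRowsGrid = 4;

    static int _tilesProcessed;

    // Hypothetical placeholder for the real per-tile diff computation.
    static void ProcessTile(Rectangle tile)
    {
        Interlocked.Increment(ref _tilesProcessed);
    }

    // Queues one work item per column of tiles and blocks until all of them finish.
    public static int ProcessArea(Rectangle area)
    {
        int width = area.Width / NumRowsGrid;
        int height = area.Height / NumRowsGrid;
        if (width == 0 || height == 0) return 0; // area too small to split

        _tilesProcessed = 0;
        using (var done = new CountdownEvent(NumRowsGrid))
        {
            for (int col = 0; col < NumRowsGrid; col++)
            {
                int x = col * width; // local copy so each lambda captures its own value
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        // The inner loop now runs inside the work item.
                        for (int row = 0; row < NumRowsGrid; row++)
                            ProcessTile(new Rectangle(x, row * height, width, height));
                    }
                    finally
                    {
                        done.Signal();
                    }
                });
            }
            done.Wait(); // the next batch can start only after this one is complete
        }
        return _tilesProcessed;
    }

    static void Main()
    {
        Console.WriteLine(ProcessArea(new Rectangle(0, 0, 1920, 1080)));
    }
}
```

The captured lambda still costs one allocation per work item, but there are only NumRowsGrid of them per frame rather than one boxed Rectangle per tile. Whether this helps measurably depends on how heavy the real per-tile work turns out to be.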

If this is something you are trying to do for parallel computation, have you looked into using PLINQ or another such framework?
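As a rough illustration of the PLINQ route, here is a sketch under assumptions rather than a drop-in replacement: it requires .NET 4 or later, and DiffTile and NumRowsGrid are hypothetical names standing in for the real per-tile computation and Constants.NUM_ROWS_GRID. PLINQ handles partitioning and waiting for completion, so no explicit queueing or semaphore is needed:

```csharp
using System;
using System.Drawing;
using System.Linq;

static class PlinqSketch
{
    // Stands in for Constants.NUM_ROWS_GRID from the question (assumed to be 4).
    const int NumRowsGrid = 4;

    // Hypothetical placeholder for the real diff; here it just returns the tile's pixel count.
    static long DiffTile(Rectangle tile)
    {
        return (long)tile.Width * tile.Height;
    }

    // Computes one result per tile in parallel and returns them in tile-index order.
    public static long[] DiffAllTiles(Rectangle area)
    {
        int width = area.Width / NumRowsGrid;
        int height = area.Height / NumRowsGrid;

        return ParallelEnumerable
            .Range(0, NumRowsGrid * NumRowsGrid)
            .AsOrdered() // keep results in tile-index order
            .Select(i => DiffTile(new Rectangle(
                (i / NumRowsGrid) * width,
                (i % NumRowsGrid) * height,
                width,
                height)))
            .ToArray(); // blocks until every tile has been processed
    }

    static void Main()
    {
        long[] results = DiffAllTiles(new Rectangle(0, 0, 1920, 1080));
        Console.WriteLine(results.Length);
    }
}
```

ToArray() returns only once all tiles are done, which gives the same "one batch at a time" behaviour that the semaphore in the question provides.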

answered 2011-12-10T15:03:33.123

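My guess is that it is the Sleep at the end of CreateDiffFrame. If I remember correctly, it means that each thread stays alive for at least another 10 ms, and you can probably do the actual work in less than 10 ms. The ThreadPool tries to optimize the use of threads, but I think it has an upper limit on the total number of outstanding threads. So if you want to actually simulate your workload, write a tight loop that waits until the expected number of milliseconds has passed, instead of sleeping.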
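A tight loop of that kind could look roughly like the sketch below. SimulateWork and the 5 ms figure are illustrative assumptions, not part of the original code; the idea is simply to keep the thread busy for the expected duration rather than yield it:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class BusyWaitSketch
{
    // Burns CPU for roughly the given duration instead of giving up the time slice.
    // Returns the number of milliseconds actually spent.
    public static long SimulateWork(int milliseconds)
    {
        Stopwatch timer = Stopwatch.StartNew();
        while (timer.ElapsedMilliseconds < milliseconds)
        {
            Thread.SpinWait(100); // short spin that keeps the thread running
        }
        return timer.ElapsedMilliseconds;
    }

    static void Main()
    {
        long elapsed = SimulateWork(5); // 5 ms is an arbitrary example value
        Console.WriteLine(elapsed >= 5);
    }
}
```

Calling SimulateWork from the CreateDiffFrame stub in place of Thread.Sleep(0) should give a call rate closer to what the real workload would produce.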

In any case, I don't think the ThreadPool is the actual bottleneck, and using another threading mechanism will not speed up your code.

answered 2011-12-10T15:00:40.270

There is a known bug in the ThreadPool.SetMinThreads method, described in KB976898:

After you use the ThreadPool.SetMinThreads method in the Microsoft .NET Framework 3.5, threads maintained by the thread pool do not work as expected

You can download a fix for this behavior from here.

answered 2011-12-10T15:39:09.860