I have a program that color-codes a returned result set depending on what the results are. Because color-coding takes so long (it is currently done with Regex and RichTextBox.Select + .SelectionColor), I cut it off at 400 results. At around that number it takes about 20 seconds, which is about the maximum I'd consider reasonable.

To try to improve performance I rewrote the Regex part to use a Parallel.ForEach loop to iterate through the MatchCollection, but the time was about the same (18-19 seconds vs. 20)! Is this just not a job that lends itself to parallel programming very well? Should I try something different? Any advice is welcome. Thanks!

PS: I thought it was a bit strange that my CPU utilization never went above 14%, with or without Parallel.ForEach.

Code

MatchCollection startMatches = Regex.Matches(tempRTB.Text, startPattern);

object locker = new object();
System.Threading.Tasks.Parallel.ForEach(startMatches.Cast<Match>(), m =>
{
    int i = 0;
    foreach (Group g in m.Groups)
    {
        if (i > 0 && i < 5 && g.Length > 0)
        {
            tempRTB.Invoke(new Func<bool>(
                delegate
                {
                    lock (locker)
                    {
                        tempRTB.Select(g.Index, g.Length);
                        if ((i & 1) == 0) // Even number
                            tempRTB.SelectionColor = Namespace.Properties.Settings.Default.ValueColor;
                        else              // Odd number
                            tempRTB.SelectionColor = Namespace.Properties.Settings.Default.AttributeColor;
                        return true;
                    }
                }));
        }
        else if (i == 5 && g.Length > 0)
        {
            var result = tempRTB.Invoke(new Func<string>(
                delegate
                {
                    lock (locker)
                    {
                        return tempRTB.Text.Substring(g.Index, g.Length);
                    }
                }));

            MatchCollection subMatches = Regex.Matches((string)result, pattern);

            foreach (Match subMatch in subMatches)
            {
                int j = 0;
                foreach (Group subGroup in subMatch.Groups)
                {
                    if (j > 0 && subGroup.Length > 0)
                    {
                        tempRTB.Invoke(new Func<bool>(
                            delegate
                            {
                                lock (locker)
                                {
                                    tempRTB.Select(g.Index + subGroup.Index, subGroup.Length);
                                    if ((j & 1) == 0) // Even number
                                        tempRTB.SelectionColor = Namespace.Properties.Settings.Default.ValueColor;
                                    else              // Odd number
                                        tempRTB.SelectionColor = Namespace.Properties.Settings.Default.AttributeColor;
                                    return true;
                                }
                            }));
                    }
                    j++;
                }
            }
        }
        i++;
    }
});

2 Answers

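Most of the time in your code is most likely spent in the part that actually selects the text in the rich text box and sets its color.

That code cannot execute in parallel, because it has to be marshalled to the UI thread, which you do via tempRTB.Invoke.

Furthermore, by using the lock statement you explicitly make sure that the highlighting is executed sequentially rather than in parallel. This is unnecessary, because all of that code runs on the single UI thread anyway.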
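
The point above can be sketched as follows. This is only a minimal illustration, assuming the same startPattern as in the question and a hypothetical helper named ColorMatches that is called from the UI thread (for example from a button click handler). Since every Select/SelectionColor call ends up on the UI thread anyway, the loop can simply run there, with no Parallel.ForEach, no Invoke, and no lock. The nested sub-pattern handling for group 5 is left out for brevity.

```csharp
using System.Drawing;
using System.Text.RegularExpressions;
using System.Windows.Forms;

static class Highlighter
{
    // Hypothetical helper (not from the question); call it on the UI thread.
    public static void ColorMatches(RichTextBox rtb, string startPattern,
                                    Color valueColor, Color attributeColor)
    {
        // Read Text once and reuse the string instead of querying the control repeatedly.
        string text = rtb.Text;

        foreach (Match m in Regex.Matches(text, startPattern))
        {
            // Groups 1..4 alternate: odd index = attribute color, even index = value color.
            for (int i = 1; i < 5 && i < m.Groups.Count; i++)
            {
                Group g = m.Groups[i];
                if (g.Length == 0) continue;

                rtb.Select(g.Index, g.Length);
                rtb.SelectionColor = (i & 1) == 0 ? valueColor : attributeColor;
            }
        }
    }
}
```

Called as Highlighter.ColorMatches(tempRTB, startPattern, Namespace.Properties.Settings.Default.ValueColor, Namespace.Properties.Settings.Default.AttributeColor), this does the same coloring for groups 1 to 4 as the original loop. It will not necessarily be faster, but it removes the threading overhead that was buying nothing.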


You could try to improve performance by suspending the layout of your UI while you select and color the text in the RTB:

tempRTB.SuspendLayout();

// your loop

tempRTB.ResumeLayout();
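
SuspendLayout only suspends the control's layout logic, so it is worth measuring whether it helps here. A different, commonly used technique (not the one proposed in this answer) is to switch off repainting of the control while the selection and color changes are applied, using the Win32 WM_SETREDRAW message. The sketch below assumes a hypothetical helper class RedrawHelper; only SendMessage and WM_SETREDRAW are real Win32 names.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

static class RedrawHelper
{
    private const int WM_SETREDRAW = 0x000B;

    [DllImport("user32.dll")]
    private static extern IntPtr SendMessage(IntPtr hWnd, int msg, IntPtr wParam, IntPtr lParam);

    // Hypothetical helper: runs 'work' while painting of 'control' is switched off.
    public static void WithoutRedraw(Control control, Action work)
    {
        // wParam = 0 tells the window to stop redrawing itself.
        SendMessage(control.Handle, WM_SETREDRAW, IntPtr.Zero, IntPtr.Zero);
        try
        {
            work();
        }
        finally
        {
            // wParam = 1 re-enables redrawing, even if 'work' threw.
            SendMessage(control.Handle, WM_SETREDRAW, new IntPtr(1), IntPtr.Zero);
            control.Invalidate(); // repaint once with the final colors
        }
    }
}
```

Usage would look like RedrawHelper.WithoutRedraw(tempRTB, () => { /* your loop */ }); and, like the coloring itself, it must run on the UI thread.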
answered 2013-05-14T14:34:53.420

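Virtually no aspect of your program is actually able to run in parallel.

The generation of the matches needs to be done sequentially: the regex can't find the second match until it has found the first. At best, Parallel.ForEach lets you process the results of the sequence in parallel, but they are still generated sequentially. That seems to be where most of your time-consuming work is, and there are no gains to be had there.

On top of that, you aren't really processing the results in parallel either. Most of the code in the body of your loop sits inside a call to the UI thread, which means it is all run by a single thread.

In short, only a tiny part of your program actually runs in parallel, and parallelization in general adds some overhead; it sounds like you're barely getting back more than that overhead. You didn't really do much wrong. The operation just inherently doesn't lend itself to parallelization, unless there is an effective way of breaking the initial string into several smaller chunks that the regex can parse individually (in parallel).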
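
A rough sketch of that last idea, under a strong assumption that the question does not confirm: no match ever spans a line break, so each line can be scanned independently. ChunkedMatcher and FindRanges are hypothetical names. The code only computes match positions; applying the colors would still have to happen sequentially on the UI thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class ChunkedMatcher
{
    // Returns (index, length) for every match, with indices relative to the full text.
    // Assumption: no match spans a '\n', so lines can be scanned independently.
    public static List<Tuple<int, int>> FindRanges(string text, string pattern)
    {
        // Remember where each line starts so line-local indices can be shifted back.
        var lines = new List<Tuple<int, string>>();
        int start = 0;
        foreach (string line in text.Split('\n'))
        {
            lines.Add(Tuple.Create(start, line));
            start += line.Length + 1; // +1 for the '\n' removed by Split
        }

        var regex = new Regex(pattern); // a Regex instance can be shared across threads

        return lines
            .AsParallel()
            .AsOrdered() // keep the results in document order
            .SelectMany(l => regex.Matches(l.Item2).Cast<Match>()
                .Select(m => Tuple.Create(l.Item1 + m.Index, m.Length)))
            .ToList();
    }
}
```

Splitting changes what anchors such as ^ and $ match, so this only fits patterns that behave the same per line as on the whole text. Whether it pays off depends on how much of the 20 seconds is really spent in the regex rather than in the RichTextBox calls.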

answered 2013-05-14T14:35:00.660