You probably need one of the more advanced overloads of the Parallel.For method:
public static ParallelLoopResult For<TLocal>(int fromInclusive, int toExclusive,
    ParallelOptions parallelOptions, Func<TLocal> localInit,
    Func<int, ParallelLoopState, TLocal, TLocal> body,
    Action<TLocal> localFinally);
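Executes a for loop with thread-local data in which iterations may run in parallel, loop options can be configured, and the state of the loop can be monitored and manipulated.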
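This looks quite intimidating with all the various lambdas it expects. The idea is to have each thread work with its own local data and merge the data at the end. Here is how you could use this method to solve your problem: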
// Requires: using System; using System.Threading.Tasks;
double[] A = new double[1000];
double[] B = (double[])A.Clone();
object locker = new object();
var parallelOptions = new ParallelOptions()
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
};
// Assumption: only interior indices are processed, because i - 1 and i + 1
// would fall outside the array at i = 0 and i = A.Length - 1
Parallel.For(1, A.Length - 1, parallelOptions,
    localInit: () => new double[A.Length], // create temp array per thread
    body: (i, state, temp) =>
    {
        double v = A[i];
        temp[i] -= v;
        temp[i + 1] += v / 2;
        temp[i - 1] += v / 2;
        return temp; // return a reference to the same temp array
    }, localFinally: (localB) =>
    {
        // Can be called in parallel with other threads, so we need to lock
        lock (locker)
        {
            for (int i = 0; i < localB.Length; i++)
            {
                B[i] += localB[i];
            }
        }
    });
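I should mention that the workload in the above example is too granular, so I wouldn't expect parallelization to bring large performance improvements. Hopefully your actual workload is chunkier. If, for example, you have two nested loops, parallelizing only the outer loop will work well, because the inner loop provides the much-needed chunkiness.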
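To illustrate the nested-loop remark, here is a minimal sketch. The `RowSums` helper and the jagged `matrix` input are hypothetical and not part of the original problem. The point is only that each outer iteration carries a whole inner loop of work and writes to its own output slot, so no locking is needed.

```csharp
using System;
using System.Threading.Tasks;

public static class NestedLoopSketch
{
    // Hypothetical helper: sums each row of a jagged matrix.
    public static double[] RowSums(double[][] matrix)
    {
        var sums = new double[matrix.Length];
        // Only the outer loop is parallel; each iteration owns sums[row].
        Parallel.For(0, matrix.Length, row =>
        {
            double sum = 0;
            // The inner loop stays sequential and provides the chunkiness.
            for (int col = 0; col < matrix[row].Length; col++)
            {
                sum += matrix[row][col];
            }
            sums[row] = sum;
        });
        return sums;
    }

    public static void Main()
    {
        var matrix = new double[][]
        {
            new double[] { 1, 2, 3 },
            new double[] { 4, 5, 6 },
        };
        double[] sums = RowSums(matrix);
        Console.WriteLine(string.Join(", ", sums)); // prints "6, 15"
    }
}
```

Parallelizing the inner loop as well would most likely just add scheduling overhead, since the outer loop already keeps the cores busy when there are enough rows.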
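Alternative solution: Instead of creating an auxiliary array per thread, you could update the B array directly, and take the lock only when processing an index in the dangerous zone near the boundaries of the partitions: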
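// Requires: using System.Collections.Concurrent; using System.Threading;
// Assumption: only interior indices are processed, because i - 1 and i + 1
// would fall outside the array at i = 0 and i = A.Length - 1
Parallel.ForEach(Partitioner.Create(1, A.Length - 1), parallelOptions, range =>
{
    bool lockTaken = false;
    try
    {
        for (int i = range.Item1; i < range.Item2; i++)
        {
            // Two neighboring partitions both write to the last element of one
            // and the first element of the other, and those writes come from the
            // two outermost indices on each side, so the danger zone is two wide
            bool shouldLock = i < range.Item1 + 2 || i >= range.Item2 - 2;
            if (shouldLock) Monitor.Enter(locker, ref lockTaken);
            double v = A[i];
            B[i] -= v;
            B[i + 1] += v / 2;
            B[i - 1] += v / 2;
            if (shouldLock) { Monitor.Exit(locker); lockTaken = false; }
        }
    }
    finally
    {
        // Release the lock if an exception was thrown while holding it
        if (lockTaken) Monitor.Exit(locker);
    }
});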
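Since race conditions in code like this tend to fail silently, it may be worth comparing a parallel version against a plain sequential loop. The following is a minimal sketch of such a check for the thread-local approach. The class and method names (`StencilCheck`, `Sequential`, `WithThreadLocalArrays`, `MaxAbsDifference`) are hypothetical, and it assumes that only the interior indices 1 .. Length - 2 are processed so that i - 1 and i + 1 stay in range.

```csharp
using System;
using System.Threading.Tasks;

public static class StencilCheck
{
    // Sequential reference implementation of the same update.
    public static double[] Sequential(double[] a)
    {
        double[] b = (double[])a.Clone();
        for (int i = 1; i < a.Length - 1; i++)
        {
            double v = a[i];
            b[i] -= v;
            b[i + 1] += v / 2;
            b[i - 1] += v / 2;
        }
        return b;
    }

    // Parallel version: each worker accumulates into its own temp array,
    // and the temp arrays are merged into b under a lock at the end.
    public static double[] WithThreadLocalArrays(double[] a)
    {
        double[] b = (double[])a.Clone();
        object locker = new object();
        Parallel.For(1, a.Length - 1,
            localInit: () => new double[a.Length],
            body: (i, state, temp) =>
            {
                double v = a[i];
                temp[i] -= v;
                temp[i + 1] += v / 2;
                temp[i - 1] += v / 2;
                return temp;
            },
            localFinally: temp =>
            {
                lock (locker)
                {
                    for (int j = 0; j < temp.Length; j++) b[j] += temp[j];
                }
            });
        return b;
    }

    // Largest element-wise absolute difference between two arrays.
    public static double MaxAbsDifference(double[] x, double[] y)
    {
        double max = 0;
        for (int i = 0; i < x.Length; i++)
        {
            max = Math.Max(max, Math.Abs(x[i] - y[i]));
        }
        return max;
    }

    public static void Main()
    {
        // Arbitrary test data, chosen only for this sketch.
        var a = new double[1000];
        for (int i = 0; i < a.Length; i++) a[i] = i % 7;

        double diff = MaxAbsDifference(Sequential(a), WithThreadLocalArrays(a));
        Console.WriteLine($"Max difference: {diff}");
    }
}
```

The comparison should use a small tolerance rather than exact equality in general, because the parallel version adds the contributions in a different order than the sequential loop, and floating-point addition is not associative.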