java - Unexpected results in multithreaded programming

Question

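I am having trouble with a multithreaded Java program. The program splits an array of integers into slices, sums each slice in its own thread, and then adds up the slice sums. The problem is that the computing time does not decrease as I increase the number of threads (I know there is a thread count beyond which the computation becomes slower than with fewer threads). I expected to see the execution time decrease up to that limit (the benefit of parallel execution). I use the variable fake in the run method to make the time "readable".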

public class MainClass {

private final int MAX_THREAD = 8;
private final int ARRAY_SIZE = 1000000;

private  int[] array;
private SimpleThread[] threads;
private int numThread = 1;
private int[] sum;
private int start = 0;
private int totalSum = 0;
long begin, end;
int fake;


MainClass() {
    fillArray();

    for(int i = 0; i < MAX_THREAD; i++) {
        threads = new SimpleThread[numThread];
        sum = new int[numThread];

        begin = (long) System.currentTimeMillis();

        for(int j = 0 ; j < numThread; j++) {
            threads[j] = new SimpleThread(start, ARRAY_SIZE/numThread, j);
            threads[j].start();
            start+= ARRAY_SIZE/numThread;
        }



        for(int k = 0; k < numThread; k++) {
            try {
                threads[k].join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }


        end = (long) System.currentTimeMillis();


        for(int g = 0; g < numThread; g++) {
            totalSum+=sum[g];
        }


        System.out.printf("Result with %d thread-- Sum = %d Time = %d\n", numThread, totalSum, end-begin);
        numThread++;
        start = 0;
        totalSum = 0;
    }

}


public static void main(String args[]) {
    new MainClass();
}


private void fillArray() {
    array = new int[ARRAY_SIZE];
    for(int i = 0; i < ARRAY_SIZE; i++) 
        array[i] = 1;
}


private class SimpleThread extends Thread{
    int start;
    int size;
    int index;

    public SimpleThread(int start, int size, int sumIndex) {
        this.start = start;
        this.size = size;
        this.index = sumIndex;
    }

    public void run() {
        for(int i = start; i < start+size; i++) 
            sum[index]+=array[i];

        for(long i = 0; i < 1000000000; i++) {
            fake++;
        }
    }
}

}

(Screenshot of the unexpected results)

score 0 · Accepted Answer

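Starting a thread is expensive, and you only see a benefit from it on large workloads that do not compete for the same resources (neither of which is the case here).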

score 0 · Accepted Answer

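Why is the sum sometimes wrong?

Because ARRAY_SIZE/numThread is an integer division, any fractional part is truncated (for example, 1000000/3 is mathematically 333333.33..., but Java yields 333333). The start variable therefore never reaches the leftover elements, so the sum can come out less than 1000000, depending on the value of the divisor.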
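The truncation can be made visible, and avoided, by handing the remainder of the division to the first few slices. This is only a minimal sketch and not the asker's code: the class name `SliceBounds` and the helper `sliceStart` are made up for illustration.

```java
// Hypothetical helper (not part of the question's code): computes slice
// boundaries so that every element of the array is covered exactly once.
class SliceBounds {

    /** Start index of slice i when `size` elements are split into `parts` slices. */
    static int sliceStart(int size, int parts, int i) {
        int base = size / parts;       // truncated quotient, e.g. 1000000 / 3 = 333333
        int remainder = size % parts;  // elements that plain integer division drops
        // The first `remainder` slices each take one extra element.
        return i * base + Math.min(i, remainder);
    }

    public static void main(String[] args) {
        int size = 1_000_000;
        for (int parts = 1; parts <= 8; parts++) {
            // Number of elements the question's loop visits: parts * (size / parts).
            int naive = (size / parts) * parts;
            int covered = 0;
            for (int i = 0; i < parts; i++) {
                covered += sliceStart(size, parts, i + 1) - sliceStart(size, parts, i);
            }
            System.out.printf("%d threads: naive covers %d, adjusted covers %d%n",
                    parts, naive, covered);
        }
    }
}
```

Slice i would then run from `sliceStart(size, parts, i)` (inclusive) to `sliceStart(size, parts, i + 1)` (exclusive), instead of advancing `start` by `ARRAY_SIZE/numThread` each time.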

Why does the time taken increase as the number of threads increases?

Because in the run method of every thread, you do this:

for(long i = 0; i < 1000000000; i++) {
    fake++;
}

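I don't understand what this sentence in your question means:

I use the variable fake in the run method to make the time "readable".

But every thread has to increment the fake variable 1000000000 times, so the total amount of work grows in proportion to the number of threads.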

score 0 · Accepted Answer

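As a general rule, you won't get a speedup from multithreading if the "work" performed by each thread is less than the overhead of using threads.

One of those overheads is the cost of starting a new thread. This is surprisingly high. Each time you start a thread, the JVM has to make system calls to allocate the thread stack memory segment and the "red zone" memory segment, and to initialize them. (The default thread stack size is typically 500KB or 1MB.) Then there are further system calls to create the native thread and schedule it.

In this example, you have 1,000,000 elements to sum, and you divide that work among N threads. As N increases, the amount of work performed by each thread decreases.

It is not hard to see that the time taken to sum 1,000,000 elements is going to be less than the time needed to start 4 threads, based on counting the memory read and write operations alone. You then need to take into account that the child threads are created one at a time by the parent thread.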
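The comparison in the previous paragraph can be tried out with a rough timing sketch like the one below. It is an illustration only, not a rigorous benchmark: it runs each measurement once, without JIT warm-up, and the numbers will differ from machine to machine. The class and method names are invented for this example.

```java
import java.util.Arrays;

// Rough illustration only (hypothetical names): compares a single-threaded sum
// of 1,000,000 ints with the cost of starting and joining threads that do nothing.
class ThreadStartCost {

    /** Sums the array on the calling thread. */
    static long sum(int[] data) {
        long total = 0;
        for (int value : data) {
            total += value;
        }
        return total;
    }

    /** Starts `count` empty threads, waits for them, and returns the elapsed nanoseconds. */
    static long timeEmptyThreads(int count) throws InterruptedException {
        Thread[] threads = new Thread[count];
        long begin = System.nanoTime();
        for (int i = 0; i < count; i++) {
            threads[i] = new Thread(() -> { });  // no work at all, only start-up cost
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[1_000_000];
        Arrays.fill(data, 1);

        long begin = System.nanoTime();
        long total = sum(data);
        long sumNanos = System.nanoTime() - begin;

        long threadNanos = timeEmptyThreads(4);
        System.out.printf("sum = %d in %d ns; starting and joining 4 empty threads took %d ns%n",
                total, sumNanos, threadNanos);
    }
}
```

`System.nanoTime()` is used here instead of `System.currentTimeMillis()` because both intervals may be well below one millisecond.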

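If you do the analysis completely, it is clear that there is a point at which adding more threads actually slows the computation down, even if you have enough cores to run all of the threads in parallel. And your benchmark seems to suggest¹ that this point is at around 2 threads.

By the way, there is a second reason why you may not get as much speedup as you expect from a benchmark like this one. The "work" that each thread does is basically scanning a large array. Reading and writing arrays generates requests to the memory system. Ideally, these requests are satisfied by the (fast) on-chip memory caches. However, if you try to read or write an array that is larger than the memory cache, many or most of those requests turn into (slow) main memory requests. Worse still, if you have N cores all doing this, you can find that the number of main memory requests is too much for the memory system to keep up with, and the threads slow down.

The bottom line is that multithreading does not automatically make an application faster, and it certainly won't if you do it the wrong way.

In your example:

- the amount of work per thread is too small compared with the overhead of creating and starting a thread, and
- memory bandwidth effects are likely to become a problem if you can "factor out" the thread creation overhead.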
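One way to take thread creation out of the measurement is to reuse already running threads from a pool. The following is a minimal sketch under that assumption, using a fixed pool from `java.util.concurrent`; it is not what the asker wrote, and the class name `PooledSum` and its `sum` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: sums slices of an array on pooled (reused) threads.
class PooledSum {

    /** Sums `data` using `parts` tasks submitted to an existing pool. */
    static long sum(ExecutorService pool, int[] data, int parts)
            throws InterruptedException, ExecutionException {
        List<Future<Long>> futures = new ArrayList<>();
        for (int p = 0; p < parts; p++) {
            // long arithmetic avoids int overflow in the multiplication
            final int from = (int) ((long) data.length * p / parts);
            final int to = (int) ((long) data.length * (p + 1) / parts);
            futures.add(pool.submit(() -> {
                long partial = 0;              // local accumulator, no shared state
                for (int i = from; i < to; i++) {
                    partial += data[i];
                }
                return partial;
            }));
        }
        long total = 0;
        for (Future<Long> f : futures) {
            total += f.get();                  // blocks until that slice is done
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[1_000_000];
        Arrays.fill(data, 1);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            sum(pool, data, 4);                // first call also pays for starting the workers
            long begin = System.nanoTime();
            long total = sum(pool, data, 4);   // worker threads already exist here
            long elapsed = System.nanoTime() - begin;
            System.out.printf("sum = %d, time = %d ns%n", total, elapsed);
        } finally {
            pool.shutdown();
        }
    }
}
```

Even with the start-up cost removed this way, the memory bandwidth effect described above can still limit the speedup.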

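¹ I don't understand the point of the "fake" computation. It probably invalidates the benchmark, though it is possible that the JIT compiler optimizes it away.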

score 0 · Accepted Answer

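As a side note, for what you are trying to do there is the Fork/Join framework. It lets you split a task recursively with little effort and implement an algorithm that distributes the workload automatically.

There is a guide for it; its example is very similar to your case and boils down to a RecursiveTask like this: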

import java.util.Arrays;
import java.util.Collection;
import java.util.concurrent.RecursiveTask;

class Adder extends RecursiveTask<Integer>
{
    private int[] toAdd;
    private int from;
    private int to;

    /** Add the numbers in the given array */
    public Adder(int[] toAdd)
    {
        this(toAdd, 0, toAdd.length);
    }

    /** Add the numbers in the given array between the given indices;
        internal constructor to split work */
    private Adder(int[] toAdd, int fromIndex, int upToIndex)
    {
        this.toAdd = toAdd;
        this.from = fromIndex;
        this.to = upToIndex;
    }

    /** This is the work method */
    @Override
    protected Integer compute()
    {
        int amount = to - from;
        int result = 0;
        if (amount < 500)
        {
            // base case: add ints and return the result
            for (int i = from; i < to; i++)
            {
                result += toAdd[i];
            }
        }
        else
        {
            // array too large: split it into two parts and distribute the actual adding
            int newEndIndex = from + (amount / 2);
            Collection<Adder> invokeAll = invokeAll(Arrays.asList(
                    new Adder(toAdd, from, newEndIndex),
                    new Adder(toAdd, newEndIndex, to)));
            for (Adder a : invokeAll)
            {
                // invokeAll has already completed both subtasks; join() only fetches the result
                result += a.join();
            }
        }
        return result;
    }
}

To actually run it, you can use

// fillArray(ARRAY_LENGTH) is assumed to return the int[] to be summed
RecursiveTask<Integer> adder = new Adder(fillArray(ARRAY_LENGTH));
int result = ForkJoinPool.commonPool().invoke(adder);
