java - Creating objects in multiple threads is slower than in a single thread

Question

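I have what is probably a basic question. When I create 100 million Hashtables on a single core, it takes about 6 seconds on my machine (runtime = 6 seconds per core). If I do this multi-threaded on 12 cores (my machine has 6 cores with hyperthreading), it takes about 10 seconds (runtime = 112 seconds per core).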

This is the code I use:

Main

public class Tests 
{
public static void main(String args[])
{
    double start = System.currentTimeMillis();
    int nThreads = 12;
    double[] runTime = new double[nThreads];

    TestsThread[] threads = new TestsThread[nThreads];
    int totalJob = 100000000;
    int jobsize = totalJob/nThreads;
    for(int i = 0; i < threads.length; i++)
    {
        threads[i] = new TestsThread(jobsize,runTime, i);
        threads[i].start();
    }
    waitThreads(threads);
    for(int i = 0; i < runTime.length; i++)
    {
        System.out.println("Runtime thread:" + i + " = " + (runTime[i]/1000000) + "ms");
    }
    double end = System.currentTimeMillis();
    System.out.println("Total runtime = " + (end-start) + " ms");
}

private static void waitThreads(TestsThread[] threads) 
{
    for(int i = 0; i < threads.length; i++)
    {
        while(threads[i].finished == false)//keep waiting until the thread is done
        {
            //System.out.println("waiting on thread:" + i);
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }   
}
}

Thread

import java.util.HashMap;
import java.util.Map;

public class TestsThread extends Thread
{
int jobSize = 0;
double[] runTime;
volatile boolean finished; // volatile so the polling main thread is guaranteed to see the update
int threadNumber;

TestsThread(int job, double[] runTime, int threadNumber)
{
    this.finished = false;
    this.jobSize = job;
    this.runTime = runTime;
    this.threadNumber = threadNumber;
}

public void run()
{
    double start = System.nanoTime();
    for(int l = 0; l < jobSize ; l++)
    {   
         double[] test = new double[65];
    }
    double end = System.nanoTime();
    double difference = end-start;
    runTime[threadNumber] += difference;
    this.finished = true;
}
}

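I don't understand why creating objects concurrently in multiple threads takes longer per thread than doing it serially in just one thread. If I remove the line that creates the Hashtable, the problem disappears. I would be very grateful if someone could help me with this.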

score 1 · Accepted Answer

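Update: there is a related bug report for this problem, and it has been fixed in Java 1.7u40. It was never an issue in Java 1.8, because Java 8 has an entirely different hash table algorithm.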

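Since you are not using the created objects, the operation gets optimized away. So you are only measuring the overhead of creating threads, and that overhead certainly grows with the number of threads you start.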

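I have to correct my answer regarding a detail I was not aware of: there is something special about the classes Hashtable and HashMap. Both call sun.misc.Hashing.randomHashSeed(this) in their constructor. In other words, their instances escape during construction, which has an impact on memory visibility. This means that their construction, unlike that of, say, an ArrayList, cannot be optimized away, and multi-threaded construction slows down because of what happens inside that method (i.e. synchronization).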

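As said, this is special to these classes and, of course, to this implementation (my setup: 1.7.0_13). For ordinary classes, the construction time for code like this goes straight to zero.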
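To make the "unused objects get optimized away" point concrete, here is a minimal sketch of a loop whose result depends on each allocated object. The names `ConsumedAllocation`, `allocateAndConsume` and `sink` are hypothetical, and the 65-element array is borrowed from the question's code. This is an assumption-laden illustration only: whether a given JVM still avoids the heap allocation depends on its JIT, and a dedicated harness such as JMH is the more reliable way to measure this.

```java
class ConsumedAllocation {
    // Allocates `count` arrays and reads a value back from each one,
    // so the returned sum depends on every allocated object.
    static long allocateAndConsume(int count) {
        long sink = 0;
        for (int i = 0; i < count; i++) {
            double[] test = new double[65];        // same size as in the question
            test[i % test.length] = i;             // write a loop-dependent value
            sink += test.length + (long) test[i % test.length];
        }
        return sink;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        long sink = allocateAndConsume(10_000_000);
        long elapsed = System.nanoTime() - t0;
        // Printing the sink keeps the result observable outside the loop.
        System.out.printf("sink=%d, elapsed %.3f s%n", sink, elapsed * 1e-9);
    }
}
```

Comparing the timing of this variant against a loop that discards the array may give a rough indication of whether the original loop was being measured at all.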

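Here I have added more sophisticated benchmark code. Observe the difference between DO_HASH_MAP = true and DO_HASH_MAP = false (when false, it creates an ArrayList instead, which has no such special behavior).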

import java.util.*;
import java.util.concurrent.*;

public class AllocBench {
  static final int NUM_THREADS = 1;
  static final int NUM_OBJECTS = 100000000 / NUM_THREADS;
  static final boolean DO_HASH_MAP = true;

  public static void main(String[] args) throws InterruptedException, ExecutionException {
    ExecutorService threadPool = Executors.newFixedThreadPool(NUM_THREADS);
    Callable<Long> task=new Callable<Long>() {
      public Long call() {
        return doAllocation(NUM_OBJECTS);
      }
    };

    long startTime=System.nanoTime(), cpuTime=0;
    for(Future<Long> f: threadPool.invokeAll(Collections.nCopies(NUM_THREADS, task))) {
      cpuTime+=f.get();
    }
    long time=System.nanoTime()-startTime;
    System.out.println("Number of threads: "+NUM_THREADS);
    System.out.printf("entire allocation required %.03f s%n", time*1e-9);
    System.out.printf("time x numThreads %.03f s%n", time*1e-9*NUM_THREADS);
    System.out.printf("real accumulated cpu time %.03f s%n", cpuTime*1e-9);

    threadPool.shutdown();
  }

  static long doAllocation(int numObjects) {
    long t0=System.nanoTime();
    for(int i=0; i<numObjects; i++)
      if(DO_HASH_MAP) new HashMap<Object, Object>(); else new ArrayList<Object>();
    return System.nanoTime()-t0;
  }
}

score 0 · Accepted Answer

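Since all you are doing is measuring time and churning memory, your bottleneck is most likely your L3 cache or the bus to main memory. In that case, coordinating the work between threads can add so much overhead that the result is worse rather than better.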

This is too long for a comment, but your inner loop could be just

long start = System.nanoTime(); // long, so the difference fits AtomicLong.addAndGet(long)
for(int l = 0; l < jobSize ; l++){
    Map<String,Integer> test = new HashMap<String,Integer>();
}
// runtime is an AtomicLong for thread safety
runtime.addAndGet(System.nanoTime() - start); // time in nano-seconds.

Reading the timer can be about as slow as creating a HashMap, so if you call the timer too often you might not be measuring what you think you are.
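For context, the inner loop above could sit in a complete program along the following lines. This is only a sketch under assumptions: the class name `JoinedWorkers` and the method `runWorkers` are hypothetical, and `Thread.join()` is swapped in for the question's polling loop because it blocks until each worker has finished and makes the worker's writes visible to the caller. The timer is read once before and once after the loop, not per iteration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class JoinedWorkers {
    // Starts nThreads workers that each create jobSize maps and
    // returns the per-thread run times summed up, in nanoseconds.
    static long runWorkers(int nThreads, final int jobSize) throws InterruptedException {
        final AtomicLong runtime = new AtomicLong(); // shared, thread-safe accumulator
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    long start = System.nanoTime();   // single timer read before the loop
                    for (int l = 0; l < jobSize; l++) {
                        Map<String, Integer> test = new HashMap<String, Integer>();
                    }
                    runtime.addAndGet(System.nanoTime() - start); // single read after it
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // wait for the worker instead of sleeping and polling a flag
        }
        return runtime.get();
    }

    public static void main(String[] args) throws InterruptedException {
        long totalNanos = runWorkers(4, 1_000_000);
        System.out.printf("accumulated thread time %.3f s%n", totalNanos * 1e-9);
    }
}
```

The thread count and job size in `main` are arbitrary illustration values, not the figures from the question.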

BTW, Hashtable is synchronized; you may find that using HashMap is faster and possibly more scalable.

score 0 · Accepted Answer

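What happens if you run it on 6 cores? Hyperthreading is not quite the same as having twice the cores, so you may also want to try the number of real cores.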

Also, the operating system will not necessarily schedule each of your threads onto its own core.
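One way to try this is to derive the thread count from what the JVM reports. The sketch below is an illustration under assumptions: `CoreCount` and `splitJob` are hypothetical names, and halving the reported count is only a guess that holds when every core exposes two hardware threads. `Runtime.availableProcessors()` returns logical processors, and the standard library has no call that reports physical cores.

```java
class CoreCount {
    // Splits totalJob across nThreads; the last thread takes the remainder,
    // so no work is lost to integer division.
    static int[] splitJob(int totalJob, int nThreads) {
        int[] sizes = new int[nThreads];
        int base = totalJob / nThreads;
        for (int i = 0; i < nThreads; i++) {
            sizes[i] = base;
        }
        sizes[nThreads - 1] += totalJob % nThreads;
        return sizes;
    }

    public static void main(String[] args) {
        // Logical processors visible to the JVM (hyperthreads count individually).
        int logical = Runtime.getRuntime().availableProcessors();
        // ASSUMPTION: two hardware threads per core; adjust for your machine.
        int physicalGuess = Math.max(1, logical / 2);
        int[] sizes = splitJob(100000000, physicalGuess);
        System.out.println("logical processors: " + logical);
        System.out.println("threads to try:     " + physicalGuess);
        System.out.println("jobs per thread:    " + sizes[0]
                + " (last thread: " + sizes[sizes.length - 1] + ")");
    }
}
```

Running the benchmark once with the logical count and once with the guessed physical count should show whether hyperthreading helps or hurts for this workload.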
