
For a HashMap<Integer, Integer>, after inserting 10,000,000 unique random values, I perform get() calls using the hashmap's keySet(), as shown in the following snippet:

HashMap<Integer, Integer> hashmap = 
                        new HashMap<Integer, Integer>(10000000, 0.99f);

// ... Code to put unique 10000000 associations into the hashmap ...

int iteration = 100;
long startTime, totalTime = 0;

while(iteration > 0) {
    for(Integer key: hashmap.keySet()) {
       startTime = System.currentTimeMillis();
       hashmap.get(key);
       totalTime += (System.currentTimeMillis() - startTime);
    }
    iteration--;
}
System.out.println(totalTime/100 + " ms");

Running the code above, I get: 225 ms

Now, if I change the code above to use a Set instead, as in the following snippet:

Set<Integer> set = new HashSet<Integer>(hashmap.keySet());
iteration = 100;   // reset counters before the second run
totalTime = 0;
while(iteration > 0) {
    for(Integer key: set) {
       startTime = System.currentTimeMillis();
       hashmap.get(key);
       totalTime += (System.currentTimeMillis() - startTime);
    }
    iteration--;
}
System.out.println(totalTime/100 + " ms");

After running this code, I get: 414 ms

Why is there such a difference in performance?

PS: I used the following JVM arguments:

-Xms2048m -Xmx4096m -XX:MaxPermSize=256m

4 Answers


When you read a large data structure (larger than 32 KB), how you read that data structure impacts performance.

These are the typical sizes and access speeds of your caches:

L1:   32 KB, 4 clock cycles.
L2:  256 KB, 11 clock cycles.
L3: 3-30 MB, 40-75 clock cycles.
Main memory: up to 2TB, 200-500 clock cycles.

This means cache locality is very important: reading something that is already in L1 can be around 10-20x faster than reading it from L3.

In your case you are using a hash-based data structure. It is designed for random access and random arrangement, which unfortunately means it has very poor cacheability: access memory randomly and the data is likely to sit in the slower levels of the memory hierarchy.

However, there is an exception to this. If you access the same data multiple times (e.g. get a key you just obtained from an iterator), or you scan through a collection in order (which is roughly what the iterator does for a HashMap, much less so for a TreeMap), it is much more likely that the next piece of data you access is on the same cache line (each cache line is 64 bytes long) or the next one. These kinds of accesses perform much better, because CPUs are designed to handle sequential memory access very quickly (e.g. via prefetching).

BTW: your working set here is just the set of keys. If your values were distinct objects and you actually looked at them, I would expect this to be much slower, as that increases the size of your working set and the amount of memory required to cache it.
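
A crude way to see this effect in isolation (a sketch of mine, not code from the question or this answer) is to read the same large array once sequentially and once in a shuffled order:

// Illustrative sketch: the same data, read sequentially and then in random order.
import java.util.Random;

public class LocalityDemo {
    public static void main(String[] args) {
        int n = 16 * 1024 * 1024;           // 64 MB of ints - larger than most L3 caches
        int[] data = new int[n];
        int[] order = new int[n];
        Random rnd = new Random(1);
        for (int i = 0; i < n; i++) order[i] = i;
        for (int i = n - 1; i > 0; i--) {   // Fisher-Yates shuffle of the visit order
            int j = rnd.nextInt(i + 1);
            int t = order[i]; order[i] = order[j]; order[j] = t;
        }

        long sum = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) sum += data[i];          // sequential: prefetch-friendly
        long seqNs = System.nanoTime() - t0;

        t0 = System.nanoTime();
        for (int i = 0; i < n; i++) sum += data[order[i]];   // random: mostly cache misses
        long rndNs = System.nanoTime() - t0;

        System.out.println("sequential: " + seqNs / 1_000_000 + " ms, random: "
                + rndNs / 1_000_000 + " ms (sum=" + sum + ")");
    }
}

On typical hardware the shuffled pass is several times slower even though it touches exactly the same bytes, which is the same effect that hurts random lookups into a 10,000,000-entry HashMap.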

Answered 2013-12-17T11:11:48.030

This

   startTime = System.currentTimeMillis();
   hashmap.get(key);
   totalTime += (System.currentTimeMillis() - startTime);

is a hopeless attempt at micro-benchmarking. It uses currentTimeMillis(), which has a resolution of 1 ms and a real-world accuracy of 10 ms or worse, to measure an operation that takes nanoseconds. Even nanoTime() on its own would not help you, because its accuracy is typically on the order of microseconds.

On top of that, the code does not do any warm-up.

If you want to measure something as elusive as the performance of a single map#get call, you are better off using a proper micro-benchmarking tool.
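
For example, JMH (the Java Microbenchmark Harness) handles warm-up, timer resolution and dead-code elimination for you. A minimal sketch of such a benchmark (the class name, sizes and setup below are my assumptions, not code from the question):

import java.util.HashMap;
import java.util.Random;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class MapGetBenchmark {

    HashMap<Integer, Integer> map;
    Integer[] keys;
    int next;

    @Setup
    public void setup() {
        map = new HashMap<Integer, Integer>(10_000_000, 0.99f);
        Random rnd = new Random(42);
        while (map.size() < 10_000_000) {
            int k = rnd.nextInt();
            map.put(k, k);                  // unique random keys, as in the question
        }
        keys = map.keySet().toArray(new Integer[0]);
    }

    @Benchmark
    public Integer get() {
        // One get() per invocation; returning the value prevents dead-code elimination.
        if (next == keys.length) next = 0;
        return map.get(keys[next++]);
    }
}

Run under the JMH runner, this reports the average time of a single get() in nanoseconds, after the warm-up iterations JMH performs by default.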

Answered 2013-12-17T10:55:16.290

Millisecond precision is not adequate for measuring a single get(). Read the time once at the start of the loop and once at the end - don't try to accumulate it piece by piece inside, because doing so lets a pile of potential measurement error swamp any real result.

Make sure you first run the loop 50 times without timing it (to warm up the JVM and ensure everything has been compiled, etc.), and then run it again, timing the whole loop from start to finish:

Set<Integer> set = new HashSet<Integer>(hashmap.keySet());
iteration = 100;   // reset the counter - it is 0 after the earlier runs
startTime = System.currentTimeMillis();
while(iteration > 0) {
    for(Integer key: set) {
       hashmap.get(key);
    }
    iteration--;
}
totalTime = (System.currentTimeMillis() - startTime);
System.out.println(totalTime + " ms");
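
The untimed warm-up pass described above is not shown in this snippet; a minimal sketch of it (to be placed immediately before the timed loop, reusing the same variables) could look like this:

// Warm-up pass, deliberately not timed: run the same work ~50 times so the
// JIT has compiled the hot code paths before the measurement starts.
int warmup = 50;
while (warmup > 0) {
    for (Integer key : set) {
        hashmap.get(key);
    }
    warmup--;
}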

Also, how does your code not hit a divide-by-zero error when you divide by iteration?

Answered 2013-12-17T10:53:37.470

The logic used here to compare the performance of these two classes is not correct.

Measure the time taken for one complete iteration over the key set (preferably with nanosecond precision) rather than the time of each individual call to the get method. The results should be consistent across runs; only then do they prove anything.

Also, performance depends heavily on the JVM and GC configuration.
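
A sketch of the kind of measurement suggested here (variable names follow the question; the rest is my assumption): time one complete pass over the key set with System.nanoTime() and repeat it until the numbers are consistent.

// One complete pass over the key set, timed with nanosecond precision.
long start = System.nanoTime();
long sink = 0;
for (Integer key : hashmap.keySet()) {
    sink += hashmap.get(key);    // accumulate so the JIT cannot discard the calls
}
long elapsedMs = (System.nanoTime() - start) / 1_000_000;
System.out.println("full pass: " + elapsedMs + " ms (sink=" + sink + ")");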

Answered 2013-12-17T10:58:16.610