java - 如何证明java中的HashMap不是线程安全的

Question

我正在开发一个使用HashMap共享状态的应用程序。我需要通过单元测试证明它在多线程环境中会出现问题。

我试图通过检查两者中的大小和元素来检查单线程环境和多线程环境中应用程序的状态HashMap。但这似乎无济于事，状态始终相同。

有没有其他方法可以证明或证明在地图上执行操作的应用程序可以很好地处理并发请求？

score 28 · Accepted Answer

这很容易证明。

不久

哈希映射基于一个数组，其中每个项目代表一个桶。随着更多键的添加，存储桶会增长，并且在某个阈值时，会以更大的大小重新创建数组，以便其存储桶分布更均匀（性能考虑）。在数组重新创建期间，数组变为空，这导致调用者的结果为空，直到重新创建完成。

细节和证明

这意味着有时HashMap#put()会在内部调用HashMap#resize()以使底层数组更大。

HashMap#resize()为该字段分配一个容量更大table的新空数组，并用旧项目填充它。虽然发生了这种填充，但底层数组并不包含所有旧项目HashMap#get()，并且使用现有键调用可能会返回null.

下面的代码演示了这一点。您很可能会遇到异常，这意味着HashMap它不是线程安全的。我选择了目标键65 535- 这样它将成为数组中的最后一个元素，因此是重新填充期间的最后一个元素，这增加了继续的可能性null（HashMap#get()了解原因，请参阅HashMap#put()实现）。

final Map<Integer, String> map = new HashMap<>();

final Integer targetKey = 0b1111_1111_1111_1111; // 65 535
final String targetValue = "v";
map.put(targetKey, targetValue);

new Thread(() -> {
    IntStream.range(0, targetKey).forEach(key -> map.put(key, "someValue"));
}).start();


while (true) {
    if (!targetValue.equals(map.get(targetKey))) {
        throw new RuntimeException("HashMap is not thread safe.");
    }
}

一个线程向地图添加新键。另一个线程不断检查targetKey是否存在。

如果算上这些异常，我会绕过200 000.

score 8 · Accepted Answer

很难模拟 Race，但请查看 OpenJDK 源代码中put()的 HashMap 方法：

public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);

    //Operation 1       
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    } 

    //Operation 2
    modCount++;

    //Operation 3
    addEntry(hash, key, value, i);
    return null;
}

如您所见put()，涉及 3 个不同步的操作。复合操作是非线程安全的。所以理论上证明它HashMap不是线程安全的。

score 6 · Accepted Answer

我需要通过单元测试证明它在多线程环境中会出现问题。

这将是非常难以做到的。比赛条件很难证明。您当然可以编写一个程序，该程序在大量线程中对 HashMap 进行置入和获取，但是volatile应用程序的日志记录、字段、其他锁和其他计时细节可能会使您的特定代码无法执行非常困难。

这是一个愚蠢的小HashMap故障测试用例。它失败是因为当线程由于内存损坏而进入无限循环时超时HashMap。但是，根据内核数量和其他架构细节，它可能不会失败。

@Test(timeout = 10000)
public void runTest() throws Exception {
    final Map<Integer, String> map = new HashMap<Integer, String>();
    ExecutorService pool = Executors.newFixedThreadPool(10);
    for (int i = 0; i < 10; i++) {
        pool.submit(new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < 10000; i++) {
                    map.put(i, "wow");
                }
            }
        });
    }
    pool.shutdown();
    pool.awaitTermination(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
}

score 5 · Accepted Answer

阅读 API 文档就够了吗？里面有一个说法：

请注意，此实现不同步。如果多个线程同时访问一个哈希映射，并且至少有一个线程在结构上修改了映射，则必须在外部进行同步。（结构修改是添加或删除一个或多个映射的任何操作；仅更改与实例已包含的键关联的值不是结构修改。）这通常通过在自然封装映射的某个对象上同步来完成. 如果不存在这样的对象，则应使用 Collections.synchronizedMap 方法“包装”地图。这最好在创建时完成，以防止对地图的意外不同步访问：

线程安全的问题是很难通过测试来证明。大多数时候都可以。你最好的选择是只运行一堆正在获取/放置的线程，你可能会遇到一些并发错误。

我建议使用 aConcurrentHashMap并相信 Java 团队说不HashMap同步就足够了。

score 5 · Accepted Answer

还有其他方法可以证明吗？

如何阅读文档（并注意强调的“必须”）：

如果多个线程同时访问一个hash map，并且至少有一个线程在结构上修改了map，则必须对外同步

如果您要尝试编写一个演示不正确行为的单元测试，我建议您执行以下操作：

创建一堆具有相同哈希码的键（比如 30 或 40）
为每个键添加值到地图
为键生成一个单独的线程，它有一个无限循环，(1) 断言该键存在于映射中，(2) 删除该键的映射，以及 (3) 将映射添加回来。

如果幸运的话，断言会在某个时候失败，因为哈希桶后面的链表会损坏。如果你不走运，HashMap尽管有文档，它看起来确实是线程安全的。

score 5 · Accepted Answer

它是一个旧线程。但只是粘贴我的示例代码，它能够演示 hashmap 的问题。

看看下面的代码，我们尝试使用 10 个线程（每个线程 3000 个项目）将 30000 个项目插入到 hashmap 中。

因此，在所有线程完成后，理想情况下，您应该看到 hashmap 的大小应该是 30000。但实际输出要么是重建树时出现异常，要么最终计数小于 30000。

class TempValue {
    int value = 3;

    @Override
    public int hashCode() {
        return 1; // All objects of this class will have same hashcode.
    }
}

public class TestClass {
    public static void main(String args[]) {
        Map<TempValue, TempValue> myMap = new HashMap<>();
        List<Thread> listOfThreads = new ArrayList<>();

        // Create 10 Threads
        for (int i = 0; i < 10; i++) {
            Thread thread = new Thread(() -> {

                // Let Each thread insert 3000 Items
                for (int j = 0; j < 3000; j++) {
                    TempValue key = new TempValue();
                    myMap.put(key, key);
                }

            });
            thread.start();
            listOfThreads.add(thread);
        }

        for (Thread thread : listOfThreads) {
            thread.join();
        }
        System.out.println("Count should be 30000, actual is : " + myMap.size());
    }
}

输出 1：

Count should be 30000, actual is : 29486

输出 2：（例外）

java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNodejava.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
    at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1819)
    at java.util.HashMap$TreeNode.treeify(HashMap.java:1936)
    at java.util.HashMap.treeifyBin(HashMap.java:771)
    at java.util.HashMap.putVal(HashMap.java:643)
    at java.util.HashMap.put(HashMap.java:611)
    at TestClass.lambda$0(TestClass.java:340)
    at java.lang.Thread.run(Thread.java:745)

但是，如果您将该行修改Map<TempValue, TempValue> myMap = new HashMap<>();为ConcurrentHashMap，则输出始终为 30000。

另一个观察：在上面的例子中，所有TempValue类对象的哈希码都是相同的（** 即 1**）。所以你可能想知道，HashMap 的这个问题可能只在发生冲突的情况下发生（由于哈希码）。我尝试了另一个例子。

将 TempValue 类修改为

class TempValue {
    int value = 3;
}

现在重新执行相同的代码。
在每 5 次运行中，我看到 2-3 次运行仍然给出与 30000 不同的输出。
因此，即使您通常没有太多的碰撞，您可能仍然会遇到问题。（可能是由于HashMap的重建等原因）

总的来说，这些示例显示了 ConcurrentHashMap 处理的 HashMap 问题。

score 1 · Accepted Answer

这可能是可能的，但永远不会是一个完美的测试。竞争条件太不可预测了。话虽如此，我编写了一个类似类型的测试来帮助解决专有数据结构的线程问题，在我的例子中，证明某事是错误的（在修复之前）比证明什么都不会发生要容易得多错误（修复后）。您可能会构建一个多线程测试，该测试最终会因足够的时间和正确的参数而失败。

这篇文章可能有助于确定测试中需要关注的领域，并为可选替换提供一些其他建议。

score 1 · Accepted Answer

您可以创建多个线程，每个线程将一个元素添加到 hashmap 并对其进行迭代。即在运行方法中，我们必须使用“put”，然后使用迭代器进行迭代。

对于 HashMap 我们得到 ConcurrentModificationException 而对于 ConcurrentHashMap 我们没有得到。

score 0 · Accepted Answer

java.util.HashMap 实现中最可能的竞争条件

如果我们在调整大小或重新散列步骤执行时尝试读取值，则大多数 hashMaps 都会失败。如果超过存储桶阈值，则在某些条件下最常执行的调整大小和重新散列操作。这段代码证明，如果我在外部调用调整大小，或者如果我放置的元素多于阈值并倾向于在内部调用调整大小操作，则会导致一些 null 读取，这表明 HashMap 不是线程安全的。应该有更多的竞争条件，但足以证明它不是线程安全的。

比赛条件的实际证明

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.IntStream;

public class HashMapThreadSafetyTest {
    public static void main(String[] args) {
        try {
            (new HashMapThreadSafetyTest()).testIt();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private void threadOperation(int number, Map<Integer, String> map) {
        map.put(number, "hashMapTest");
        while (map.get(number) != null);
        //If code passes to this line that means we did some null read operation which should not be
        System.out.println("Null Value Number: " + number);
    }
    private void callHashMapResizeExternally(Map<Integer, String> map)
            throws NoSuchMethodException, InvocationTargetException, IllegalAccessException {
        Method method = map.getClass().getDeclaredMethod("resize");
        method.setAccessible(true);
        System.out.println("calling resize");
        method.invoke(map);
    }

    private void testIt()
            throws InterruptedException, NoSuchMethodException, IllegalAccessException, InvocationTargetException {
        final Map<Integer, String> map = new HashMap<>();
        IntStream.range(0, 12).forEach(i -> new Thread(() -> threadOperation(i, map)).start());
        Thread.sleep(60000);
        // First loop should not show any null value number untill calling resize method of hashmap externally.
        callHashMapResizeExternally(map);
        // First loop should fail from now on and should print some Null Value Numbers to the out.
        System.out.println("Loop count is 12 since hashmap initially created for 2^4 bucket and threshold of resizing"
                + "0.75*2^4 = 12 In first loop it should not fail since we do not resizing hashmap. "
                + "\n\nAfter 60 second: after calling external resizing operation with reflection should forcefully fail"
                + "thread safety");

        Thread.sleep(2000);
        final Map<Integer, String> map2 = new HashMap<>();
        IntStream.range(100, 113).forEach(i -> new Thread(() -> threadOperation(i, map2)).start());
        // Second loop should fail from now on and should print some Null Value Numbers to the out. Because it is
        // iterating more than 12 that causes hash map resizing and rehashing
        System.out.println("It should fail directly since it is exceeding hashmap initial threshold and it will resize"
                + "when loop iterate 13rd time");
    }
}

示例输出

No null value should be printed untill thread sleep line passed
calling resize
Loop count is 12 since hashmap initially created for 2^4 bucket and threshold of resizing0.75*2^4 = 12 In first loop it should not fail since we do not resizing hashmap. 

After 60 second: after calling external resizing operation with reflection should forcefully failthread safety
Null Value Number: 11
Null Value Number: 5
Null Value Number: 6
Null Value Number: 8
Null Value Number: 0
Null Value Number: 7
Null Value Number: 2
It should fail directly since it is exceeding hashmap initial threshold and it will resizewhen loop iterate 13th time
Null Value Number: 111
Null Value Number: 100
Null Value Number: 107
Null Value Number: 110
Null Value Number: 104
Null Value Number: 106
Null Value Number: 109
Null Value Number: 105

score 0 · Accepted Answer

证明这一点的非常简单的解决方案

这是代码，它证明了 Hashmap 实现不是线程安全的。在这个例子中，我们只是将元素添加到 map，而不是从任何方法中删除它。

我们可以看到它打印了不在 map 中的键，即使我们在执行 get 操作之前已经将相同的键放在了 map 中。

package threads;

import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HashMapWorkingDemoInConcurrentEnvironment {

    private Map<Long, String> cache = new HashMap<>();

    public String put(Long key, String value) {
        return cache.put(key, value);
    }

    public String get(Long key) {
        return cache.get(key);
    }

    public static void main(String[] args) {

        HashMapWorkingDemoInConcurrentEnvironment cache = new HashMapWorkingDemoInConcurrentEnvironment();

        class Producer implements Callable<String> {

            private Random rand = new Random();

            public String call() throws Exception {
                while (true) {
                    long key = rand.nextInt(1000);
                    cache.put(key, Long.toString(key));
                    if (cache.get(key) == null) {
                        System.out.println("Key " + key + " has not been put in the map");
                    }
                }
            }
        }

        ExecutorService executorService = Executors.newFixedThreadPool(4);

        System.out.println("Adding value...");

        try  {
            for (int i = 0; i < 4; i++) {
                executorService.submit(new Producer());
            }
        } finally {
            executorService.shutdown();
        }
    }
}

执行运行的示例输出

Adding value...
Key 611 has not been put in the map
Key 978 has not been put in the map
Key 35 has not been put in the map
Key 202 has not been put in the map
Key 714 has not been put in the map
Key 328 has not been put in the map
Key 606 has not been put in the map
Key 149 has not been put in the map
Key 763 has not been put in the map

看到打印的值很奇怪，这就是为什么 hashmap 不是在并发环境中工作的线程安全实现。

score 0 · Accepted Answer

OpenJDK 团队开源了一个很棒的工具，叫做 JCStress，它在 JDK 中用于并发测试。

https://github.com/openjdk/jcstress

在其中一个示例中：https ://github.com/openjdk/jcstress/blob/master/tests-custom/src/main/java/org/openjdk/jcstress/tests/collections/HashMapFailureTest.java

@JCStressTest
@Outcome(id = "0, 0, 1, 2", expect = Expect.ACCEPTABLE, desc = "No exceptions, entire map is okay.")
@Outcome(expect = Expect.ACCEPTABLE_INTERESTING, desc = "Something went wrong")
@State
public class HashMapFailureTest {

    private final Map<Integer, Integer> map = new HashMap<>();

    @Actor
    public void actor1(IIII_Result r) {
        try {
            map.put(1, 1);
            r.r1 = 0;
        } catch (Exception e) {
            r.r1 = 1;
        }
    }

    @Actor
    public void actor2(IIII_Result r) {
        try {
            map.put(2, 2);
            r.r2 = 0;
        } catch (Exception e) {
            r.r2 = 1;
        }
    }

    @Arbiter
    public void arbiter(IIII_Result r) {
        Integer v1 = map.get(1);
        Integer v2 = map.get(2);
        r.r3 = (v1 != null) ? v1 : -1;
        r.r4 = (v2 != null) ? v2 : -1;
    }

}

标有actor的方法在不同的线程上同时运行。

在我的机器上的结果是：

Results across all configurations:

       RESULT     SAMPLES     FREQ       EXPECT  DESCRIPTION
  0, 0, -1, 2   3,854,896    5.25%  Interesting  Something went wrong
  0, 0, 1, -1   4,251,564    5.79%  Interesting  Something went wrong
  0, 0, 1, 2  65,363,492   88.97%   Acceptable  No exceptions, entire map is okay.

这表明在 88% 的时间内观察到了预期值，但在大约 12% 的时间里，看到了不正确的结果。

您可以试用此工具和示例并编写自己的测试来验证某些代码的并发性是否被破坏。

java - 如何证明java中的HashMap不是线程安全的

11 回答 11

不久

细节和证明

java.util.HashMap 实现中最可能的竞争条件

比赛条件的实际证明

示例输出

Related

Reference