java - Why would you use HashMap in a multithreaded environment?

Question

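Today I was reading about how HashMap works in Java. I came across a blog post, which I quote directly below. I have already read the related post on Stack Overflow, but I would still like to understand the details.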

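So the answer is yes: there is a potential race condition when resizing a HashMap in Java, if two threads simultaneously find that the HashMap needs resizing and both try to resize it. During resizing, the elements stored in a bucket's linked list are reversed in order as they migrate to the new bucket, because Java's HashMap does not append new elements at the tail; it inserts them at the head to avoid traversing to the tail. If the race condition occurs, you end up in an infinite loop.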
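
To make the quoted passage concrete, here is a minimal single-threaded sketch of head insertion during a bucket transfer. It is not the JDK source: the `Node` class and `transfer` method are hypothetical names, and it assumes every node lands in the same new bucket. The behavior described matches the HashMap implementation before Java 8; since Java 8 the resize keeps the relative order of nodes.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for one bucket of a pre-Java 8 HashMap (illustrative only).
class HeadInsertTransfer {
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // Moves each node of the old chain to the new chain by inserting at the head.
    // Head insertion is what reverses the order of the chain.
    static Node transfer(Node oldHead) {
        Node newHead = null;
        Node e = oldHead;
        while (e != null) {
            Node next = e.next; // a second resizing thread could rewire e.next here
            e.next = newHead;   // link the node in front of the new chain
            newHead = e;
            e = next;
        }
        return newHead;
    }

    // Collects the keys of a chain in traversal order.
    static List<String> keys(Node head) {
        List<String> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) {
            out.add(n.key);
        }
        return out;
    }

    public static void main(String[] args) {
        Node chain = new Node("A", new Node("B", new Node("C", null)));
        System.out.println(keys(chain));           // [A, B, C]
        System.out.println(keys(transfer(chain))); // [C, B, A]
    }
}
```

If two threads run a loop like this over the same chain without synchronization, one thread can follow `next` pointers that the other has already reversed, which is how a cycle in the list (and the infinite loop) can arise.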

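It states that because HashMap is not thread-safe, a race condition can occur while it is being resized. I have even seen in our office projects that people use HashMaps extensively, knowing full well that they are not thread-safe. If it is not thread-safe, why should we use HashMap at all? Is it just a lack of knowledge among developers, who may not be aware of structures such as ConcurrentHashMap, or is there some other reason? Can anyone explain this puzzle?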

score 8 · Accepted Answer

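I can confidently say that ConcurrentHashMap is a rather overlooked class. Not many people know about it, and not many choose to use it. The class offers a very robust and fast way of synchronizing a Map collection. I have read a few comparisons of HashMap and ConcurrentHashMap on the web. Let me just say that they are totally wrong. You cannot compare the two: one offers synchronized methods for accessing a map, while the other offers no synchronization whatsoever.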

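What most of us fail to notice is that while our applications, especially web applications, work fine during development and testing, they usually go belly up under heavy (or even moderately heavy) load. This is because we expect our HashMaps to behave a certain way, but under load they often misbehave. Hashtable provides thread-safe access to its entries, with one caveat: the entire map is locked to perform any kind of operation.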

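While this overhead is negligible in a web application under normal load, under heavy load it can lead to delayed response times and an overtaxed server. This is where ConcurrentHashMap steps in. It offers all the features of Hashtable with performance almost as good as HashMap, and it achieves this through a very simple mechanism.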

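Instead of a map-wide lock, the collection maintains a list of 16 locks by default, each of which guards (or locks on) a portion of the map's buckets. This effectively means that 16 threads can modify the collection at once, as long as they are all working on different portions. Ordinary operations such as get and put never lock the entire map. (This describes the segment-based implementation used before Java 8; since Java 8 the class locks individual bins instead.)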
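
As a rough illustration of what this buys you in practice, the sketch below lets several threads update shared counters in a ConcurrentHashMap without any external locking. The class name, the `page-N` keys, and the thread counts are made up for the example; `merge` is the standard atomic read-modify-write method on ConcurrentHashMap.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ConcurrentCounterDemo {
    // Starts `threads` workers; each increments one of four shared counters
    // `perThread` times. No external synchronization is used.
    static ConcurrentHashMap<String, Integer> count(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final String key = "page-" + (t % 4); // several threads share each key
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    hits.merge(key, 1, Integer::sum); // atomic increment
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return hits;
    }

    public static void main(String[] args) {
        // 16 threads, 4 keys: each key is incremented by 4 threads.
        System.out.println(count(16, 10_000).get("page-0")); // 40000
    }
}
```

Replacing the ConcurrentHashMap with a plain HashMap in this sketch would typically lose updates, because `merge` on HashMap is not atomic across threads.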

score 2 · Accepted Answer

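There are several aspects to this. First of all, most collections are not thread-safe. If you want a thread-safe collection, you can call Collections.synchronizedCollection or Collections.synchronizedMap.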
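
A minimal sketch of the wrapper approach, assuming a map of String keys to Integer values (the class and method names are invented for the example). Individual calls such as put and get are synchronized by the wrapper, but iteration still has to be guarded manually by synchronizing on the returned map, as the Collections.synchronizedMap documentation requires.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SynchronizedMapDemo {
    // Copies the keys out of a synchronized map while holding its lock.
    static List<String> snapshotKeys(Map<String, Integer> syncMap) {
        List<String> keys = new ArrayList<>();
        synchronized (syncMap) { // iteration is NOT protected by the wrapper itself
            for (String k : syncMap.keySet()) {
                keys.add(k);
            }
        }
        Collections.sort(keys); // HashMap order is unspecified; sort for stable output
        return keys;
    }

    static List<String> demo() {
        Map<String, Integer> m = Collections.synchronizedMap(new HashMap<>());
        m.put("b", 2); // each single call is synchronized by the wrapper
        m.put("a", 1);
        return snapshotKeys(m);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [a, b]
    }
}
```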

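But the main point is this: you want your threads to run in parallel with no synchronization at all, if that is possible of course. This is something you should strive for, though it cannot be achieved every time you deal with multithreading. There is no point in making the default collection or map thread-safe, because a shared map should be an edge case. Synchronization means more work for the JVM.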

score 1 · Accepted Answer

I have done a little more research, and I would say all the answers are good. We can clearly see why the race condition occurs in HashMap. After a bit of research on Stack Overflow, I found these references, and they are well worth studying to explore the concept further.

I think they have clarified the concept for me.

score 0 · Accepted Answer

In a multithreaded environment, you have to ensure that the map is not modified concurrently, or you can run into a critical memory problem, because it is not synchronized in any way.

Just check the API; I used to think the same way.

I thought that the solution was to use the static Collections.synchronizedMap method. I was expecting it to return a better implementation. But if you look at the source code, you will see that it is just a wrapper that synchronizes every call on a mutex, which happens to be the map itself, so reads cannot occur concurrently.

In the Jakarta commons project, there is an implementation that is called FastHashMap. This implementation has a property called fast. If fast is true, then the reads are non-synchronized, and the writes will perform the following steps:

1. Clone the current structure
2. Perform the modification on the clone
3. Replace the existing structure with the modified clone
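
The three steps above can be sketched as a small copy-on-write map. This is not the FastHashMap source; `CopyOnWriteMapSketch` is a hypothetical class that only illustrates the clone, modify, replace idea, using a volatile field so that readers always see a fully built map.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative copy-on-write map: lock-free reads, copying writes.
class CopyOnWriteMapSketch<K, V> {
    // Readers always see a map that is never modified after publication.
    private volatile Map<K, V> current = new HashMap<>();

    public V get(Object key) {
        return current.get(key); // no lock needed
    }

    public int size() {
        return current.size();
    }

    // Writers are serialized so that no update is lost between clone and replace.
    public synchronized V put(K key, V value) {
        Map<K, V> copy = new HashMap<>(current); // 1. clone the current structure
        V previous = copy.put(key, value);       // 2. modify the clone
        current = copy;                          // 3. replace the existing structure
        return previous;
    }

    public static void main(String[] args) {
        CopyOnWriteMapSketch<String, Integer> map = new CopyOnWriteMapSketch<>();
        map.put("a", 1);
        map.put("a", 2);
        System.out.println(map.get("a")); // 2
    }
}
```

Every write copies the whole map, so the approach only pays off when writes are rare compared with reads.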
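
A variant that guards a wrapped map with a ReentrantReadWriteLock instead:

import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FastSynchronizedMap<K, V> implements Map<K, V>, Serializable {

    private final Map<K, V> m;
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // ...

    public V get(Object key) {
        // Shared lock: any number of readers may hold it at the same time.
        lock.readLock().lock();
        try {
            return m.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    public V put(K key, V value) {
        // Exclusive lock: blocks readers and other writers.
        lock.writeLock().lock();
        try {
            return m.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // ...
}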

Note that we use a try/finally block: we want to guarantee that the lock is released no matter what problem is encountered in the block.

This implementation works well when you have almost no write operations, and mostly read operations.

score 0 · Accepted Answer

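One workaround for using HashMap in a multithreaded environment is to initialize it with the expected number of objects, thereby avoiding the need to resize.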
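
A small sketch of how such pre-sizing might look, assuming the default load factor of 0.75 (the class and helper names are invented for the example). The constructor argument is a capacity, not an entry count, so it has to be large enough that the expected number of entries stays at or below capacity times load factor. This only removes the resize; concurrent writes to a plain HashMap remain unsynchronized.

```java
import java.util.HashMap;
import java.util.Map;

class PresizedMapDemo {
    static final double DEFAULT_LOAD_FACTOR = 0.75;

    // Smallest capacity that holds `expectedEntries` without triggering a resize.
    static int capacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / DEFAULT_LOAD_FACTOR);
    }

    static <K, V> Map<K, V> newPresizedMap(int expectedEntries) {
        return new HashMap<>(capacityFor(expectedEntries));
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(12));  // 16
        System.out.println(capacityFor(100)); // 134
        Map<String, Integer> map = newPresizedMap(100);
        for (int i = 0; i < 100; i++) {
            map.put("key-" + i, i); // stays below the resize threshold
        }
        System.out.println(map.size()); // 100
    }
}
```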

score 0 · Accepted Answer

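A HashMap is fine when a single thread accesses it. However, when multiple threads start accessing the HashMap, two main problems arise: 1. Resizing of the HashMap is not guaranteed to work as expected. 2. A ConcurrentModificationException may be thrown. This exception can also be thrown when a single thread reads from and writes to the HashMap at the same time, for example by modifying it while iterating over it.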
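
The single-threaded case is easy to reproduce. The sketch below (class and method names are made up for the example) removes an entry through the map while a for-each loop is iterating over its key set, which trips the fail-fast check on the next step of the iterator. Removing through the iterator itself is the supported alternative.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Returns true if modifying the map during iteration raised the exception.
    static boolean removeWhileIterating() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes every entry through the iterator; returns the final size.
    static int removeViaIterator() {
        Map<String, Integer> map = sample();
        Iterator<String> it = map.keySet().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove(); // keeps the iterator and the map in sync
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(removeWhileIterating()); // true
        System.out.println(removeViaIterator());    // 0
    }
}
```

Note that the fail-fast check is a best-effort bug detector: with several threads there is no guarantee the exception is thrown at all, so its absence does not mean the map was used safely.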
