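Context
Our application keeps a large amount of data in memory in many different kinds of maps to allow fast lookups. To keep it simple (and leaving primitive maps aside), it is always a map with one or more keys. Performance is a major requirement for us.
Problem
I wanted to find the most performant map implementation and, as suggested here, compared the following implementations:
Map of maps (nested maps) based on java.util.HashMap, specialized for 3 keys: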
Map<K1, Map<K2, Map<K3, V>>>
Wrapper key (tuple as key) in java.util.HashMap:
Map<Triple<K1, K2, K3>, V>
Tuple as key in net.openhft.koloboke.collect.map.hash.HashObjObjMap, which according to this should be (one of) the fastest maps:
HashObjObjMap<Triple<K1, K2, K3>, V>
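To make the difference between the layouts concrete, here is a minimal sketch of the two shapes being compared. NestedMap3 and Key3 are hypothetical names made up for this illustration; they are not the ObjObjObjObjHashMap or Triple classes used in the benchmark, whose implementations are not shown in this post.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MultiKeyMapSketch {

    /** Hypothetical nested-map wrapper: one HashMap level per key. */
    static final class NestedMap3<K1, K2, K3, V> {
        private final Map<K1, Map<K2, Map<K3, V>>> root = new HashMap<>();

        V put(K1 k1, K2 k2, K3 k3, V value) {
            // Creates the inner maps on demand, then stores the value at the leaf.
            return root.computeIfAbsent(k1, k -> new HashMap<>())
                       .computeIfAbsent(k2, k -> new HashMap<>())
                       .put(k3, value);
        }

        V get(K1 k1, K2 k2, K3 k3) {
            // Three hash lookups, but no key object is allocated.
            Map<K2, Map<K3, V>> level2 = root.get(k1);
            if (level2 == null) {
                return null;
            }
            Map<K3, V> level3 = level2.get(k2);
            return level3 == null ? null : level3.get(k3);
        }
    }

    /** Hypothetical tuple key: one allocation per lookup, but a single hash lookup. */
    static final class Key3<K1, K2, K3> {
        final K1 k1;
        final K2 k2;
        final K3 k3;

        Key3(K1 k1, K2 k2, K3 k3) {
            this.k1 = k1;
            this.k2 = k2;
            this.k3 = k3;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Key3)) {
                return false;
            }
            Key3<?, ?, ?> other = (Key3<?, ?, ?>) o;
            return Objects.equals(k1, other.k1)
                    && Objects.equals(k2, other.k2)
                    && Objects.equals(k3, other.k3);
        }

        @Override
        public int hashCode() {
            return Objects.hash(k1, k2, k3);
        }
    }

    public static void main(String[] args) {
        NestedMap3<String, String, String, Integer> nested = new NestedMap3<>();
        nested.put("a", "b", "c", 1);

        Map<Key3<String, String, String>, Integer> tupleMap = new HashMap<>();
        tupleMap.put(new Key3<>("a", "b", "c"), 1);

        System.out.println(nested.get("a", "b", "c"));
        System.out.println(tupleMap.get(new Key3<>("a", "b", "c")));
    }
}
```

The trade-off the benchmark is trying to measure is visible here: the nested variant pays for up to three lookups per operation, while the tuple variant pays for one key allocation plus one hashCode/equals over all three components.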
Expectations
- Nested maps will have the fastest GET and the slowest PUT.
- The Koloboke hash map will be faster than the JDK HashMap.
Results
Benchmark                                                Mode  Cnt   Score   Error  Units
TupleVsNestedMapsBenchmark.benchGetFromNestedMap         avgt   20  11.586 ± 0.205  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleKolobokeMap  avgt   20  18.619 ± 0.113  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleMap          avgt   20   8.985 ± 0.085  ns/op
TupleVsNestedMapsBenchmark.benchPutToNestedMap           avgt   20  15.106 ± 0.142  ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleKolobokeMap    avgt   20  22.533 ± 0.335  ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleMap            avgt   20   8.884 ± 0.084  ns/op
Benchmark
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(100000)
@Fork(1)
@Warmup(iterations = 10)
@Measurement(iterations = 20)
public class TupleVsNestedMapsBenchmark {

    public static final int N = 10000;

    static ObjObjObjObjHashMap<String, String, String, Integer> sourceNestedMap = new ObjObjObjObjHashMap<>();
    static Map<Triple<String, String, String>, Integer> sourceTupleMap = new HashMap<>();
    static HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap = HashObjObjMaps.newMutableMap();

    static {
        for (int i = 0; i < N; i++) {
            sourceNestedMap.put("a-" + i, "b-" + i, "c-" + i, i);
            sourceTupleMap.put(ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i), i);
            sourceTupleKMap.put(ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i), i);
        }
    }

    @Benchmark
    public List<Integer> benchGetFromNestedMap() {
        return benchmarkGet(sourceNestedMap::get);
    }

    @Benchmark
    public List<Integer> benchGetFromTupleMap() {
        return benchmarkGet(((key1, key2, key3) -> sourceTupleMap.get(ImmutableTriple.of(key1, key2, key3))));
    }

    @Benchmark
    public List<Integer> benchGetFromTupleKolobokeMap() {
        return benchmarkGet(((key1, key2, key3) -> sourceTupleKMap.get(ImmutableTriple.of(key1, key2, key3))));
    }

    @Benchmark
    public ObjObjObjObjHashMap<String, String, String, Integer> benchPutToNestedMap() {
        ObjObjObjObjHashMap<String, String, String, Integer> map = new ObjObjObjObjHashMap<>();
        benchmarkPut(map::put);
        return map;
    }

    @Benchmark
    public Map<Triple<String, String, String>, Integer> benchPutToTupleMap() {
        Map<Triple<String, String, String>, Integer> map = new HashMap<>();
        benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
        return map;
    }

    @Benchmark
    public Map<Triple<String, String, String>, Integer> benchPutToTupleKolobokeMap() {
        HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
        benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
        return map;
    }

    private List<Integer> benchmarkGet(MapValueSupplier<Integer> mapValueSupplier) {
        List<Integer> result = new ArrayList<>(N);
        for (int i = 0; i < N; i++) {
            result.add(mapValueSupplier.supply("a-" + i, "b-" + i, "c-" + i));
        }
        return result;
    }

    private void benchmarkPut(PutValueFunction<Integer> putValueFunction) {
        for (int i = 0; i < N; i++) {
            putValueFunction.apply("a-" + i, "b-" + i, "c-" + i, i);
        }
    }

    private interface MapValueSupplier<T> {
        T supply(String key1, String key2, String key3);
    }

    private interface PutValueFunction<T> {
        void apply(String key1, String key2, String key3, T value);
    }
}
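Note: please do not suggest using primitive maps. Integer (as the value) is just an example of a cheap object.
Questions
- Why is the Koloboke map about 2.5 times slower than the JDK map?
- Why are nested maps not faster? (I would expect the allocation overhead of the tuple key objects to be larger.)
- Or is my benchmark wrong? If so, how can I improve it?
Update
Following @leventov's good advice, I changed the benchmark and also tried a Triple implementation that caches its hash code (and has a better hash distribution). The corresponding tests are named Tuple2.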
@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(TupleVsNestedMapsBenchmark.TOTAL_OPS)
@Fork(1)
@Warmup(iterations = 5)
@Measurement(iterations = 20)
public class TupleVsNestedMapsBenchmark {

    static final int N = 30;
    static final int TOTAL_OPS = N * N * N;

    private ObjObjObjObjHashMap<String, String, String, Integer> sourceNestedMap;
    private Map<Triple<String, String, String>, Integer> sourceTupleMap;
    private HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap;
    private Map<Triple<String, String, String>, Integer> sourceTuple2Map;
    private HashObjObjMap<Triple<String, String, String>, Integer> sourceTuple2KMap;
    private String[] keys;

    @Setup
    public void init() {
        sourceNestedMap = new ObjObjObjObjHashMap<>();
        sourceTupleMap = new HashMap<>(TOTAL_OPS);
        sourceTupleKMap = HashObjObjMaps.newMutableMap(TOTAL_OPS);
        sourceTuple2Map = new HashMap<>(TOTAL_OPS);
        sourceTuple2KMap = HashObjObjMaps.newMutableMap(TOTAL_OPS);
        keys = new String[N];
        for (int i = 0; i < N; i++) {
            keys[i] = "k" + i;
        }
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                for (int k = 0; k < N; k++) {
                    sourceNestedMap.put(keys[i], keys[j], keys[k], i);
                    sourceTupleMap.put(ImmutableTriple.of(keys[i], keys[j], keys[k]), i);
                    sourceTupleKMap.put(ImmutableTriple.of(keys[i], keys[j], keys[k]), i);
                    sourceTuple2Map.put(ImmutableTriple2.of(keys[i], keys[j], keys[k]), i);
                    sourceTuple2KMap.put(ImmutableTriple2.of(keys[i], keys[j], keys[k]), i);
                }
            }
        }
    }

    @Benchmark
    public List<Integer> benchGetFromNestedMap() {
        return benchmarkGet(sourceNestedMap::get);
    }

    @Benchmark
    public List<Integer> benchGetFromTupleMap() {
        return benchmarkGet(((key1, key2, key3) -> sourceTupleMap.get(ImmutableTriple.of(key1, key2, key3))));
    }

    @Benchmark
    public List<Integer> benchGetFromTupleKolobokeMap() {
        return benchmarkGet(((key1, key2, key3) -> sourceTupleKMap.get(ImmutableTriple.of(key1, key2, key3))));
    }

    @Benchmark
    public List<Integer> benchGetFromTuple2Map() {
        return benchmarkGet(((key1, key2, key3) -> sourceTuple2Map.get(ImmutableTriple2.of(key1, key2, key3))));
    }

    @Benchmark
    public List<Integer> benchGetFromTuple2KolobokeMap() {
        return benchmarkGet(((key1, key2, key3) -> sourceTuple2KMap.get(ImmutableTriple2.of(key1, key2, key3))));
    }

    @Benchmark
    public ObjObjObjObjHashMap<String, String, String, Integer> benchPutToNestedMap() {
        ObjObjObjObjHashMap<String, String, String, Integer> map = new ObjObjObjObjHashMap<>();
        benchmarkPut(map::put);
        return map;
    }

    @Benchmark
    public Map<Triple<String, String, String>, Integer> benchPutToTupleMap() {
        Map<Triple<String, String, String>, Integer> map = new HashMap<>();
        benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
        return map;
    }

    @Benchmark
    public Map<Triple<String, String, String>, Integer> benchPutToTupleKolobokeMap() {
        HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
        benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
        return map;
    }

    @Benchmark
    public Map<Triple<String, String, String>, Integer> benchPutToTuple2Map() {
        Map<Triple<String, String, String>, Integer> map = new HashMap<>();
        benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple2.of(key1, key2, key3), value));
        return map;
    }

    @Benchmark
    public Map<Triple<String, String, String>, Integer> benchPutToTuple2KolobokeMap() {
        HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
        benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple2.of(key1, key2, key3), value));
        return map;
    }

    private List<Integer> benchmarkGet(MapValueSupplier<Integer> mapValueSupplier) {
        List<Integer> result = new ArrayList<>(TOTAL_OPS);
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                for (int k = 0; k < N; k++) {
                    Integer value = mapValueSupplier.supply(keys[i], keys[j], keys[k]);
                    result.add(value);
                }
            }
        }
        return result;
    }

    private void benchmarkPut(PutValueFunction<Integer> putValueFunction) {
        Integer value = 1;
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                for (int k = 0; k < N; k++) {
                    putValueFunction.apply(keys[i], keys[j], keys[k], value);
                }
            }
        }
    }

    private interface MapValueSupplier<T> {
        T supply(String key1, String key2, String key3);
    }

    private interface PutValueFunction<T> {
        void apply(String key1, String key2, String key3, T value);
    }
}
The results look like this:
Benchmark                                                 Mode  Cnt      Score      Error  Units
TupleVsNestedMapsBenchmark.benchGetFromNestedMap          avgt   20     24.524 ±    0.144  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTuple2KolobokeMap  avgt   20     65.604 ±    1.135  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTuple2Map          avgt   20     22.653 ±    0.745  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleKolobokeMap   avgt   20  34824.901 ± 1718.183  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleMap           avgt   20   2565.835 ±   57.402  ns/op
TupleVsNestedMapsBenchmark.benchPutToNestedMap            avgt   20     43.160 ±    0.340  ns/op
TupleVsNestedMapsBenchmark.benchPutToTuple2KolobokeMap    avgt   20    237.300 ±    3.362  ns/op
TupleVsNestedMapsBenchmark.benchPutToTuple2Map            avgt   20     40.952 ±    0.535  ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleKolobokeMap     avgt   20  52315.769 ±  399.769  ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleMap             avgt   20   3205.538 ±   44.306  ns/op
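Summary
- The "tuple" approach can become very slow if the key class's hash code is not cached and/or not well distributed, especially with Koloboke.
- As concluded here, java.util.HashMap is "very" fast (in this Obj-Obj case).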
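To illustrate the first point, here is a rough sketch of a triple key that computes its hash once in the constructor and combines the components in an order-sensitive way. CachedHashTriple is a hypothetical class written for this example; it is not the ImmutableTriple2 used in the benchmark (which is not shown here). The XOR combination it is compared against is only an assumed example of a poorly distributed tuple hash; whether the Triple class in the benchmark hashes this way is not confirmed by this post.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class TripleHashSketch {

    /** Hypothetical immutable triple whose hash code is computed once and cached. */
    static final class CachedHashTriple<L, M, R> {
        final L left;
        final M middle;
        final R right;
        private final int hash;

        CachedHashTriple(L left, M middle, R right) {
            this.left = left;
            this.middle = middle;
            this.right = right;
            // Order-sensitive combination: (a, b, c) and (c, b, a) generally differ.
            int h = Objects.hashCode(left);
            h = 31 * h + Objects.hashCode(middle);
            h = 31 * h + Objects.hashCode(right);
            this.hash = h;
        }

        @Override
        public int hashCode() {
            return hash; // no recomputation on every map operation
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof CachedHashTriple)) {
                return false;
            }
            CachedHashTriple<?, ?, ?> other = (CachedHashTriple<?, ?, ?>) o;
            return hash == other.hash
                    && Objects.equals(left, other.left)
                    && Objects.equals(middle, other.middle)
                    && Objects.equals(right, other.right);
        }
    }

    /** Assumed example of a weak tuple hash: XOR ignores the order of the components. */
    static int xorHash(Object a, Object b, Object c) {
        return Objects.hashCode(a) ^ Objects.hashCode(b) ^ Objects.hashCode(c);
    }

    /**
     * Builds the same n*n*n key space as the updated benchmark ("k0".."k(n-1)")
     * and returns {distinct XOR hashes, distinct cached ordered hashes}.
     */
    static int[] countDistinctHashes(int n) {
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            keys[i] = "k" + i;
        }
        Set<Integer> xorHashes = new HashSet<>();
        Set<Integer> orderedHashes = new HashSet<>();
        for (String a : keys) {
            for (String b : keys) {
                for (String c : keys) {
                    xorHashes.add(xorHash(a, b, c));
                    orderedHashes.add(new CachedHashTriple<>(a, b, c).hashCode());
                }
            }
        }
        return new int[] {xorHashes.size(), orderedHashes.size()};
    }

    public static void main(String[] args) {
        int[] distinct = countDistinctHashes(30);
        System.out.println("keys:                   " + (30 * 30 * 30));
        System.out.println("distinct XOR hashes:     " + distinct[0]);
        System.out.println("distinct ordered hashes: " + distinct[1]);
    }
}
```

With an order-insensitive hash, every permutation of the same three strings lands on the same hash value, so a key space built from all combinations of a few strings ends up with long collision chains (or long probe sequences in an open-addressing map). That would be consistent with the gap between the Tuple and Tuple2 rows above, but treat it as a plausible explanation rather than a verified one.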