我偶然发现了一个在原始数组上进行非常简单的 map/reduce 操作的极其不稳定的性能配置文件的实例。这是我的 jmh 基准代码:
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(Measure.ARRAY_SIZE)
@Warmup(iterations = 300, time = 200, timeUnit=MILLISECONDS)
@Measurement(iterations = 1, time = 1000, timeUnit=MILLISECONDS)
@State(Scope.Thread)
@Threads(1)
@Fork(1)
public class Measure
{
static final int ARRAY_SIZE = 1<<20;
final int[] ds = new int[ARRAY_SIZE];
private IntUnaryOperator mapper;
@Setup public void setup() {
setAll(ds, i->(int)(Math.random()*(1<<7)));
final int multiplier = (int)(Math.random()*10);
mapper = d -> multiplier*d;
}
@Benchmark public double multiply() {
return Arrays.stream(ds).map(mapper).sum();
}
}
以下是典型输出的片段:
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 300 iterations, 200 ms each
# Measurement: 1 iterations, 1000 ms each
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.sample.Measure.multiply
# Run progress: 0,00% complete, ETA 00:01:01
# Fork: 1 of 1
# Warmup Iteration 1: 0,779 ns/op
# Warmup Iteration 2: 0,684 ns/op
# Warmup Iteration 3: 0,608 ns/op
# Warmup Iteration 4: 0,619 ns/op
# Warmup Iteration 5: 0,642 ns/op
# Warmup Iteration 6: 0,638 ns/op
# Warmup Iteration 7: 0,660 ns/op
# Warmup Iteration 8: 0,611 ns/op
# Warmup Iteration 9: 0,636 ns/op
# Warmup Iteration 10: 0,692 ns/op
# Warmup Iteration 11: 0,632 ns/op
# Warmup Iteration 12: 0,612 ns/op
# Warmup Iteration 13: 1,280 ns/op
# Warmup Iteration 14: 7,261 ns/op
# Warmup Iteration 15: 7,379 ns/op
# Warmup Iteration 16: 7,376 ns/op
# Warmup Iteration 17: 7,379 ns/op
# Warmup Iteration 18: 7,195 ns/op
# Warmup Iteration 19: 7,351 ns/op
# Warmup Iteration 20: 7,761 ns/op
....
....
....
# Warmup Iteration 100: 7,300 ns/op
# Warmup Iteration 101: 7,384 ns/op
# Warmup Iteration 102: 7,132 ns/op
# Warmup Iteration 103: 7,278 ns/op
# Warmup Iteration 104: 7,331 ns/op
# Warmup Iteration 105: 7,335 ns/op
# Warmup Iteration 106: 7,450 ns/op
# Warmup Iteration 107: 7,346 ns/op
# Warmup Iteration 108: 7,826 ns/op
# Warmup Iteration 109: 7,221 ns/op
# Warmup Iteration 110: 8,017 ns/op
# Warmup Iteration 111: 7,611 ns/op
# Warmup Iteration 112: 7,376 ns/op
# Warmup Iteration 113: 0,707 ns/op
# Warmup Iteration 114: 0,828 ns/op
# Warmup Iteration 115: 0,608 ns/op
# Warmup Iteration 116: 0,634 ns/op
# Warmup Iteration 117: 0,633 ns/op
# Warmup Iteration 118: 0,660 ns/op
# Warmup Iteration 119: 0,635 ns/op
# Warmup Iteration 120: 0,566 ns/op
关键时刻发生在第 13 和 113 次迭代:首先性能下降十倍,然后恢复。相应的时间是进入测试运行的 2.5 秒和 22.5 秒。这些事件的时间对数组大小非常敏感,顺便说一句。
什么可以解释这种行为?JIT 编译器可能在第一次迭代中就完成了它的工作。没有 GC 动作可言(由 VisualVM 确认)......我对任何解释都完全不知所措。
我的 Java 版本(OS X):
$ java -version
java version "1.8.0_20"
Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)