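Background: I am about to port a library I wrote from C++ to Java. The code works with lists of n points in d dimensions and needs to compute scalar products, among other things. I want the code to be independent of how the points are stored, so I introduced an interface for that purpose: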
public interface PointSetAccessor
{
    float coord(int p, int c);
}
This lets me get the c-th coordinate (0 ≤ c < d) of the p-th point (0 ≤ p < n).
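As an illustration of the storage independence this interface is meant to provide, a second implementation might keep all coordinates in one flat array instead of a float[][]. The following is only a sketch: the class FlatPointSetAccessor, its constructor, and the row-major layout are hypothetical and not taken from the library.

```java
// Hypothetical example: not part of the original library.
interface PointSetAccessor
{
    float coord(int p, int c);
}

// Stores n points of dimension d in one row-major float[] of length n * d.
final class FlatPointSetAccessor implements PointSetAccessor
{
    private final float[] data;
    private final int d;

    FlatPointSetAccessor(float[] data, int d)
    {
        if (d <= 0 || data.length % d != 0)
            throw new IllegalArgumentException("data length must be a multiple of d");
        this.data = data;
        this.d = d;
    }

    // Coordinate c of point p sits at offset p * d + c.
    public float coord(int p, int c)
    {
        return data[p * d + c];
    }
}

final class FlatAccessorDemo
{
    public static void main(String[] args)
    {
        // Two points in dimension 3: (1, 2, 3) and (4, 5, 6).
        PointSetAccessor ac = new FlatPointSetAccessor(new float[] {1f, 2f, 3f, 4f, 5f, 6f}, 3);
        System.out.println(ac.coord(1, 2)); // prints 6.0
    }
}
```

Code that only talks to PointSetAccessor would not need to change when switching between this layout and an array of arrays.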
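Problem: Since the code has to be very fast, I wondered whether this indirection would degrade performance compared with a direct access pattern such as points[p][c], where points is a float[][] array.
Surprisingly, the opposite is the case: the code (see below) runs faster through PointSetAccessor than with direct access. (I measured this with time java -server -XX:+AggressiveOpts -cp bin Speedo and got about 14 seconds for the direct version and about 11 seconds for the accessor-based version.)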
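Question: Any idea why this is the case? It seems that HotSpot decides to optimize more aggressively, or has more freedom to do so, in the latter version.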
Code (the computation itself is meaningless):
public class Speedo
{
    public interface PointSetAccessor
    {
        float coord(int p, int c);
    }

    // Accessor backed by an array of arrays: array[point][dim].
    public static final class ArrayPointSetAccessor implements PointSetAccessor
    {
        private final float[][] array;

        public ArrayPointSetAccessor(float[][] array)
        {
            this.array = array;
        }

        public float coord(int point, int dim)
        {
            return array[point][dim];
        }
    }

    public static void main(String[] args)
    {
        final int n = 50000;
        final int d = 10;

        // Generate n points in dimension d
        final java.util.Random r = new java.util.Random(314);
        final float[][] a = new float[n][d];
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < d; ++j)
                a[i][j] = r.nextFloat();

        float result = 0.0f;
        // Change to false to run the accessor-based version instead.
        if (true)
        {
            // Direct version
            for (int i = 0; i < n; i++)
                for (int j = i + 1; j < n; ++j)
                {
                    float prod = 0.0f;
                    for (int k = 0; k < d; ++k)
                        prod += a[i][k] * a[j][k];
                    result += prod;
                }
        }
        else
        {
            // Accessor-based version
            final PointSetAccessor ac = new ArrayPointSetAccessor(a);
            for (int i = 0; i < n; i++)
                for (int j = i + 1; j < n; ++j)
                {
                    result += product(ac, d, i, j);
                }
        }
        System.out.println("result = " + result);
    }

    // Scalar product of points i and j, read through the accessor.
    private static float product(PointSetAccessor ac, int d, int i, int j)
    {
        float prod = 0.0f;
        for (int k = 0; k < d; ++k)
            prod += ac.coord(i, k) * ac.coord(j, k);
        return prod;
    }
}
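One thing that may be worth checking (this is an assumption, not something established above): the direct loop runs inside main, whereas the accessor loop calls the separate method product, and the time command also counts JVM startup and compilation. The sketch below puts both variants into methods of their own and times them with System.nanoTime after a warm-up pass. The class SpeedoTimed, the method names sumDirect, sumAccessor and randomPoints, and the smaller n are hypothetical choices for illustration only; a dedicated harness such as JMH would give more trustworthy numbers.

```java
import java.util.Random;

// Hypothetical timing sketch: not the original benchmark.
final class SpeedoTimed
{
    interface PointSetAccessor
    {
        float coord(int p, int c);
    }

    static final class ArrayPointSetAccessor implements PointSetAccessor
    {
        private final float[][] array;

        ArrayPointSetAccessor(float[][] array)
        {
            this.array = array;
        }

        public float coord(int point, int dim)
        {
            return array[point][dim];
        }
    }

    // Direct version, moved out of main into a method of its own.
    static float sumDirect(float[][] a, int d)
    {
        final int n = a.length;
        float result = 0.0f;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; ++j)
            {
                float prod = 0.0f;
                for (int k = 0; k < d; ++k)
                    prod += a[i][k] * a[j][k];
                result += prod;
            }
        return result;
    }

    // Accessor-based version with the same loop structure.
    static float sumAccessor(PointSetAccessor ac, int n, int d)
    {
        float result = 0.0f;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; ++j)
            {
                float prod = 0.0f;
                for (int k = 0; k < d; ++k)
                    prod += ac.coord(i, k) * ac.coord(j, k);
                result += prod;
            }
        return result;
    }

    static float[][] randomPoints(int n, int d, long seed)
    {
        final Random r = new Random(seed);
        final float[][] a = new float[n][d];
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < d; ++j)
                a[i][j] = r.nextFloat();
        return a;
    }

    public static void main(String[] args)
    {
        final int n = 5000; // smaller than the 50000 used above so the sketch finishes quickly
        final int d = 10;
        final float[][] a = randomPoints(n, d, 314);
        final PointSetAccessor ac = new ArrayPointSetAccessor(a);

        // Warm-up pass so both methods have a chance to be JIT-compiled before timing.
        sumDirect(a, d);
        sumAccessor(ac, n, d);

        final long t0 = System.nanoTime();
        final float direct = sumDirect(a, d);
        final long t1 = System.nanoTime();
        final float viaAccessor = sumAccessor(ac, n, d);
        final long t2 = System.nanoTime();

        System.out.println("direct:   " + direct + " in " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("accessor: " + viaAccessor + " in " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

If the gap shrinks or disappears in such a setup, the difference measured with time is more likely an artifact of how and when the loops were compiled than a cost or benefit of the interface itself.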