I have the following function (from the open-source project "Recast Navigation"):
/// Derives the dot product of two vectors on the xz-plane. (@p u . @p v)
/// @param[in] u A vector [(x, y, z)]
/// @param[in] v A vector [(x, y, z)]
/// @return The dot product on the xz-plane.
///
/// The vectors are projected onto the xz-plane, so the y-values are ignored.
inline float dtVdot2D(const float* u, const float* v)
{
    return u[0]*v[0] + u[2]*v[2];
}
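I ran it through the VS2010 CPU profiler, and it reports that, of all the lines in the Recast codebase, the line u[0]*v[0] + u[2]*v[2] in this function is the hottest on the CPU.
How can I optimize this line for the CPU, for example with SSE, or with GLM (the GLSL-like math library), if that would be easier or faster and appropriate in this case?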
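To make the question concrete, a direct SSE translation of this one line might look roughly like the sketch below. The name dtVdot2D_sse is hypothetical and not part of Recast, and the code assumes an x86 target with SSE available. It packs only the x and z components, because loading four floats straight from a float[3] would read past the end of the array.

```cpp
#include <xmmintrin.h> // SSE intrinsics (assumes an x86 target)

// Hypothetical SSE variant of dtVdot2D; not part of Recast.
inline float dtVdot2D_sse(const float* u, const float* v)
{
    // _mm_set_ps takes its arguments from the highest lane down to the lowest,
    // so lane 0 holds x and lane 1 holds z. The y component is never read.
    const __m128 a = _mm_set_ps(0.0f, 0.0f, u[2], u[0]);
    const __m128 b = _mm_set_ps(0.0f, 0.0f, v[2], v[0]);

    // Lane-wise product: [u.x*v.x, u.z*v.z, 0, 0]
    const __m128 prod = _mm_mul_ps(a, b);

    // Broadcast lane 1 so that it can be added to lane 0.
    const __m128 zz = _mm_shuffle_ps(prod, prod, _MM_SHUFFLE(1, 1, 1, 1));
    const __m128 sum = _mm_add_ss(prod, zz);

    // Extract lane 0 as a scalar float.
    return _mm_cvtss_f32(sum);
}
```

Whether this beats the scalar version is not established here: for two multiplies and one add, the cost of packing the operands into registers may well cancel any gain.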
Edit: The context in which the calls appear:
bool dtClosestHeightPointTriangle(const float* p, const float* a, const float* b, const float* c, float& h) {
    float v0[3], v1[3], v2[3];
    dtVsub(v0, c,a);
    dtVsub(v1, b,a);
    dtVsub(v2, p,a);
    const float dot00 = dtVdot2D(v0, v0);
    const float dot01 = dtVdot2D(v0, v1);
    const float dot02 = dtVdot2D(v0, v2);
    const float dot11 = dtVdot2D(v1, v1);
    const float dot12 = dtVdot2D(v1, v2);
    // Compute barycentric coordinates
    const float invDenom = 1.0f / (dot00 * dot11 - dot01 * dot01);
    const float u = (dot11 * dot02 - dot01 * dot12) * invDenom;
    const float v = (dot00 * dot12 - dot01 * dot02) * invDenom;