我已经快速检查了构建树并查询它与仅计算所有欧几里德距离的性能。如果我在这棵树上查询半径内的所有其他点,它不应该大大优于蛮力方法吗?
有谁知道为什么我的测试代码会产生这些不同的结果?我用错了吗?测试用例不适合 kd-trees 吗?
PS:这是我使用的代码的简化概念验证版本。我还存储和转换结果的完整代码可以在这里找到,但它会产生相同的结果。
进口
import numpy as np
from time import time
from scipy.spatial import KDTree as kd
from functools import reduce
import matplotlib.pyplot as plt
实现
def euclid(c, cs, r):
return ((cs[:,0] - c[0]) ** 2 + (cs[:,1] - c[1]) ** 2 + (cs[:,2] - c[2]) ** 2) < r ** 2
def find_nn_naive(cells, radius):
for i in range(len(cells)):
cell = cells[i]
cands = euclid(cell, cells, radius)
def find_nn_kd_seminaive(cells, radius):
tree = kd(cells)
for i in range(len(cells)):
res = tree.query_ball_point(cells[i], radius)
def find_nn_kd_by_tree(cells, radius):
tree = kd(cells)
return tree.query_ball_tree(tree, radius)
测试设置
min_iter = 5000
max_iter = 10000
step_iter = 1000
rng = range(min_iter, max_iter, step_iter)
elapsed_naive = np.zeros(len(rng))
elapsed_kd_sn = np.zeros(len(rng))
elapsed_kd_tr = np.zeros(len(rng))
ei = 0
for i in rng:
random_cells = np.random.rand(i, 3) * 400.
t = time()
r1 = find_nn_naive(random_cells, 50.)
elapsed_naive[ei] = time() - t
t = time()
r2 = find_nn_kd_seminaive(random_cells, 50.)
elapsed_kd_sn[ei] = time() - t
t = time()
r3 = find_nn_kd_by_tree(random_cells, 50.)
elapsed_kd_tr[ei] = time() - t
ei += 1
阴谋
plt.plot(rng, elapsed_naive, label='naive')
plt.plot(rng, elapsed_kd_sn, label='semi kd')
plt.plot(rng, elapsed_kd_tr, label='full kd')
plt.legend()
plt.show(block=True)