python - Using statsmodels.nonparametric.kde instead of scipy.stats.gaussian_kde

Question

I have read that using the statsmodels.nonparametric.kde module instead of scipy.stats.gaussian_kde can give a significant speed-up.

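I have a simple block of code (4 lines) that I currently compute with scipy.stats.gaussian_kde. I would like to replace those lines with their statsmodels equivalents to see whether I actually get a speed improvement.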

Here is the MWE:

import numpy as np
from scipy import stats

# Generate random two-dimensional data.
def measure(n):
    m1 = np.random.normal(size=n)
    m2 = np.random.normal(scale=0.5, size=n)
    return m1+m2, m1-m2
m1, m2 = measure(20000)

# Define data limits.
xmin, xmax = m1.min(), m1.max()
ymin, ymax = m2.min(), m2.max()

# Format data correctly.
values = np.vstack([m1, m2])

# Define a certain point value.
x1, y1 = 0.5, 0.5

##############
# Replace with calls to statsmodels.nonparametric.kde from here on.

# 1- Perform a kernel density estimate on the data.
kernel = stats.gaussian_kde(values)

# 2- Get kernel value for the point.
iso = kernel((x1,y1))

# 3- Take a random sample from KDE distribution.
sample = kernel.resample(size=1000)

# 4- Filter the sample to keep only values for which
#    the kernel evaluates to less than what it does in the
#    point (x1,y1). This is the most important step to be replaced.
insample = kernel(sample) < iso

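As you can see, only 4 lines of code need to be replaced. Unfortunately, the documentation for statsmodels.nonparametric.kde is rather thin, and I cannot work out how to make this replacement.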

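The last line is the most important one, since that is where most of the computation time is spent (as discussed in "Speed up sampling of kernel estimate").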
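For reference, here is a minimal sketch of what the four steps might look like in statsmodels. It rests on several assumptions. The data are two-dimensional, so it uses KDEMultivariate from statsmodels.nonparametric.kernel_density rather than the univariate KDEUnivariate class in statsmodels.nonparametric.kde. As far as I can tell, KDEMultivariate has no resample method, so step 3 is done by hand: pick random data points and add Gaussian noise scaled by the per-dimension bandwidths in kde.bw. The bandwidth rule (normal_reference) differs from the gaussian_kde default, so the numbers will not match exactly. The sample size is reduced to keep the run short. Whether this is actually faster than gaussian_kde would have to be timed.

```python
import numpy as np
from statsmodels.nonparametric.kernel_density import KDEMultivariate

rng = np.random.default_rng(0)

# Generate random two-dimensional data (same recipe as the MWE).
def measure(n):
    m1 = rng.normal(size=n)
    m2 = rng.normal(scale=0.5, size=n)
    return m1 + m2, m1 - m2

m1, m2 = measure(2000)  # assumption: smaller n than the MWE, for a quick run

# statsmodels expects one observation per row: shape (nobs, k_vars).
data = np.column_stack([m1, m2])

# Define a certain point value.
x1, y1 = 0.5, 0.5

# 1- Perform a kernel density estimate on the data.
#    var_type="cc" declares two continuous variables.
kde = KDEMultivariate(data=data, var_type="cc", bw="normal_reference")

# 2- Get kernel value for the point.
iso = float(kde.pdf(np.array([[x1, y1]])))

# 3- Take a random sample from the KDE distribution (hand-rolled):
#    choose data points at random, then add Gaussian noise whose
#    standard deviation in each dimension is that dimension's bandwidth.
n_samples = 1000
idx = rng.integers(0, data.shape[0], size=n_samples)
sample = data[idx] + rng.normal(size=(n_samples, 2)) * kde.bw

# 4- Keep only the sampled points where the density is below iso.
insample = kde.pdf(sample) < iso
```

Note that sample here has shape (n_samples, 2), the transpose of what gaussian_kde.resample returns.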
