
From https://stackoverflow.com/a/35684975/4533188 I got that K-Nearest Neighbour Imputation works like this:

  1. For the current observation, get the distance to all other observations.
  2. For each missing value in the current observation, consider only the k nearest observations that have no missing value in the feature in question.
  3. Calculate the mean (or a similar statistic) of that feature's values across those k observations; this is the value used for the imputation.
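The three steps above can be sketched in plain NumPy. This is only an illustration of the idea, not fancyimpute's actual implementation; in particular, the distance rescaling by the fraction of shared features is one common convention (the nan-Euclidean style) and is an assumption here:

```python
import numpy as np

def knn_impute(X, k=2):
    """Illustrative sketch of the three steps above (not fancyimpute's code).

    Step 1 computes distances only over features observed in *both* rows,
    rescaled by the fraction of usable features so rows with few shared
    features remain comparable (nan-Euclidean-style convention, assumed).
    """
    X = np.asarray(X, dtype=float)
    X_out = X.copy()
    n, d = X.shape
    for i in range(n):
        missing = np.isnan(X[i])
        if not missing.any():
            continue
        # Step 1: distance from row i to every other row, over shared features.
        dists = np.full(n, np.inf)
        for j in range(n):
            if j == i:
                continue
            shared = ~np.isnan(X[i]) & ~np.isnan(X[j])
            if shared.sum() == 0:
                continue  # no feature observed in both rows
            diff = X[i, shared] - X[j, shared]
            dists[j] = np.sqrt(d / shared.sum() * np.sum(diff ** 2))
        for f in np.where(missing)[0]:
            # Step 2: among rows that observe feature f, take the k nearest.
            candidates = np.where(~np.isnan(X[:, f]) & np.isfinite(dists))[0]
            nearest = candidates[np.argsort(dists[candidates])][:k]
            # Step 3: impute with the mean of their values for feature f.
            if len(nearest) > 0:
                X_out[i, f] = X[nearest, f].mean()
    return X_out
```

For example, imputing `[[1, 2, nan], [3, 4, 3], [5, 6, 5], [1, 2, 3]]` with `k=2` fills the missing entry from the two rows nearest to the first one.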

The key step is step 1: how do we calculate the distance if not all values are available? The post above points towards the Heterogeneous Euclidean-Overlap Metric. However, I am interested in the implementation of the KNN imputation in fancyimpute. I tracked it back to https://github.com/hammerlab/knnimpute, more specifically https://github.com/hammerlab/knnimpute/blob/master/knnimpute/few_observed_entries.py, and looked at the code. However, I am not able to figure out how it works.

Can someone please explain to me how knnimpute works there? How does the distance calculation work here?


1 Answer


The following is specific to the KNNImputer class in the scikit-learn Python library. Documentation: https://scikit-learn.org/stable/modules/generated/sklearn.impute.KNNImputer.html

The parameter `metric` has `nan_euclidean` as its default value. Its documentation can be found here: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.nan_euclidean_distances.html

Intuitively, the nan-Euclidean distance computes the standard Euclidean distance where possible (skipping any coordinate that is missing in either of the two observations) and linearly scales the result to compensate for the missing entries.
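A short sketch of how this looks with scikit-learn's public API (the example data is made up; `nan_euclidean_distances` and `KNNImputer` are the documented functions linked above):

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.metrics.pairwise import nan_euclidean_distances

X = np.array([[1.0, 2.0, np.nan],
              [3.0, 4.0, 3.0],
              [np.nan, 6.0, 5.0],
              [8.0, 8.0, 7.0]])

# Pairwise nan-Euclidean distances: coordinates missing in either row are
# skipped, and the squared distance is scaled by n_features / n_present.
D = nan_euclidean_distances(X)
# e.g. D[0, 1] = sqrt(3/2 * ((1-3)**2 + (2-4)**2)) = sqrt(12)

# KNNImputer uses this metric by default to find donor rows, then averages
# the donors' values for each missing feature.
imputer = KNNImputer(n_neighbors=2, metric="nan_euclidean")
X_filled = imputer.fit_transform(X)
```

With `n_neighbors=2` and the default uniform weights, the missing entry in the first row is filled with the mean of that feature in the two nearest rows that observe it.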

answered 2021-12-16T05:18:43.533