python-2.7 - Geo Pandas Data Frame / Matrix - filter/drop NaN / False values

Question

I applied the GeoSeries.almost_equals(other[, decimal=6]) function to a geo data frame with 10 million entries in order to find multiple geo points that are close to each other:

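This gave me a matrix, and now I need to filter all the True values in order to create a DF/list containing only the POIs that are geographically related, so I used: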

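Now I am struggling to figure out how to filter this matrix further. The expected output is a vector, a list, or ideally a DF with all the TRUE (matched) values, with points matched to each other one to one and with repetitions removed (if [1,9] is present, then [9,1] is dropped from the output). List example: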

DF example:

score 2 · Accepted Answer

Consider this example dataframe:

In [1]: df = pd.DataFrame([[True, False, False, True],
   ...:                    [False, True, True, False],
   ...:                    [False, True, True, False],
   ...:                    [True, False, False, True]])

In [2]: df
Out[2]:
       0      1      2      3
0   True  False  False   True
1  False   True   True  False
2  False   True   True  False
3   True  False  False   True

A possible solution to get a dataframe of the matching indices:

First, I use np.triu to keep only the upper triangle (so you don't get duplicates such as both [0, 3] and [3, 0]):

In [15]: df2 = pd.DataFrame(np.triu(df))

In [16]: df2
Out[16]:
       0      1      2      3
0   True  False  False   True
1  False   True   True  False
2  False  False   True  False
3  False  False  False   True
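Note that the upper triangle above still contains the diagonal, i.e. every point matched with itself. If those self-matches are not wanted, a minimal sketch (assuming the same 4x4 example matrix, and using the optional k argument of np.triu; the name strict_upper is only illustrative) could look like this:

```python
import numpy as np
import pandas as pd

# Same 4x4 boolean match matrix as in the example above.
df = pd.DataFrame([[True, False, False, True],
                   [False, True, True, False],
                   [False, True, True, False],
                   [True, False, False, True]])

# k=1 starts the kept triangle one step above the main diagonal,
# so the self-matches (0, 0), (1, 1), ... are set to False as well.
strict_upper = pd.DataFrame(np.triu(df.values, k=1),
                            index=df.index, columns=df.columns)
print(strict_upper)
```

With k=1 only the cells (0, 3) and (1, 2) stay True for this example.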

Then I stack the dataframe, give the index levels the desired names, and select only the rows where we have a True value:

In [17]: result = df2.stack()

In [18]: result
Out[18]:
0  0     True
   1    False
   2    False
   3     True
1  0    False
   1     True
   2     True
   3    False
2  0    False
   1    False
   2     True
   3    False
3  0    False
   1    False
   2    False
   3     True
dtype: bool

In [21]: result.index.names = ['POI_id', 'matched_POI_ids']

In [23]: result[result].reset_index()
Out[23]:
   POI_id  matched_POI_ids     0
0       0                0  True
1       0                3  True
2       1                1  True
3       1                2  True
4       2                2  True
5       3                3  True

Then you can of course drop the column holding the True values: .drop(0, axis=1)
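Putting the steps together, here is a minimal sketch of the whole approach as one function. The function name matched_pairs and the include_self flag are hypothetical, introduced only for illustration; it assumes the input is a square boolean DataFrame like the example one.

```python
import numpy as np
import pandas as pd

def matched_pairs(matrix, include_self=True):
    """Return one row per True cell in the upper triangle of `matrix`.

    `matched_pairs` and `include_self` are illustrative names, not part
    of pandas or GeoPandas.
    """
    # k=0 keeps the diagonal (self-matches); k=1 drops it.
    k = 0 if include_self else 1
    upper = pd.DataFrame(np.triu(matrix.values, k=k),
                         index=matrix.index, columns=matrix.columns)
    # One (row, column) index entry per cell of the matrix.
    stacked = upper.stack()
    stacked.index.names = ['POI_id', 'matched_POI_ids']
    # Boolean indexing keeps the True cells; the remaining value column
    # (named 0) is constant, so drop it.
    return stacked[stacked].reset_index().drop(0, axis=1)

df = pd.DataFrame([[True, False, False, True],
                   [False, True, True, False],
                   [False, True, True, False],
                   [True, False, False, True]])

pairs = matched_pairs(df)
pairs_no_self = matched_pairs(df, include_self=False)
print(pairs)
print(pairs_no_self)
```

For the example dataframe, pairs reproduces the six rows shown above, while pairs_no_self contains only the pairs (0, 3) and (1, 2).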
