I want to use a mask built from series x to filter a vaex dataframe y. I know how to do this in pandas and numpy. In pandas it looks like this:

import pandas as pd

a = [0,0,0,1,1,1,0,0,0]
b = [4,5,7,8,9,9,0,6,4]

x = pd.Series(a)
y = pd.Series(b)

print(y[x==1])

The result is:

3    8
4    9
5    9
dtype: int64

But in vaex, the following code doesn't work.

import vaex
import numpy as np

a = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0])
b = np.array([4, 5, 7, 8, 9, 9, 0, 6, 4])

x = vaex.from_arrays(x=a)
y = vaex.from_arrays(x=b)

print(y[x.x == 1].values)

The result is empty:

[]

It seems that vaex doesn't have the same index concept as pandas and numpy. Although the two dataframes have the same shape, y can't be filtered with the mask x.x == 1.

Is there any way to achieve the equivalent of the pandas result in vaex?

Thanks

1 Answer

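Although Vaex has an API similar to Pandas (similarly named methods that do the same things), the two libraries are implemented completely differently, so it is not easy to "mix and match" them.

To work with data of any kind together, that data needs to be part of the same Vaex dataframe.

So to achieve what you want, something like this is possible: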

import vaex
import numpy as np

a = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0])
b = np.array([4, 5, 7, 8, 9, 9, 0, 6, 4])

y = vaex.from_arrays(x1=b)
y.add_column(name='x2', f_or_array=a)

print(y[y.x2 == 1])
Answered 2020-11-05T17:42:10.300