What is the idiomatic way to store this kind of data structure in pandas:
### Option 1
import numpy as np
import pandas as pd

df = pd.DataFrame(data = [
    {'kws' : np.array([0,0,0]), 'x' : i, 'y' : i} for i in range(10)
])
# df.x and df.y work as expected
# the list and array casting is required because df.kws is
# an array of arrays
np.array(list(df.kws))
# this causes problems when trying to assign as well though:
# for any other data type, this would set all kws in df to the rhs [1,2,3]
# but since the rhs is a list, it tried to do an element-wise assignment and
# errors saying that the length of df and the length of the rhs do not match
df.kws = [1,2,3]
### Option 2
df = pd.DataFrame(data = [
    {'kw_0' : 0, 'kw_1' : 0, 'kw_2' : 0, 'x' : i, 'y' : i} for i in range(10)
])
# retrieving 2d array:
df[sorted([c for c in df if c.startswith('kw_')])].values
# batch set :
kws = [1,2,3]
for i, kw in enumerate(kws) :
    df['kw_' + str(i)] = kw  # i is an int, so it must be converted before concatenating
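Neither of these solutions feels right to me. For one thing, neither of them allows retrieving the 2D matrix without copying all of the data. Is there a better way to handle this kind of mixed-dimension data, or is this just a task pandas isn't up to yet?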
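
The copying concern can be checked with a rough sketch like the one below. It assumes `np.shares_memory` is a reasonable test for "no copy was made"; the names `df1`, `df2`, `mat1`, `mat2`, `shares1` and `mat2_is_view` are only illustrative. For Option 1 the stacked matrix never shares memory with the per-row arrays. For Option 2 the result may depend on the pandas version, so it is printed rather than assumed.

```python
import numpy as np
import pandas as pd

# Option 1: one NumPy array per cell (object-dtype column)
df1 = pd.DataFrame([{'kws': np.array([0, 0, 0]), 'x': i, 'y': i} for i in range(10)])
mat1 = np.array(list(df1.kws))  # stacks the 10 small arrays into a new (10, 3) array
shares1 = any(np.shares_memory(mat1, row) for row in df1.kws)

# Option 2: one scalar column per component
df2 = pd.DataFrame([{'kw_0': 0, 'kw_1': 0, 'kw_2': 0, 'x': i, 'y': i} for i in range(10)])
kw_cols = sorted(c for c in df2 if c.startswith('kw_'))
mat2 = df2[kw_cols].values
# Whether this is a view of the frame's storage can vary between pandas versions
mat2_is_view = np.shares_memory(mat2, df2['kw_0'].values)

print(mat1.shape, shares1)
print(mat2.shape, mat2_is_view)
```

If `shares1` is `False`, writing into `mat1` cannot change anything stored in `df1`, which is the behaviour the question is complaining about.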