
I have a dataframe like this:

import pandas as pd

df = pd.DataFrame({'col1': ['abc', 'def', 'tre'],
                   'col2': ['foo', 'bar', 'stuff']})

  col1   col2
0  abc    foo
1  def    bar
2  tre  stuff

and a dictionary like this:

d = {'col1': [0, 2], 'col2': [1]}

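The dictionary maps column names to the row indices of the values to extract from the dataframe, in order to generate strings like this: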

abc (0, col1)

So each string starts with the element itself, followed by the index and column name in parentheses.

I tried the following list comprehension:

l = [f"{df.loc[{indi}, {ci}]} ({indi}, {ci})"
     for ci, vali in d.items()
     for indi in vali]

which yields

['  col1\n0  abc (0, col1)',
 '  col1\n2  tre (2, col1)',
 '  col2\n1  bar (1, col2)']

So it is almost right, except that the col1\n0 part needs to be avoided.

If I try

f"{df.loc[0, 'col1']} is great"

I get

'abc is great'

as desired. However, with

x = 0
f"{df.loc[{x}, 'col1']} is great"

I get

'0    abc\nName: col1, dtype: object is great'

How can this be fixed?


2 Answers

import pandas as pd

df = pd.DataFrame({'col1': ['abc', 'def', 'tre'],
                   'col2': ['foo', 'bar', 'stuff']})

d = {'col1': [0, 2], 'col2': [1]}
x = 0
[f"{df.loc[x, 'col1']} is great"
     for ci, vali in d.items()
     for indi in vali]

This gives you:

['abc is great', 'abc is great', 'abc is great']

Is this what you are looking for?

You can also loop over a range of values for x:

[f"{df.loc[i, 'col1']} is great"
 for ci, vali in d.items()
 for indi in vali
 for i in range(2)]

#output
['abc is great',
 'def is great',
 'abc is great',
 'def is great',
 'abc is great',
 'def is great']
answered 2018-10-04T09:59:31.733

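What you are seeing is the string representation of the pd.Series object returned by the loc accessor, including its ugly \n newline characters.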
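To make that concrete, here is a small sketch of where the extra text likely comes from. It assumes the dataframe from the question. A list label ([x]) stands in for the set label ({x}) because recent pandas versions reject set indexers. The variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['abc', 'def', 'tre'],
                   'col2': ['foo', 'bar', 'stuff']})

x = 0

# Inside an f-string replacement field, a nested {x} is an ordinary
# Python set literal, not another placeholder.
label = {x}
print(type(label).__name__)       # set

# A list-like row label makes .loc return a Series, even for one row.
# (A list is used here; recent pandas versions do not accept a set.)
as_series = df.loc[[x], 'col1']

# A plain scalar label returns the cell value itself.
as_scalar = df.loc[x, 'col1']

print(type(as_series).__name__)   # Series
print(repr(as_scalar))            # 'abc'

# Formatting the Series embeds its multi-line text representation.
text = f"{as_series} is great"
print(repr(text))
```

The exact text printed for the Series can differ between pandas versions, but it spans several lines, which is where the \n in the question's output comes from.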

You should use pd.DataFrame.at, which returns a scalar, and note that there is no need to wrap your index labels in {} here:

L = [f'{df.at[indi, ci]} ({indi}, {ci})'
     for ci, vali in d.items()
     for indi in vali]

print(L)

['abc (0, col1)', 'tre (2, col1)', 'bar (1, col2)']
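If this is needed in more than one place, the same comprehension could be wrapped in a small helper. This is only a sketch: the function name format_cells and its parameter names are hypothetical, not part of pandas, and it assumes every index and column listed in the dictionary exists in the dataframe.

```python
import pandas as pd


def format_cells(frame, positions):
    """Return 'value (index, column)' strings for the requested cells.

    `positions` maps a column name to a list of row labels.
    """
    return [f"{frame.at[row, col]} ({row}, {col})"
            for col, rows in positions.items()
            for row in rows]


df = pd.DataFrame({'col1': ['abc', 'def', 'tre'],
                   'col2': ['foo', 'bar', 'stuff']})
d = {'col1': [0, 2], 'col2': [1]}

result = format_cells(df, d)
print(result)
# ['abc (0, col1)', 'tre (2, col1)', 'bar (1, col2)']
```

df.loc[row, col] with two scalar labels would also return the cell value; .at is simply the accessor intended for single-value lookups.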
answered 2018-10-04T10:31:27.043