I am trying to create two separate DataFrames from the following code:
import pandas as pd
sport = ('basketball','volleyball','football')
science = ('biology','chemistry','physics')
sportdf = pd.DataFrame(columns = ['Name','Interest'])
sciencedf = pd.DataFrame(columns = ['Name','Interest'])
data = [['tom', 'volleyball'], ['nick', 'chemistry'], ['juli', 'physics']]
df = pd.DataFrame(data, columns = ['Name', 'Interest'])
s = []
q = []
for i in range(len(df)):
    if df.loc[i,"Interest"] in sport:
        s.append(df.loc[i,"Name"])
        s.append(df.loc[i,"Interest"])
        df_length = len(s)
        sportdf.loc[df_length] = s
        print(df)
    else:
        q.append(df.loc[i,"Name"])
        q.append(df.loc[i,"Interest"])
        df_length = len(q)
        #sciencedf.loc[df_length] = q
The expected output is that sportdf has a single row, "tom" and "volleyball", while sciencedf has two rows: "nick", "chemistry" and "juli", "physics".
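However, with the code above I managed to create sportdf but not sciencedf, because by the end of the loop the list q is ['nick','chemistry','juli','physics'], a flat list of four values that does not fit a two-column row. I could split it some other way and then add it, but I feel like I am making this 100 times harder than it actually is. To summarize: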
for every row in df:
    if the cell of the 'Interest' column is in the sport tuple:
        add the row to the sportdf
    if it is not (elif):
        add the row to the sciencedf
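
For reference, the summary above could be sketched without an explicit loop by using a boolean mask. This is only one possible approach, and it assumes that every row whose Interest is not in sport belongs in sciencedf (as the else branch implies) and that renumbering the index from 0 is acceptable. The name is_sport is introduced here purely for illustration.

```python
import pandas as pd

sport = ('basketball', 'volleyball', 'football')

data = [['tom', 'volleyball'], ['nick', 'chemistry'], ['juli', 'physics']]
df = pd.DataFrame(data, columns=['Name', 'Interest'])

# Boolean Series: True where the Interest value is one of the sports.
# (is_sport is a hypothetical helper name, not part of the original code.)
is_sport = df['Interest'].isin(sport)

# Rows matching the mask go to sportdf; all remaining rows go to sciencedf.
# Assumption: anything that is not a sport counts as science.
sportdf = df[is_sport].reset_index(drop=True)
sciencedf = df[~is_sport].reset_index(drop=True)

print(sportdf)
print(sciencedf)
```

Because the selection happens on whole rows, there is no need for the intermediate lists s and q, so the issue of q accumulating values across iterations does not arise.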