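I want to fill the values in the city column with the first word of the venue column.
I tried
df.city.fillna(value=df.venue.str.split()[0])
but it uses the value from the first row for the fill. Thanks in advance.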
Starting from your DataFrame:
>>> import pandas as pd
>>> from io import StringIO
>>> df = pd.read_csv(StringIO("""
id,city,venue
2343242,NaN,Sharjah Cricket Stadium
4354534,NaN,Dubai Internationnl Cricket Stadium
4564564,NaN,Dubai Internationnl Cricket Stadium
3454355,NaN,Sharjah Cricket Stadium
5676575,NaN,Sharjah Cricket Stadium"""))
>>> df
        id  city                                venue
0  2343242   NaN              Sharjah Cricket Stadium
1  4354534   NaN  Dubai Internationnl Cricket Stadium
2  4564564   NaN  Dubai Internationnl Cricket Stadium
3  3454355   NaN              Sharjah Cricket Stadium
4  5676575   NaN              Sharjah Cricket Stadium
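After the split() you used, we can map over the result to take the first element of each list and assign it to the NaN values in the city column, as intended: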
>>> df['city'] = df['city'].fillna(value=df['venue'].str.split().map(lambda x: x[0]))
>>> df
        id     city                                venue
0  2343242  Sharjah              Sharjah Cricket Stadium
1  4354534    Dubai  Dubai Internationnl Cricket Stadium
2  4564564    Dubai  Dubai Internationnl Cricket Stadium
3  3454355  Sharjah              Sharjah Cricket Stadium
4  5676575  Sharjah              Sharjah Cricket Stadium
Edit:
A shorter version, thanks to @HenryEcker:
>>> df['city'] = df['city'].fillna(value=df['venue'].str.split().str[0])
>>> df
        id     city                                venue
0  2343242  Sharjah              Sharjah Cricket Stadium
1  4354534    Dubai  Dubai Internationnl Cricket Stadium
2  4564564    Dubai  Dubai Internationnl Cricket Stadium
3  3454355  Sharjah              Sharjah Cricket Stadium
4  5676575  Sharjah              Sharjah Cricket Stadium
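To see why the attempt in the question behaved differently, it may help to compare the two indexing steps side by side. This is only a small sketch on a two-row frame built for illustration: indexing the split result with [0] looks up the row labelled 0, while .str[0] takes element 0 of every row's list.

```python
import pandas as pd

# Two-row frame for illustration only, reusing venue names from the question.
df = pd.DataFrame({
    "city": [None, None],
    "venue": ["Sharjah Cricket Stadium", "Dubai Internationnl Cricket Stadium"],
})

words = df["venue"].str.split()   # Series holding one list of words per row

first_row = words[0]              # row with index label 0: a single list
first_words = words.str[0]        # element 0 of each row's list: a Series

print(first_row)                  # the whole word list of the first row
print(first_words.tolist())       # the first word of every row
```

Passing first_words to fillna gives one fill value per row, which is what the question asks for.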
You can use str.split with the parameter expand=True to expand the split words into separate columns, then pass the first column, 0, to .fillna on the city column, like this:
df['city'] = df['city'].fillna(df['venue'].str.split(' ', expand=True)[0])
Or split into lists with the default expand=False and use str[0] to take the first item of each list:
df['city'] = df['city'].fillna(df['venue'].str.split().str[0])
This way, we don't need a non-vectorized lambda or apply function.
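As a quick check that the two variants agree, here is a minimal sketch that rebuilds the sample data from the question with pd.DataFrame (an assumption made for brevity, instead of read_csv) and compares the results.

```python
import pandas as pd

# Sample data from the question, rebuilt directly for this sketch.
df = pd.DataFrame({
    "id": [2343242, 4354534, 4564564, 3454355, 5676575],
    "city": [None] * 5,
    "venue": [
        "Sharjah Cricket Stadium",
        "Dubai Internationnl Cricket Stadium",
        "Dubai Internationnl Cricket Stadium",
        "Sharjah Cricket Stadium",
        "Sharjah Cricket Stadium",
    ],
})

# Variant 1: expand the words into columns and take column 0.
via_expand = df["city"].fillna(df["venue"].str.split(" ", expand=True)[0])

# Variant 2: keep the lists and take element 0 of each one.
via_str = df["city"].fillna(df["venue"].str.split().str[0])

print(via_expand.tolist())
print(via_str.tolist())
```

Both lists should contain the same first words, so the choice between the two is mostly a matter of taste for data like this.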
You could try something like this:
df['city'] = df.venue.apply(lambda x: x.split()[0])
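One thing to keep in mind is that a direct assignment like this replaces every value in city, not only the missing ones. The sketch below uses a hypothetical frame in which one city is already filled in ("Abu Dhabi" is made up for illustration) to show the difference from wrapping the same result in fillna.

```python
import pandas as pd

# Hypothetical frame: the first row already has a city value.
df = pd.DataFrame({
    "city": ["Abu Dhabi", None],
    "venue": ["Sharjah Cricket Stadium", "Dubai Internationnl Cricket Stadium"],
})

# First word of each venue, computed with apply as in the line above.
first_word = df["venue"].apply(lambda x: x.split()[0])

overwritten = first_word                  # what direct assignment would store
filled = df["city"].fillna(first_word)    # only missing cities are replaced

print(overwritten.tolist())
print(filled.tolist())
```

If every city is NaN, as in the question, both give the same result. Note also that the lambda calls x.split() on each value, so it would raise an error if venue itself contained missing values.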