I have a pandas DataFrame with a column of integers that contains some NaNs. I want to convert the values from integer to string and replace the NaNs with a description like 'not available'.
The main reason is that I need to run groupbys on that column and, unless I convert the NaNs, the groupby will drop those rows! Why that even happens, and how the whole pandas community has not risen up in arms, is a totally separate discussion (when I first learnt about it I couldn't believe it...).
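For reference, here is a minimal sketch of the groupby behaviour described above. The toy frame and its column names are made up for illustration, and the `dropna=False` argument assumes pandas 1.1 or later:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one missing key in column 'a'.
toy = pd.DataFrame({"a": [1.0, 1.0, np.nan, 2.0], "b": [10, 20, 30, 40]})

# Default behaviour: the row whose key is NaN is left out of the result.
default_sums = toy.groupby("a")["b"].sum()

# dropna=False (assumes pandas >= 1.1) keeps NaN as its own group.
kept_sums = toy.groupby("a", dropna=False)["b"].sum()

print(default_sums)
print(kept_sums)
```

If that argument is available, the NaN rows can be kept without converting the column at all, although the group label then stays NaN rather than a readable description.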
I have tried the code below, but it doesn't work. Note that I have tried both astype(str) and astype('str'). In both cases the column gets converted to object, not to string; maybe because pandas assumes (wrongly, they all have the same length in my dataframe) that the length of the strings varies? But, most importantly, the fillna() doesn't work, and the NaNs stay NaNs! Why?
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(1, 10, (10000, 5)), columns=['a', 'b', 'c', 'd', 'e'])
df.iloc[0, 0] = np.nan  # put a single NaN into column 'a'
df['a'] = df['a'].astype(str)
df['a'] = df['a'].fillna('not available')  # expected to replace the NaN, but has no visible effect
print(df.dtypes)
print(df.head())
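A possible explanation, sketched on a small stand-in series rather than the real dataframe: the string form of NaN is the text 'nan', so in pandas versions where astype(str) yields an object column, the conversion leaves nothing for fillna() to find. The series `s` and the variable names below are hypothetical, and the alternative order assumes the nullable "Int64" and "string" dtypes are available (pandas 1.0 or later):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for column 'a': integers plus one NaN, stored as float.
s = pd.Series([np.nan, 3.0, 7.0])

# The string form of NaN is the text 'nan', which fillna() does not treat as missing.
nan_text = str(np.nan)

# Original order: convert first, fill second.
converted_first = s.astype(str).fillna("not available")

# Alternative order (assumption: nullable dtypes are available): keep the value
# missing through the conversion, then fill it.
filled = s.astype("Int64").astype("string").fillna("not available")

print(nan_text)
print(converted_first.tolist())
print(filled.tolist())
```

Going through "Int64" also avoids the trailing ".0" that a float column would otherwise carry into the strings. The resulting dtype is "string" rather than object, which may or may not matter for the groupby.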