Please explain how NaNs are treated in pandas, because the following logic seems "broken" to me. I tried various ways (shown below) to drop the empty values.
My dataframe, which I load from a CSV file using read_csv, has a column comments, which is empty most of the time.
The column marked_results.comments looks like this. All the rest of the column is NaN, so pandas loads empty entries as NaN; so far so good:
0 VP
1 VP
2 VP
3 TEST
4 NaN
5 NaN
....
Now I try to drop those entries, but only this works:
marked_results.comments.isnull()
All these don't work:
- marked_results.comments.dropna() only gives the same column; nothing gets dropped. Confusing.
- marked_results.comments == NaN only gives a series of all Falses. Nothing was NaN... confusing.
- Likewise marked_results.comments == nan
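The behaviour described in these attempts can be reproduced with a small stand-in for the column. This is only a sketch: the Series below is hypothetical sample data, not the real marked_results frame. It illustrates two things that are likely at play here: NaN never compares equal to anything (including itself), and dropna() returns a new Series rather than changing the original.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for marked_results.comments (assumed sample data)
comments = pd.Series(["VP", "VP", "VP", "TEST", np.nan, np.nan], name="comments")

# Under IEEE 754, NaN is not equal to anything, so == never matches it
eq_mask = comments == np.nan
print(eq_mask.any())   # False

# isnull() is the dedicated missing-value test
null_mask = comments.isnull()
print(null_mask.sum())  # 2

# dropna() returns a new Series; the original is left untouched
dropped = comments.dropna()
print(len(comments), len(dropped))  # 6 4
```

If the result of dropna() is not assigned to a name, re-inspecting the original column will still show the NaN rows, which may be why nothing appeared to be dropped.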
I also tried:
comments_values = marked_results.comments.unique()
array(['VP', 'TEST', nan], dtype=object)
# Ah, gotcha! So now I've tried:
marked_results.comments == comments_values[2]
# but still all the results are False!
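For comparison, here is a minimal sketch of the same experiment with a boolean mask in place of the equality test. The Series is again hypothetical sample data, and nan_value and kept are names introduced only for illustration. The value pulled out of unique() is still NaN, so comparing against it fails for the same reason as comparing against a literal nan.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for marked_results.comments (assumed sample data)
comments = pd.Series(["VP", "VP", "VP", "TEST", np.nan, np.nan], name="comments")

comments_values = comments.unique()
nan_value = comments_values[2]  # the nan entry returned by unique()

# The extracted value is not even equal to itself
print(nan_value == nan_value)          # False
print(pd.isnull(nan_value))            # True
print((comments == nan_value).any())   # False

# Filtering with a notnull() mask keeps only the non-missing rows
kept = comments[comments.notnull()]
print(kept.tolist())  # ['VP', 'VP', 'VP', 'TEST']
```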