As a beginner in Python, I am trying to carry R-like syntax over to data wrangling tasks on a pandas DataFrame, but I have not managed to do this for an ifelse statement inside a mutate call.

# R code
df <- data.frame(var1 = c('2020-12-01','2020-12-02',NA,NA,'2020-12-05'), 
                 var2 = c('start','start','start','start','start'),
                 stringsAsFactors = FALSE)

df <- df %>% dplyr::mutate(var2 = ifelse(!is.na(var1), 'complete', var1))

Any suggestions on how to get the same result with Python syntax?

1 Answer

Try numpy.where

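# keep var1 as the else-branch (as in the R call) so missing values stay NaN instead of the text 'nan'
df['var2'] = np.where(df['var1'].notna(), 'complete', df['var1'])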
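
One thing worth checking with np.where is the type of the result: when one branch is a float NaN and the other is a string, NumPy may promote both to strings, so the missing rows end up holding text rather than a real missing value. The sketch below is a self-contained check that uses the sample data from this answer and passes var1 itself as the else-branch; the name missing_flags is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "var1": ["2020-12-01", "2020-12-02", np.nan, np.nan, "2020-12-05"],
    "var2": ["start"] * 5,
})

# ifelse(!is.na(var1), 'complete', var1) written with np.where:
# condition first, then the value when True, then the value when False.
df["var2"] = np.where(df["var1"].notna(), "complete", df["var1"])

# Rows 2 and 3 should be genuinely missing, which isna() can confirm.
missing_flags = df["var2"].isna().tolist()
print(missing_flags)  # [False, False, True, True, False]
```

If isna() reported False for every row, the column would be holding the string 'nan' and later calls such as dropna() or fillna() would skip those rows.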

Or, another option that is closer to base R: create a logical index and use it for the assignment

i1 = df.var1.isnull() 
df.loc[i1, 'var2'] = np.nan
df.loc[~i1, 'var2'] = 'complete'

-output

df
#         var1  var2
#0  2020-12-01  complete
#1  2020-12-02  complete
#2         NaN       NaN
#3         NaN       NaN
#4  2020-12-05  complete
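
For comparison, the same result can probably be reached without NumPy by using pandas' own Series.mask, which replaces values wherever a condition is True. This is an alternative to the two approaches in this answer, not part of it, and the name has_date is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "var1": ["2020-12-01", "2020-12-02", np.nan, np.nan, "2020-12-05"],
    "var2": ["start"] * 5,
})

# True where var1 holds a value, False where it is missing.
has_date = df["var1"].notna()

# mask() swaps in 'complete' where has_date is True and keeps the
# original var1 value (here NaN) everywhere else.
df["var2"] = df["var1"].mask(has_date, "complete")

print(df)
```

Because the untouched rows are copied from var1, the missing values stay as NaN, matching the output shown above.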

data

import numpy as np
import pandas as pd
df = pd.DataFrame({"var1":['2020-12-01','2020-12-02',np.nan,np.nan,'2020-12-05'],
        "var2": ['start','start','start','start','start']})
Answered 2021-02-06T20:47:16.760