r - Look for NA with a condition, then replace NA with a value that depends on the condition

Question

I couldn't figure this out.

I have a data frame that looks like this (only the first 10 rows are shown):

Value   Type
NA       3      
23       2
54       1
45       1
21       2
55       3
67       3
78       1
10       1
NA       2

Task:

Replace each NA with the mean Value of its Type. For example, the first NA is in Type 3, so I'd like to replace it with the average of the Type 3 values, that is (55+67)/2 = 61.

My code:

for (i in 1:nrow(df)){
  if(is.na(df[i,"Value"])==TRUE & Type==1){
    df[i,"Value"] = mean(with(df, subset(Value, Type==1)))
  }
  else if (is.na(df[i,"Value"])==TRUE & Type==2){
    df[i,"Value"] = mean(with(df, subset(Value, Type==2)))
  }
  else if (is.na(df[i,"Value"])==TRUE & Type==3){
    df[i,"Value"] = mean(with(df, subset(Value, Type==3)))
  }
  else (df[i,"Value"] = df[i,"Value"])
}

Result

NAs are still present in the Value column; they are not being replaced by the mean value of their Type.

Any help is appreciated!

score 2 · Accepted Answer

library(plyr) 

ddply(dat, .(Type), function(df){
  m <- mean(df$Value, na.rm=TRUE)
  df$Value[is.na(df$Value)] <- m
  df
})
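
The important detail in the code above is na.rm=TRUE: mean() without it returns NA for any group that contains an NA, so a fill based on a plain mean() puts NA back into the NA slots. For anyone who wants to try the same idea without plyr, here is a minimal sketch in base R using ave(). The data frame is rebuilt from the 10 rows shown in the question, and the names group_mean and filled are only illustrative.

```r
# Sample data copied from the question (first 10 rows only)
dat <- data.frame(
  Value = c(NA, 23, 54, 45, 21, 55, 67, 78, 10, NA),
  Type  = c(3, 2, 1, 1, 2, 3, 3, 1, 1, 2)
)

# For every row, the mean of Value within that row's Type, ignoring NAs
group_mean <- ave(dat$Value, dat$Type,
                  FUN = function(v) mean(v, na.rm = TRUE))

# Copy the data and overwrite only the missing entries
filled <- dat
missing <- is.na(dat$Value)
filled$Value[missing] <- group_mean[missing]

print(filled)
```

On these 10 rows the Type 3 NA becomes (55+67)/2 = 61 and the Type 2 NA becomes (23+21)/2 = 22.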

score 0 · Accepted Answer

Here is a two-liner in base R, assuming X is your data.frame:

Means <- tapply(X$Value, X$Type, mean, na.rm=TRUE)
X$Value <- apply(X, 1, function(r) ifelse(is.na(r[1]), Means[r[2]], r[1]))

For large data sets this may be faster than using ddply, although the plyr and data.table packages are more general and certainly worth learning.
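
One caveat, as far as I can tell: with an all-numeric data frame, Means[r[2]] indexes Means by position, which only matches the right group when the Type values are exactly 1, 2, 3, and so on. A hedged variant that looks the mean up by name instead, and avoids the row-wise apply(), might look like this. The sample data is rebuilt from the question, and the variable missing is just an illustrative name.

```r
# Sample data copied from the question (first 10 rows only)
X <- data.frame(
  Value = c(NA, 23, 54, 45, 21, 55, 67, 78, 10, NA),
  Type  = c(3, 2, 1, 1, 2, 3, 3, 1, 1, 2)
)

# Per-Type means, NAs ignored; the result is named by Type ("1", "2", "3")
Means <- tapply(X$Value, X$Type, mean, na.rm = TRUE)

# Look up each missing row's mean by the name of its Type
missing <- is.na(X$Value)
X$Value[missing] <- unname(Means[as.character(X$Type[missing])])

print(X)
```

Because the lookup goes through as.character(), it should also work when Type holds arbitrary labels rather than consecutive integers.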
