2

I have some factor values that indicate circulation patterns:

BM HB HFA HFZ HM HNA HNFA HNFZ HNZ NEA NEZ NWA NWZ NA NZ SA SEA SEZ SWA SWZ SZ TB TM TRM TRW U WA WS WW WZ

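One of the factor levels is the circulation pattern called NA. When I work with the data, R interprets the NA pattern as a missing value. Is there a way to tell R that this NA is a legitimate value?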

Here is some sample data:

   df <- structure(list(data = structure(list(sec = c(0, 0, 0, 0, 0, 0, 
0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    hour = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), mday = c(21L, 
    21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L), mon = c(0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), year = 46:55, wday = c(1L, 
    2L, 3L, 5L, 6L, 0L, 1L, 3L, 4L, 5L), yday = c(20L, 20L, 20L, 
    20L, 20L, 20L, 20L, 20L, 20L, 20L), isdst = c(0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("sec", "min", "hour", 
"mday", "mon", "year", "wday", "yday", "isdst"), class = c("POSIXlt", 
"POSIXt")), gwl = structure(c(4L, NA, 24L, 14L, 4L, 14L, 12L, 
13L, 14L, 2L), .Label = c("", "BM", "HB", "HFA", "HFZ", "HM", 
"HNA", "HNFA", "HNFZ", "HNZ", "NEA", "NEZ", "NWA", "NWZ", "NZ", 
"SA", "SEA", "SEZ", "SWA", "SWZ", "SZ", "TB", "TM", "TRM", "TRW", 
"U", "WA", "WS", "WW", "WZ"), class = "factor")), .Names = c("data", 
"gwl"), row.names = 2546:2555, class = "data.frame")

2 Answers

1

Yes, if you introduce it to the factor as a character string, i.e. as the level "NA".

levels(df$gwl) <- c(levels(df$gwl), "NA")
df$gwl[is.na(df$gwl)] <- as.factor("NA")

Testing it:

> table(is.na(df$gwl))

FALSE 
   56 
> table(df$gwl=="NA")

FALSE  TRUE 
   55     1 
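
For a quick self-contained check, here is a minimal sketch of the same two steps on a toy factor. The gwl vector below is made up for illustration and is not the asker's data.

```r
# Hypothetical toy factor; the second element is a genuine missing value.
gwl <- factor(c("HFA", NA, "TM", "NWZ"))

# Step 1: register "NA" (a string) as an additional level.
levels(gwl) <- c(levels(gwl), "NA")

# Step 2: relabel the missing entries with that new level.
gwl[is.na(gwl)] <- as.factor("NA")

anyNA(gwl)   # no missing values remain
table(gwl)   # "NA" is now counted like any other level
```

After these two steps, comparisons such as gwl == "NA" behave like comparisons against any other level.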
answered 2013-08-13T17:57:43.580
1

?factor

If you need to replace NA values in a character vector in R, do this:

 vec[is.na(vec)] <- "NA"

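In your case it is a bit more complicated because the column is a factor. SeñorO's answer is correct as far as adding an "NA" level goes, although I don't think the as.factor call is needed. The main point to understand is that "NA" is not the same as NA_character_.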
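
To make that distinction concrete, here is a small sketch. The vec and f objects are hypothetical examples, not the asker's data.

```r
# The string "NA" is ordinary data; NA_character_ is a missing value.
is.na("NA")           # FALSE
is.na(NA_character_)  # TRUE

# Hypothetical character vector with one missing entry.
vec <- c("HFA", NA, "TM")
vec[is.na(vec)] <- "NA"

# For a factor, once "NA" exists as a level, a plain string can be assigned;
# wrapping it in as.factor() is not required.
f <- factor(c("HFA", NA, "TM"))
levels(f) <- c(levels(f), "NA")
f[is.na(f)] <- "NA"
```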

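When reading in the data, you should use colClasses=c("POSIXct", "character") so that you do not end up with that POSIXlt column. It will cause errors that are hard to understand. You should avoid using POSIXlt as a data.frame column class.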
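
A rough sketch of what that could look like, assuming the data come from a whitespace-separated text file with a date column and a pattern column. The inline text and the column names are invented for illustration. The na.strings argument is an addition of this sketch rather than part of the advice above: by default read.table treats the string "NA" as missing, so the default has to be overridden if the pattern NA should survive the import.

```r
# Hypothetical input; in practice this would be a file path.
txt <- "date gwl
1946-01-21 HFA
1947-01-21 NA
1948-01-21 TM"

dat <- read.table(text = txt, header = TRUE,
                  colClasses = c("POSIXct", "character"),
                  na.strings = "")   # assumption: only empty fields are missing

class(dat$date)   # a POSIXct column rather than POSIXlt
dat$gwl           # "NA" is kept as an ordinary string
```

Reading the pattern column as character and converting it with factor() afterwards would then keep "NA" as a regular level.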

answered 2013-08-13T18:06:23.490