117

I am trying to specify the colClasses option in the read.csv function in R. In my data, the first column, time, is a character vector, while the rest of the columns are numeric.

data <- read.csv("test.csv", comment.char="" , 
                 colClasses=c(time="character", "numeric"), 
                 strip.white=FALSE)

In the above command, I want R to read the time column as "character" and the rest as numeric. Although the data variable held the correct result after the command completed, R returned the following warnings. How can I fix these warnings?

Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, : not all columns named in 'colClasses' exist
2: In tmp[i[i > 0L]] <- colClasses : number of items to replace is not a multiple of replacement length

Derek

4

7 Answers

192

You can specify colClasses for just one column.

So in your example you should use:

data <- read.csv('test.csv', colClasses=c("time"="character"))
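
A minimal sketch to check this behaviour on your own machine. The file contents and the column names `a` and `b` are made up for illustration; only the `time` column name comes from the question.

```r
# Write a small sample file (hypothetical data standing in for test.csv).
tmp <- tempfile(fileext = ".csv")
writeLines(c("time,a,b",
             "0930,1.5,2",
             "1000,3.5,4"), tmp)

# Name only the column whose class should be forced; columns that are
# not named are left for read.csv to detect on its own.
data <- read.csv(tmp, colClasses = c(time = "character"))

# Inspect the resulting class of every column.
print(sapply(data, class))
```

Because `time` is forced to character, values such as "0930" keep their leading zero instead of being converted to a number. The unnamed columns are auto-detected, so a column holding only whole numbers may come back as integer rather than numeric.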
answered 2011-11-18T16:38:20.823
85

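An unnamed colClasses vector should have one entry per imported column (a shorter unnamed vector is recycled). Assuming the rest of your dataset has 5 columns: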

colClasses=c("character",rep("numeric",5))
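
If the number of columns is not known in advance, one possible approach is to count them from the header first. This is only a sketch: the sample data and the helper names `n_cols` and `classes` are assumptions introduced here, not part of the original answer.

```r
# Write a small sample file (hypothetical data standing in for test.csv).
tmp <- tempfile(fileext = ".csv")
writeLines(c("time,a,b,c",
             "00:01,1,2,3",
             "00:02,4,5,6"), tmp)

# Read the header plus one data row just to count the columns.
header <- read.csv(tmp, nrows = 1, header = TRUE)
n_cols <- ncol(header)

# First column character, every remaining column numeric.
classes <- c("character", rep("numeric", n_cols - 1))

data <- read.csv(tmp, colClasses = classes)
print(sapply(data, class))
```

Since the vector is unnamed and has exactly one entry per column, it avoids both warnings from the question.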
answered 2010-05-10T18:36:57.770
14

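Assuming your 'time' column has at least one observation with a non-numeric character and all your other columns contain only numbers, then in R versions before 4.0.0 the 'read.csv' default is to read 'time' as a 'factor' and the rest of the columns as 'numeric'. Setting 'stringsAsFactors=FALSE' therefore gives the same result as setting 'colClasses' manually (from R 4.0.0 onward, FALSE is already the default), i.e.

data <- read.csv('test.csv', stringsAsFactors=FALSE)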
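
A quick way to confirm what this produces, as a sketch on made-up sample data (the column name `a` and the values are assumptions for illustration):

```r
# Write a small sample file (hypothetical data standing in for test.csv).
tmp <- tempfile(fileext = ".csv")
writeLines(c("time,a",
             "00:01,1.5",
             "00:02,2.5"), tmp)

# No colClasses: column types are auto-detected, and strings are kept
# as character instead of being converted to factor.
data <- read.csv(tmp, stringsAsFactors = FALSE)
print(sapply(data, class))
```

This relies on type detection, so a 'time' column that happened to contain only digits would still be read as a number; forcing the class with 'colClasses' does not have that limitation.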
answered 2010-05-10T23:19:32.360
10

If you want to refer to names from the header rather than column numbers, you can use something like this:

fname <- "test.csv"
headset <- read.csv(fname, header = TRUE, nrows = 10)
classes <- sapply(headset, class)
classes[names(classes) %in% c("time")] <- "character"
dataset <- read.csv(fname, header = TRUE, colClasses = classes)
answered 2011-12-19T19:53:44.003
8

I know the OP asked about the utils::read.csv function, but let me provide an answer for those who come here searching for how to do this with readr::read_csv from the tidyverse.

read_csv("test.csv", col_types = cols(.default = "c", time = "i"))

This should set the default type for all columns to character, while time would be parsed as integer.

answered 2018-09-14T16:41:31.573
4

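For multiple datetime columns with no header, and a lot of columns, say my datetime fields are in columns 36 and 38 and I want them read in as character fields: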

data <- read.csv("test.csv", header = FALSE, colClasses = c("V36" = "character", "V38" = "character"))
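
The same idea on a file small enough to inspect, as a rough sketch: the three-column sample data is invented for illustration, and `V2` plays the role that `V36` and `V38` play above.

```r
# Write a small headerless sample file (hypothetical data).
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,2011-11-18 16:38:20,3",
             "4,2011-12-19 19:53:44,6"), tmp)

# With header = FALSE, read.csv names the columns V1, V2, V3, ...
# so those generated names can be used in colClasses.
data <- read.csv(tmp, header = FALSE, colClasses = c(V2 = "character"))
print(sapply(data, class))
```

Only the named column is forced to character; the other columns are detected automatically.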
answered 2017-05-10T21:50:52.500
0

If we combine what @Hendy and @Oddysseus Ithaca contributed, we get a cleaner and more general (i.e., adaptable?) chunk of code.

data <- read.csv("test.csv", header = FALSE, colClasses = c(V36 = "character", V38 = "character"))
answered 2018-11-02T17:35:21.863