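read.table with fill=TRUE can pad the shorter rows so that every row ends up with the same number of page columns. The names(DF2) <- line can be omitted if nice column names are not important. No packages are used.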
# test data
Lines <- "pages count
[page 1, page 2, page 3] 23
[page 2, page 4] 4
[page 1, page 3, page 4] 12"
# code - replace text=Lines with something like "myfile.dat"
DF <- read.table(text = Lines, skip = 1, sep = "]", as.is = TRUE)
DF2 <- read.table(text = DF[[1]], sep = ",", fill = TRUE, as.is = TRUE)
names(DF2) <- paste0(read.table(text = Lines, nrows = 1, as.is = TRUE)[[1]], seq_along(DF2))
DF2$count <- DF[[2]]
DF2[[1]] <- sub(".", "", DF2[[1]]) # remove [
This gives:
> DF2
  pages1  pages2  pages3 count
1 page 1  page 2  page 3    23
2 page 2  page 4             4
3 page 1  page 3  page 4    12
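Because the fields are split on "," alone, every page value after the first keeps the blank that followed the comma (for example " page 2"). If that matters, one possible variation is to pass strip.white = TRUE to the second read.table call. The following is only a sketch on the same sample data, not part of the original approach; it also removes the bracket with a fixed-string match instead of the regular expression ".".

```r
# Same sample data as above
Lines <- "pages count
[page 1, page 2, page 3] 23
[page 2, page 4] 4
[page 1, page 3, page 4] 12"

# Split each line at "]": the page list goes to column 1, the count to column 2
DF <- read.table(text = Lines, skip = 1, sep = "]", as.is = TRUE)

# strip.white = TRUE trims leading/trailing blanks from each comma-separated field
DF2 <- read.table(text = DF[[1]], sep = ",", fill = TRUE, as.is = TRUE,
                  strip.white = TRUE)

DF2[[1]] <- sub("[", "", DF2[[1]], fixed = TRUE)  # remove the leading [
DF2$count <- DF[[2]]
```

Rows with fewer pages are still padded with empty strings by fill = TRUE; only the surrounding whitespace changes.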
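Note: this produces the column headings pages1, pages2, etc. If it is important to have the column headings exactly as shown in the question, replace the names(DF2) <- line with the following, which uses those headings when there are fewer than 20 page columns and falls back to Page 1, Page 2, etc. otherwise.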
ord <- c('First', 'Second', 'Third', 'Fourth', 'Fifth', 'Sixth', 'Seventh',
'Eighth', 'Ninth', 'Tenth', 'Eleventh', 'Twelfth', 'Thirteenth',
'Fourteenth', 'Fifteenth', 'Sixteenth', 'Seventeenth', 'Eighteenth',
'Nineteenth')
ix <- seq_along(DF2)
names(DF2) <- if (ncol(DF2) < 20) paste(ord[ix], "Page") else paste("Page", ix)
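To see what the two branches produce, the same logic can be wrapped in a small function. This is an illustrative sketch only: page_headings is a hypothetical name, not something from the answer above, and it takes the number of page columns rather than the data frame.

```r
# Hypothetical helper: build headings for n page columns
page_headings <- function(n) {
  ord <- c("First", "Second", "Third", "Fourth", "Fifth", "Sixth", "Seventh",
           "Eighth", "Ninth", "Tenth", "Eleventh", "Twelfth", "Thirteenth",
           "Fourteenth", "Fifteenth", "Sixteenth", "Seventeenth", "Eighteenth",
           "Nineteenth")
  ix <- seq_len(n)
  # Ordinal words cover up to 19 columns; beyond that, fall back to numbers
  if (n < 20) paste(ord[ix], "Page") else paste("Page", ix)
}

few  <- page_headings(3)   # "First Page" "Second Page" "Third Page"
many <- page_headings(20)  # "Page 1" "Page 2" ... "Page 20"
```

As with the original names(DF2) <- line, the assignment should run before DF2$count is added, otherwise the count column would be renamed as well.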