r - Importing a contingency table (.csv format) into R as a "table" rather than a "data.frame"

Question

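I am working with the very cool Titanic data, which is (I believe) publicly available.

There are two main ways to get it into R:

(1) You can use the built-in dataset Titanic (library(datasets)), or

(2) You can download it as a .csv file, for example here.

The data are aggregated frequency data. I want to convert the multidimensional contingency table into an individual-level data frame, with one row per person.

Problem: this works fine with the built-in dataset, but it does not work with the imported .csv file. This is the error message I get: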

Error in rep(1:nrow(tablevars), counts) : invalid 'times' argument
In addition: Warning message:
In expand.table(Titanic.table) : NAs introduced by coercion

Why? What am I doing wrong? Many thanks.

Code

#required packages
library(datasets)
library(epitools)

#(1) Expansion of built-in data set
data(Titanic)    
Titanic.raw <- Titanic
class(Titanic.raw) # data is stored as "table"
Titanic.expand <- expand.table(Titanic.raw)

#(2) Expansion of imported data set
Titanic.raw <- read.table("Titanic.csv", header=TRUE, sep=",", row.names=1)
class(Titanic.raw) #data is stored as "data.frame"

Titanic.table <- as.table(as.matrix(Titanic.raw)) 
class(Titanic.table) #data is stored as "table"

Titanic.expand <- expand.table(Titanic.table)

score 2 · Accepted Answer

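I think you may want xtabs. Note that the factor coding differs between the Titanic and Titanic.new objects: by default, factor levels are in lexical order, whereas two of the Titanic factors (Sex and Age) are not: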

 str(Titanic)
 table [1:4, 1:2, 1:2, 1:2] 0 0 35 0 0 0 17 0 118 154 ...
 - attr(*, "dimnames")=List of 4
  ..$ Class   : chr [1:4] "1st" "2nd" "3rd" "Crew"
  ..$ Sex     : chr [1:2] "Male" "Female"
  ..$ Age     : chr [1:2] "Child" "Adult"
  ..$ Survived: chr [1:2] "No" "Yes"

 Titanic.raw <- read.table("~/Downloads/Titanic.csv", header=TRUE, sep=",", row.names=1)

 str( Titanic.new <- 
               xtabs( Freq ~ Class + Sex + Age +Survived, data=Titanic.raw))

 xtabs [1:4, 1:2, 1:2, 1:2] 4 13 89 3 118 154 387 670 0 0 ...
 - attr(*, "dimnames")=List of 4
  ..$ Class   : chr [1:4] "1st" "2nd" "3rd" "Crew"
  ..$ Sex     : chr [1:2] "Female" "Male"
  ..$ Age     : chr [1:2] "Adult" "Child"
  ..$ Survived: chr [1:2] "No" "Yes"
 - attr(*, "class")= chr [1:2] "xtabs" "table"
 - attr(*, "call")= language xtabs(formula = Freq ~ Class + Sex + Age + Survived, data = Titanic.raw)
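
If the original level order matters downstream, one option is to set the factor levels explicitly before calling xtabs. The following is only a sketch: it uses as.data.frame(Titanic) as a stand-in for the imported file, on the assumption that the .csv has the columns Class, Sex, Age, Survived and Freq; titanic_df and titanic_tab are illustrative names.

```r
# Stand-in for the imported CSV (assumption: columns Class, Sex, Age, Survived, Freq).
titanic_df <- as.data.frame(Titanic, stringsAsFactors = FALSE)

# Re-level each classifying column using the order stored in the built-in table.
for (v in names(dimnames(Titanic))) {
  titanic_df[[v]] <- factor(titanic_df[[v]], levels = dimnames(Titanic)[[v]])
}

# xtabs keeps the level order of columns that are already factors.
titanic_tab <- xtabs(Freq ~ Class + Sex + Age + Survived, data = titanic_df)
dimnames(titanic_tab)$Sex  # "Male" "Female", matching the built-in order
```

With the levels fixed this way, the cell counts line up position by position with the built-in Titanic table.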

'xtabs' objects inherit from the 'table' class, so you can use the expand.table function on them.
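
A likely reason the original attempt failed: as.matrix() on a data frame that mixes text and numeric columns returns a character matrix, so the counts stop being numbers once that matrix is wrapped with as.table(). The sketch below illustrates this and also shows a base-R way to expand the table that does not need epitools. It again assumes the .csv has the columns Class, Sex, Age, Survived and Freq, and uses as.data.frame(Titanic) in place of the file; the object names are illustrative.

```r
# Stand-in for read.table("Titanic.csv", ...) (assumption: same five columns).
titanic_df <- as.data.frame(Titanic, stringsAsFactors = FALSE)

# as.matrix() on a data frame with text columns yields a character matrix,
# so the counts are no longer numeric once it is wrapped with as.table().
bad_tab <- as.table(as.matrix(titanic_df))
typeof(bad_tab)  # "character"

# Building the table from the Freq column keeps the counts numeric.
good_tab <- xtabs(Freq ~ Class + Sex + Age + Survived, data = titanic_df)

# Base-R expansion: repeat each row of the long-format frame Freq times.
long_df <- as.data.frame(good_tab)
expanded <- long_df[rep(seq_len(nrow(long_df)), long_df$Freq),
                    c("Class", "Sex", "Age", "Survived")]
rownames(expanded) <- NULL
nrow(expanded)  # 2201, one row per person
```

The rep() call here does the same job as the one that raised the error in the question, but it receives numeric counts.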
