
I have a .csv file that I import into R to do some calculations. What puzzles me is the warning I get:

Warning Message:
In mean.default(lace$TOTAL.LACE) :
    argument is not numeric or logical: returning NA

The reason I find this strange is:

> mode(lace$TOTAL.LACE)
[1] "numeric"

So I have no idea why this happens.

The data I am importing looks like this:

ENCOUNTER | MRN | AGE | NAME | LACE.DAYS.SCORE | LACE.ACUTE.IP.SCORE | LACE.ER.SCORE | LACE.COMORBID.SCORE | TOTAL.LACE | FAILURE | LOS | DAYS.TO.FAILURE
123       | 123 | 50  | DOE  | 6               | 3                   | 0             | 1                   | 10         | 0       | 0   | 0

The LOS column is length of stay and is nonzero only when Failure is 1. Failure is binary, 1 or 0. Days.To.Failure is likewise nonzero only when Failure is 1.

Here is my full code:

# Load the required libraries
library(aod)
library(ggplot2)
library(knitr)

lace <- read.csv("lace for R.csv")
head(lace)
summary(lace)

# This dataset has a binary response called FAILURE. There are four
# predictors of this, Length of Stay, Acute Admission, Comorbidity Score,
# ER visits in the last 6 months. There is also a TOTAL.LACE column that
# is simply the sum of the four scores

sapply(lace, sd)

# We will now make a two-way contingency table of the categorical outcome
# and predictors; we want to make sure there are no 0 cells
xtabs(~FAILURE + TOTAL.LACE, data = lace)

# We now need to convert TOTAL.LACE to a factor to indicate that
# it should be treated as a categorical variable
lace$TOTAL.LACE <- factor(lace$TOTAL.LACE)  # <-- possible problem, but mode() says numeric

mylogit <- glm(FAILURE ~ TOTAL.LACE, data = lace, family = "binomial")
summary(mylogit)

# We will here use confint to get the confidence intervals of our data
confint(mylogit)
# if there are warnings this will show them
warnings()

# Default confint.default
confint.default(mylogit)

# Wald Test of the TOTAL.LACE (rank)
wald.test(b = coef(mylogit), Sigma = vcov(mylogit), Terms = 2:15)

# Odds Ratios only
exp(coef(mylogit))

# odds ratios and 95% CI
exp(cbind("Odds Ratio" = coef(mylogit), confint(mylogit)))

Update

I was asked to run dput(lace$TOTAL.LACE); here is sample output:

8L, 7L, 5L, 7L, 6L, 6L, 6L, 6L, 4L, 7L, 4L, 5L, 7L, 7L, 6L, 3L, 
...
.Label = c("3", "4", "5", "6", "7", "8", "9", "10", 
"11", "12", "13", "14", "15", "16", "17", "18"), class = "factor")

I was also asked for str(lace); here is sample output:

'data.frame':   17044 obs. of  12 variables:
 $ ENCOUNTER          : int   ...
 $ MRN                : int   ...
 $ AGE                : int  74 87 74 57 52 60 42 84 75 79 ...
 $ NAME               : Factor w/ 11637 levels "doe",..: 3945...
 $ LACE.DAYS.SCORE    : int  6 6 6 6 6 6 6 6 6 6 ...
 $ LACE.ACUTE.IP.SCORE: int  3 3 3 3 3 3 3 3 3 3 ...
 $ LACE.ER.SCORE      : int  0 0 0 0 0 0 0 0 0 0 ...
 $ LACE.COMORBID.SCORE: int  1 0 2 0 0 0 0 0 2 1 ...
 $ TOTAL.LACE         : Factor w/ 16 levels "3","4","5","6",..: 8 7 9 7 7 7 7 ...
 $ FAILURE            : int  0 0 0 0 0 0 0 0 0 0 ...
 $ LOS                : int  0 0 0 0 0 0 0 0 0 0 ...
 $ DAYS.TO.FAILURE    : int  0 0 0 0 0 0 0 0 0 0 ...

Thank you,


1 Answer


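You have read the data from a csv, so you can tell R to stop converting strings to factors with stringsAsFactors = FALSE (see ?read.csv). Alternatively, depending on your data, you can simply convert the factor back with as.numeric, i.e. as.numeric(as.character(lace$TOTAL.LACE)).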
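To illustrate why mode() and mean() seem to disagree, here is a minimal sketch using a made-up vector in place of lace$TOTAL.LACE (the values and the names total_lace, codes and scores are assumptions for illustration only). A factor is stored as integer codes, so mode() reports "numeric", but is.numeric() is FALSE for a factor, and that is the check mean() relies on. Note that in the posted code the column appears to become a factor through the explicit factor() call, so keeping an unconverted numeric copy for mean() may be simpler than converting back.

```r
# Toy stand-in for lace$TOTAL.LACE (values are made up for illustration)
total_lace <- factor(c(10, 9, 11, 9, 3))

mode(total_lace)        # "numeric": the underlying storage is integer codes
class(total_lace)       # "factor"
is.numeric(total_lace)  # FALSE: this is why mean() refuses the input

# mean() on a factor returns NA (with the warning shown in the question)
na_result <- suppressWarnings(mean(total_lace))

# as.numeric() alone returns the internal level codes, not the scores
codes <- as.numeric(total_lace)

# Going through as.character() first recovers the original scores
scores <- as.numeric(as.character(total_lace))
mean(scores)
```

Calling as.numeric() directly on the factor is the usual pitfall here: it silently yields the level positions rather than the LACE scores, so the as.character() step should not be skipped.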

answered 2013-10-16T16:04:30.027