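I'm new to coding and I'm trying to build a decision tree, but I keep running into the same error:

Error in model.frame.default(formula = df ~ df$Open.Closed + df$Region, : invalid type (list) for variable 'df'

I've searched the whole site but couldn't find a solution that works for my problem. I've tried several fixes, but I usually end up with a different error saying that the data is a matrix, which that step won't accept. Any help would be greatly appreciated.

Here is my code: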

library(rpart.plot)
library(ggExtra)
library(gridExtra)
library(RGtk2)
library(rpart)
library(rattle)
df[] <- data.frame(lapply(Test_Bank_Model,factor))
df [col_names] <- lapply(df[col_names], factor)

str(df)
summary(df)
print(df)


tree <- rpart(df ~ df$Open.Closed + df$Region, data = df, method = "class",
          model = TRUE, control = rpart.control("minsplit" = 1))
rpart.plot(tree, roundint = FALSE, box.palette = "white")

Data:

    Region      Closing.Date  Annual.Average.FedFunds  Open.Closed
1   South       2020          0.2328571                Closed
2   Mid West    2020          0.2328571                Closed
3   North East  2020          0.2328571                Open
4   South       2020          0.2328571                Open
5   North East  2020          0.2328571                Open
6   West        2020          0.2328571                Open
7   North East  2020          0.2328571                Open
8   North East  2019          1.7366667                Closed
9   South       2019          1.7366667                Closed
10  Mid West    2019          1.7366667                Closed

1 Answer

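From the error message, I think you are using a list object where a data frame is needed.

lapply returns its result as a list. I think that is where the change of format went unnoticed.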
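To make the point about lapply concrete, here is a minimal sketch. The data frame toy is a hypothetical stand-in (not the asker's real Test_Bank_Model), and the two ways of getting back to a data frame shown at the end are common idioms rather than the only options.

```r
# Hypothetical toy data, used only for illustration
toy <- data.frame(Region = c("South", "West", "South"),
                  Open.Closed = c("Closed", "Open", "Open"),
                  stringsAsFactors = FALSE)

# lapply() walks over the columns and returns a plain list
as_list <- lapply(toy, factor)
class(as_list)           # a list, no longer a data frame
is.data.frame(as_list)

# Option 1: wrap the list back into a data frame
as_df <- as.data.frame(as_list)

# Option 2: assign into the existing frame with [] so it keeps its class
toy[] <- lapply(toy, factor)
str(toy)
```

Either option leaves you with a data frame whose columns are factors, which is what rpart expects in its data argument.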

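I created a data frame called "Test_Bank_Model", took its column names, and left "Annual.Average.FedFunds" out of the conversion to factor (I'm not sure what you want to do with the years).

You can pass the data.frame through rpart's data argument, as you did. When you do that, you can save yourself retyping the data frame name in the formula (though I don't know that retyping it is the problem; it should work too).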
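As a small illustration of how the formula and the data argument interact, the sketch below uses base R's model.frame (the function named in the error message) on a hypothetical data frame toy. It assumes the response should be a single column; putting the whole data frame on the left-hand side of ~ should trigger the same kind of "invalid type (list)" error, because a data frame is itself a list.

```r
# Hypothetical toy data, used only for illustration
toy <- data.frame(Open.Closed = factor(c("Closed", "Open", "Open", "Closed")),
                  Region      = factor(c("South", "West", "South", "West")))

# Whole data frame as the response: capture the error message instead of stopping
bad <- tryCatch(
  model.frame(toy ~ toy$Open.Closed + toy$Region, data = toy),
  error = function(e) conditionMessage(e)
)
print(bad)

# Bare column names on both sides; the data argument tells R where to find them
good <- model.frame(Open.Closed ~ Region, data = toy)
str(good)
```

With data supplied, the formula only needs column names, so there is no need for the toy$ prefix at all.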

Test_Bank_Model <- data.frame(Region = c("South","Mid West","North East","South","North East","West","North East","North East","South"),
    Closing.Date = c(rep(2020,7), 2019,2019),
    Annual.Average.FedFunds = c(0.2328571,0.2328571,0.2328571,0.2328571,0.2328571,0.2328571,0.2328571,1.7366667,1.7366667),
    Open.Closed = c("Closed","Closed","Open","Open","Open","Open","Open","Closed","Closed"))

col_names <- colnames(Test_Bank_Model)[-3]

Test_Bank_Model[,col_names] <- as.data.frame(lapply(Test_Bank_Model[,col_names], FUN=as.factor))

str(Test_Bank_Model)
# 'data.frame': 9 obs. of  4 variables:
#  $ Region                 : Factor w/ 4 levels "Mid West","North East",..: 3 1 2 3 2 4 2 2 3
#  $ Closing.Date           : Factor w/ 2 levels "2019","2020": 2 2 2 2 2 2 2 1 1
#  $ Annual.Average.FedFunds: num  0.233 0.233 0.233 0.233 0.233 ...
#  $ Open.Closed            : Factor w/ 2 levels "Closed","Open": 1 1 2 2 2 2 2 1 1

tree <- rpart(Annual.Average.FedFunds ~ Open.Closed + Region,
    data = Test_Bank_Model,
    method = "class",
    model = TRUE,
    control = rpart.control("minsplit" = 1))
rpart.plot(tree, roundint = FALSE, box.palette = "white")
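If the goal is instead to classify banks as open or closed, a variant with Open.Closed as the response might look like the sketch below. This is an assumption about what the question intends, not something it confirms; the names bank, tree2 and pred are hypothetical, and minsplit = 2 with minbucket = 1 is my own choice of control settings for such a small sample.

```r
library(rpart)

# Same nine rows as above, rebuilt here so the sketch is self-contained
bank <- data.frame(
  Region = factor(c("South", "Mid West", "North East", "South", "North East",
                    "West", "North East", "North East", "South")),
  Annual.Average.FedFunds = c(rep(0.2328571, 7), 1.7366667, 1.7366667),
  Open.Closed = factor(c("Closed", "Closed", "Open", "Open", "Open",
                         "Open", "Open", "Closed", "Closed"))
)

# Assumed target: Open.Closed, predicted from region and the funds rate
tree2 <- rpart(Open.Closed ~ Region + Annual.Average.FedFunds,
               data = bank,
               method = "class",
               control = rpart.control(minsplit = 2, minbucket = 1))

print(tree2)

# Fitted class for each of the nine training rows
pred <- predict(tree2, type = "class")
table(predicted = pred, actual = bank$Open.Closed)
```

The resulting object can be passed to rpart.plot in the same way as tree above.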


Answered 2020-11-01T18:17:21.007