I am using the gbm() function to build a model, and I want to measure its accuracy. Here is my code:
df<-read.csv("http://freakonometrics.free.fr/german_credit.csv", header=TRUE)
str(df)
F=c(1,2,4,5,7,8,9,10,11,12,13,15,16,17,18,19,20,21)
for(i in F) df[,i]=as.factor(df[,i])
library(caret)
set.seed(1000)
intrain<-createDataPartition(y=df$Creditability, p=0.7, list=FALSE)
train<-df[intrain, ]
test<-df[-intrain, ]
install.packages("gbm")
library("gbm")
df_boosting<-gbm(Creditability~.,distribution = "bernoulli", n.trees=100, verbose=TRUE, interaction.depth=4,
shrinkage=0.01, data=train)
summary(df_boosting)
yhat.boost<-predict (df_boosting ,newdata =test, n.trees=100)
mean((yhat.boost-test$Creditability)^2)
However, when I call the summary() function, an error occurs. The error message is as follows:
Error in plot.window(xlim, ylim, log = log, ...) :
  need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
Also, when I use the mean() function to compute the MSE, I get the following warning:
Warning message:
In Ops.factor(yhat.boost, test$Creditability) :
  '-' not meaningful for factors
Do you know why these two errors occur? Thank you in advance.
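
One thing that may be relevant: column 1 is included in F, so Creditability is turned into a factor before the model is fitted. The gbm documentation describes "bernoulli" as logistic regression for 0-1 outcomes, so a factor response is a likely cause of the empty summary plot, and it is certainly what triggers the Ops.factor warning. Here is a minimal base-R sketch of the second problem. It uses toy values, not the real data; y_factor, yhat, y_num, mse and acc are hypothetical names, and yhat stands in for predicted probabilities.

```r
# Toy stand-in for test$Creditability after the as.factor() loop
y_factor <- factor(c(1, 0, 1, 1, 0))

# Hypothetical predicted probabilities (assumed, not produced by gbm here)
yhat <- c(0.8, 0.3, 0.6, 0.9, 0.2)

# Arithmetic on a factor returns NA and warns "'-' not meaningful for factors"
bad <- suppressWarnings(yhat - y_factor)

# Convert the factor labels back to numeric 0/1 before doing arithmetic
y_num <- as.numeric(as.character(y_factor))

# Mean squared error between probabilities and 0/1 labels
mse <- mean((yhat - y_num)^2)

# Accuracy with an assumed 0.5 cutoff on the predicted probability
acc <- mean((yhat > 0.5) == (y_num == 1))

print(bad)
print(mse)
print(acc)
```

With the real data, this would presumably mean dropping 1 from F (or converting Creditability back to numeric as above) before calling gbm(). Note also that predict() on a gbm object returns values on the link (log-odds) scale by default, so type="response" is needed to get probabilities that can be compared against 0/1 labels.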