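Apologies if this has been answered elsewhere (and if my question is poorly put; this is my first post). I have been searching for a few days and have resolved every other error, but I keep getting this one: "Error in 1:knots.vec[num.ctr] : NA/NaN argument". I am trying to predict a 4-group categorical class (Q72to73_OpportunitySegments) from 13 possible variables, 11 of which are factors and 2 of which are numeric. I read my data into R as.data.frame (I removed all rows with NAs beforehand). My code works on the example Carseats data, and it also works on my own data when I do not standardize my two numeric variables (fldAge and fldSrvcYrs).

Here is the code that works on the Carseats data: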

library(dplyr)
library(ISLR)
library(knncat)
fix(Carseats) ## 11 vars: 8 continuous, 3 categorical

## move ShelveLoc factor to front of data
Carseats <- Carseats[,c(7,1:6,8:ncol(Carseats))]

## standardize quant vars and drop original quant vars
Carseats_quantvars <- as.data.frame(scale(Carseats[,2:9]))
Carseats_stdzd <- cbind(Carseats[,-(2:9)], Carseats_quantvars); rm(Carseats_quantvars)

set.seed(1)

train = sample(c(TRUE,FALSE), nrow(Carseats_stdzd), rep=TRUE)

knn.pred <- knncat(Carseats_stdzd[train,], Carseats_stdzd[!train,])
knn.pred  ## gives "Test set misclass rate: 48.09%"
knn.pred$vars  ## gives 2 vars used in knncat: Sales, Price

I ran exactly the same steps on my own data and got this:

library(readr)
library(dplyr)
library(knncat)

my_data1 <- read_csv("my_data1.csv", progress=interactive())  ## main datafile

(Does this help?)

Parsed with column specification:
cols(
  Q72to73_OpportunitySegments = col_character(),
  fldSrvcYrs = col_double(),
  ENG_STATE = col_character(),
  fldAge = col_integer(),
  fldGender = col_character(),
  jobclas_13G = col_character(),
  UNIONSTATUS = col_character(),
  APPTSTATUS = col_character(),
  EDUGRP_4G = col_character(),
  DIRECTREPORTS = col_character(),
  JOBSHELD_4G = col_character(),
  JOBSAPPLY_4G = col_character(),
  NEWJOB = col_character(),
  Region_4g = col_character()
)
my_data1 <- my_data1 %>% mutate_if(is.character, factor)
my_data1$fldAge <- as.numeric(my_data1$fldAge)  ## b/c came in as integer

my_data1 <- my_data1[,c(1,2,4,3,5:ncol(my_data1))]
my_data1_quantvars <- as.data.frame(scale(my_data1[,2:3]))
my_data1_quantvars <- rename(my_data1_quantvars, stdzd_SrvcYrs=fldSrvcYrs, stdzd_Age=fldAge)
my_data1_stdzd <- cbind(my_data1[,-(2:3)], my_data1_quantvars); rm(my_data1_quantvars)

set.seed(1)

train = sample(c(TRUE,FALSE), nrow(my_data1), rep=TRUE)

knn.pred <- knncat(my_data1_stdzd[train,], my_data1_stdzd[!train,])

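Error in 1:knots.vec[num.ctr] : NA/NaN argument

This error is related to one or both of the standardized variables (when I run the same code on the same data without standardizing, knncat runs). Any ideas on how to fix this? (Unfortunately, I cannot share my actual data because of the Statistics Act.)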
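
Since I cannot post the real data, here is a minimal sketch of the sanity checks that seem relevant, run on a made-up toy data frame. The `toy` object, the `segment` column, and all values are hypothetical stand-ins, not my data. It only checks whether `scale()` introduced NA/NaN values and whether the result is a plain data.frame with the expected column classes. It does not call knncat, and I do not know whether either of these is the actual cause.

```r
# Toy stand-in for the real data (all values are invented for illustration)
toy <- data.frame(
  segment    = factor(c("a", "b", "c", "d", "a", "b")),
  fldSrvcYrs = c(1.5, 10, 3.25, 7, 22, 0.5),
  fldAge     = c(25, 47, 33, 51, 60, 29)
)

# Standardize the two numeric columns, mirroring the steps used above
scaled <- as.data.frame(scale(toy[, c("fldSrvcYrs", "fldAge")]))
names(scaled) <- c("stdzd_SrvcYrs", "stdzd_Age")
toy_stdzd <- cbind(toy[, "segment", drop = FALSE], scaled)

# 1. Did scaling introduce any NA/NaN (e.g. from a zero-variance column)?
na_counts <- colSums(is.na(toy_stdzd))

# 2. Is the object a plain data.frame, and what class is each column?
is_plain_df <- identical(class(toy_stdzd), "data.frame")
col_classes <- sapply(toy_stdzd, function(x) class(x)[1])

# 3. Do the standardized columns have mean ~0 and sd ~1?
col_means <- colMeans(scaled)
col_sds   <- sapply(scaled, sd)

print(na_counts)
print(is_plain_df)
print(col_classes)
```

Running the same three checks on `my_data1_stdzd[train,]` and `my_data1_stdzd[!train,]` separately would show whether the two subsets differ in any of these respects.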
