I am testing the effect of different dataset sizes on the accuracy of an SVM classifier. I am using tidymodels with svm_rbf and the kernlab engine. Once the dataset size reaches level 5 I get the message maximum iterations reached in about half of the runs, and between levels 4 and 2 it appears in all of them. These are my dataset sizes:

[table of dataset sizes]
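
For reference, each size level is just a subsample of the full data; a simplified sketch (the sizes shown are illustrative stand-ins, not my exact values) looks like this:

library(tidymodels)

# illustrative size levels (stand-ins for the values in the table above)
size_levels <- c(50, 100, 200, 400, 800)

# one subsample of final_df per level; each subset then goes through the
# split/tune pipeline below
subsets <- lapply(size_levels, function(n) dplyr::slice_sample(final_df, n = n))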

Here is the output from tuning each level 100 times:

[plot of tuning results] Unfortunately this is not something I can provide a reproducible example for, but here is the code I am using:

library(tidymodels)

# split 75/25
data_split <- initial_split(final_df, prop = 3/4)
data_train <- training(data_split)
data_test <- testing(data_split)

# Monte Carlo cross-validation on the training set (defaults: 25 resamples, prop 3/4)
data_xv <- mc_cv(data_train)

# define a recipe to specify variable roles and pre-processing
data_recipe <- recipe(species ~ ., data = data_train) %>%
  step_normalize(all_numeric())
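
As a sanity check (diagnostic only, not part of the pipeline), the recipe can be prepped and baked to confirm the predictors really are being normalized, since unscaled features are a common cause of slow SMO convergence in kernlab:

# inspect the pre-processed training data the model actually sees
data_recipe %>%
  prep(training = data_train) %>%
  bake(new_data = NULL) %>%
  summary()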

# specify the SVM model; cost and rbf_sigma will be tuned later
svm_model <-
  svm_rbf(cost = tune(),
          rbf_sigma = tune()) %>%
  set_mode("classification") %>% # classification mode
  set_engine("kernlab") # select the engine/package that underlies the model

# now put all together in a workflow
# set the workflow
svm_workflow <- workflow() %>%
  add_recipe(data_recipe) %>% # add the recipe
  add_model(svm_model) # add the model


cores <- parallel::detectCores() - 1 # leave one core free for the OS
doParallel::registerDoParallel(cores = cores)

# run the tuning; no grid is supplied, so tune_grid falls back to its default grid
svm_tune_results <- svm_workflow %>%
  tune_grid(resamples = data_xv, # CV object
            control = control_grid(verbose = FALSE),
            metrics = metric_set(accuracy, roc_auc) # metrics we care about
  )
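
To check whether the model really is predicting a single class, a diagnostic I have not run yet (sketch only; it re-runs the tuning with predictions saved) would be:

# keep the per-resample predictions so their class balance can be inspected
svm_tune_check <- svm_workflow %>%
  tune_grid(resamples = data_xv,
            control = control_grid(save_pred = TRUE),
            metrics = metric_set(accuracy, roc_auc))

# pooled confusion matrix (filter on .config first to look at a single
# cost/rbf_sigma combination); a degenerate model shows one predicted class
collect_predictions(svm_tune_check) %>%
  conf_mat(truth = species, estimate = .pred_class)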

My question is: is hitting the maximum number of iterations the reason the classifier is essentially labelling everything as "no"? Or could there be another cause?

Either way, is there anything I can do to fix it?
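
Two things I have considered trying (untested sketches, not confirmed fixes): loosening kernlab's optimizer tolerance, which set_engine passes straight through to ksvm(), and giving tune_grid an explicit bounded grid so extreme cost/rbf_sigma combinations that converge slowly are avoided:

# possible workaround (untested): pass kernlab's own tol argument through
# set_engine and bound the tuning grid explicitly
svm_model_tol <- svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab", tol = 0.01) # kernlab default is tol = 0.001

svm_grid <- grid_regular(cost(range = c(-3, 3)),      # log2 scale
                         rbf_sigma(range = c(-4, 0)), # log10 scale
                         levels = 5)

svm_tune_bounded <- workflow() %>%
  add_recipe(data_recipe) %>%
  add_model(svm_model_tol) %>%
  tune_grid(resamples = data_xv,
            grid = svm_grid,
            metrics = metric_set(accuracy, roc_auc))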
