I am testing the effect of different dataset sizes on the accuracy of an SVM classifier. I am using svm_rbf from tidymodels with the kernlab engine. When the dataset is at level 5 I get the message "maximum iterations reached" in about half of the runs, and between levels 4 and 2 this becomes every run. These are my dataset sizes:
This is the output from tuning each level 100 times:
Unfortunately, this is not something I can provide a reproducible example for, but here is the code I am using:
library(tidymodels) # provides rsample, recipes, parsnip, workflows, tune, yardstick

data_split <- initial_split(final_df, prop = 3/4) # split 75/25
data_train <- training(data_split)
data_test  <- testing(data_split)
data_xv    <- mc_cv(data_train) # Monte Carlo cross-validation on the training set

# define a recipe to allow specification of variable roles etc. and pre-processing
data_recipe <- recipe(species ~ ., data = data_train) %>%
  step_normalize(all_numeric())

# specify the SVM model; cost and rbf_sigma are left as tune() placeholders
# because they will be tuned later
svm_model <- svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab") # select the engine/package that underlies the model

# now put it all together in a workflow
svm_workflow <- workflow() %>%
  add_recipe(data_recipe) %>% # add the recipe
  add_model(svm_model)        # add the model

doParallel::registerDoParallel(cores = cores) # 'cores' is set elsewhere in my script

# tune over a grid and extract the results
svm_tune_results <- svm_workflow %>%
  tune_grid(resamples = data_xv,                     # CV object
            control = control_grid(verbose = FALSE),
            metrics = metric_set(accuracy, roc_auc)) # metrics we care about
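In case it helps with diagnosis, here is a minimal sketch of how I have been checking whether the tuned models collapse to a single class. It assumes that saving the held-out predictions via control_grid(save_pred = TRUE) and pulling them back with collect_predictions() behaves as documented:

# re-run the tuning with out-of-sample predictions saved
svm_tune_results <- svm_workflow %>%
  tune_grid(resamples = data_xv,
            control = control_grid(verbose = FALSE, save_pred = TRUE),
            metrics = metric_set(accuracy, roc_auc))

# pool the held-out predictions (over all grid candidates) and cross-tabulate
# against the truth; a single-class collapse shows up as an all-zero row
svm_tune_results %>%
  collect_predictions() %>%
  conf_mat(truth = species, estimate = .pred_class)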
My question is: is hitting the maximum number of iterations the reason the classifier essentially labels everything as "no", or could there be another cause? Either way, is there anything I can do about it?
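For what it's worth, one workaround I have been considering is loosening the termination tolerance of kernlab's solver, on the assumption that the warning comes from its optimizer running out of iterations before reaching the default tol = 0.001. Extra arguments to set_engine() are forwarded to kernlab::ksvm(), so a sketch would be:

# same model specification, but with a looser convergence tolerance;
# tol is passed through to kernlab::ksvm(), whose default is 0.001
svm_model_loose <- svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab", tol = 0.01)

I have not verified that this actually removes the warning, and a looser tolerance may of course trade away some accuracy.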