I am trying to apply the SMOTE function to balance my classes.
Here is my code:
smote_train <- SMOTE(tested_covid ~., data = dataTrain, k = 5, perc.over = 100, perc.under = 200)
This is the error I get, along with the warnings:
Error in T[, col] <- data[, col] :
incorrect number of subscripts on matrix
In addition: Warning messages:
1: In if (class(data[, col]) %in% c("factor", "character")) { :
the condition has length > 1 and only the first element will be used
2: In if (class(data[, col]) %in% c("factor", "character")) { :
the condition has length > 1 and only the first element will be used
Here is the structure of my data, with the column types:
structure(list(id = c("ff0113a9-79d4-4042-992f-c5092e30b6af",
"7b104740-c0c2-44bb-82d8-442ea06a3a96", "8533b6e2-bffe-46da-8056-8b77b89a5819",
"21d33ae7-8ad8-4744-8370-d376a7e5d251", "c9225467-8ff1-4305-85ad-6c9386e38347",
"e2e445c4-dffd-4543-b311-efdf2af23744"), age = c(63, 19, 23,
28, 40, 31), gender = c("Male", "Female", "Male", "Female", "Female",
"Male"), country = c("India", "Phillipines", "India", "Phillipines",
"South Africa", "Pakistan"), chills = c("No", "Mild", "No", "Mild",
"No", "No"), Cough = c("No", "Severe", "No", "Mild", "Mild",
"No"), diarrhoea = c("No", "Mild", "No", "No", "No", "No"), fatigue = c("No",
"Moderate", "Mild", "Mild", "Mild", "Mild"), healthcare_worker = c("No",
"No", "No", "No", "No", "Yes"), how_unwell = c(1, 7, 1, 6, 4,
2), comorbidity_one = c("Asthma (managed with an inhaler)", "None",
"Obesity", "High Blood Pressure (hypertension)", "None", "None"
), loss_smell_taste = c("No", "No", "No", "No", "No", "No"),
muscle_ache = c("No", "Moderate", "No", "Moderate", "Mild",
"Mild"), nasal_congestion = c("No", "No", "No", "No", "Mild",
"No"), nausea_vomiting = c("No", "No", "No", "No", "No",
"No"), no_days_symptoms_show = c("None", "4", "None", "More than 21",
"None", "2"), self_diagnosis = c("None", "Mild", "None",
"Mild", "None", "Mild"), shortness_breath = c("No", "Mild",
"No", "No", "No", "Mild"), sore_throat = c("No", "No", "No",
"No", "Mild", "No"), sputum = c("No", "Mild", "No", "Mild",
"Mild", "No"), temperature = c("No", "No", "No", "No", "No",
"37.5-38"), tested_covid = structure(c(1L, 1L, 1L, 1L, 1L,
1L), .Label = c("Negative", "Positive"), class = "factor")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
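
One thing that stands out in the structure above is that the data has class c("tbl_df", "tbl", "data.frame"), i.e. it is a tibble. Assuming SMOTE here is the one from the DMwR package (the argument names suggest so, but that is an assumption), a likely cause is that the function indexes columns as data[, col] and expects a plain vector back, while a tibble returns another tibble with three classes. That would match both the "condition has length > 1" warnings and the subscript error. A minimal sketch of the conversion, using a hypothetical three-column excerpt of the data and base R only:

```r
# Hypothetical excerpt of the data above, keeping the same tibble class attribute.
dataTrain <- structure(
  list(
    age = c(63, 19, 23, 28, 40, 31),
    gender = c("Male", "Female", "Male", "Female", "Female", "Male"),
    tested_covid = factor(rep("Negative", 6), levels = c("Negative", "Positive"))
  ),
  row.names = c(NA, -6L),
  class = c("tbl_df", "tbl", "data.frame")
)

# Three classes: a class() check inside if() would see a vector of length 3.
class(dataTrain)

# Drop the tibble classes so that [, col] returns a plain vector.
train_df <- as.data.frame(dataTrain)

# Convert character columns to factors (assumption: SMOTE handles factors more reliably).
is_chr <- vapply(train_df, is.character, logical(1))
train_df[is_chr] <- lapply(train_df[is_chr], factor)

str(train_df)

# Not run here: needs the DMwR package and data that contains both classes.
# smote_train <- SMOTE(tested_covid ~ ., data = train_df, k = 5,
#                      perc.over = 100, perc.under = 200)
```

With the full data, the id column is a unique identifier per row, so it is probably worth dropping it before calling SMOTE; whether that matters for the error itself is not something the output above confirms.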