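I am using lime on a supervised text model, following this example: https://rdrr.io/github/thomasp85/lime/man/lime.html
The only thing I changed is the get_matrix function, which now creates the dtm. The new function works on the data from the linked example but not on my real data. I get this error: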
Error in glmnet(x[, c(features, j), drop = FALSE], y, weights = weights, : x should be a matrix with 2 or more columns
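The code I am using is below. The data and analysis are made up for this question, but they reproduce the error I get on my real data (where I have 1000 text documents rather than 10):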
library(lime)
library(text2vec)
library(xgboost)
library(stopwords)  # assumption: stopwords() comes from this package; tm::stopwords() would also work

# Toy data: 10 headlines labelled "sport" or "other"
data <- data.frame(
  articles = c("Prince Harry proposed to Meghan", "Football transfer rumours Chelsea David Luiz",
               "Football transfer rumours Chelsea David Luiz", "World Cup team by team guide",
               "Destiny free trial goes live today", "What happens today ahead of crucial vote",
               "Story image for sport news football from BBC Sport", "Premier League news conferences",
               "What is Meghan Markles engagement ring", "Harry and Megan"),
  topic = c("other", "sport", "sport", "sport", "other", "other", "sport", "sport", "other", "other"))
data$articles <- as.character(data$articles)
data$topic <- as.character(data$topic)

# Note: row 6 ends up in both the training and the test set
data_train <- data[1:6, ]
data_test <- data[6:10, ]

my_stop_word <- c(stopwords(), "one", "two", "three")

# Build a document-term matrix; the vocabulary is created from the text passed in
get_matrix <- function(text) {
  it <- itoken(text, tolower, progressbar = FALSE)
  vocab2 <- create_vocabulary(it, stopwords = my_stop_word)
  vectorizer <- vocab_vectorizer(vocab2)
  create_dtm(it, vectorizer = vectorizer)
}

dtm_train <- get_matrix(data_train$articles)

xgb_model <- xgb.train(list(max_depth = 7, eta = 0.1, objective = "binary:logistic",
                            eval_metric = "error", nthread = 1),
                       xgb.DMatrix(dtm_train, label = data_train$topic == "sport"),
                       nrounds = 50)

# Explain the first "sport" sentence in the test set
sentences <- head(data_test[data_test$topic == "sport", "articles"], 1)
explainer <- lime(data_test$articles, xgb_model, get_matrix)
explanations <- explain(sentences, explainer, n_labels = 1, n_features = 2)
The explain() call fails with: Error in glmnet(x[, c(features, j), drop = FALSE], y, weights = weights, : x should be a matrix with 2 or more columns
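One thing I can check is what get_matrix returns for different inputs. Because it calls create_vocabulary on whatever text it receives, the columns of the dtm depend on the input. The sketch below uses a simplified copy of the function (get_matrix_demo is a made-up name, and I left out the stop words to keep it short) to compare the dtm for two training sentences with the dtm for a single sentence. I am not sure whether this is related to the glmnet error; it is only a diagnostic.

```r
library(text2vec)

# Simplified stand-in for get_matrix (hypothetical helper, no stop-word removal):
# the vocabulary is rebuilt from the text passed in on every call.
get_matrix_demo <- function(text) {
  it <- itoken(text, preprocessor = tolower, progressbar = FALSE)
  vocab <- create_vocabulary(it)
  create_dtm(it, vectorizer = vocab_vectorizer(vocab))
}

train_dtm <- get_matrix_demo(c("Prince Harry proposed to Meghan",
                               "World Cup team by team guide"))
one_dtm <- get_matrix_demo("Premier League news conferences")

# Compare the shapes and column names of the two matrices
print(dim(train_dtm))
print(dim(one_dtm))
print(colnames(one_dtm))

# TRUE only if both calls produced exactly the same columns in the same order
same_columns <- identical(colnames(train_dtm), colnames(one_dtm))
print(same_columns)
```

If the column sets differ, the matrix built for a new sentence does not line up with the one the model was trained on, which may be worth ruling out before looking further into lime itself.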
Thanks!