r - LIME library in R throws "Error: Response is constant across permutations. Please check your model"

Question

I am looking for a kind soul to help me resolve this error in R with my current RF model:

Error: Response is constant across permutations. Please check your model

These are the files needed to run the code: link

Here is my code:

library("lime")
library("randomForest")
RF <- readRDS("RF_classifier4sRNA.rds") # Load the model

origTrainingData <- read.csv( "training_combined.csv", header = TRUE, sep = ",") # load Orig Training data

origTrainingDataLabels <- read.csv( "training_combined_labels.csv", header = TRUE, sep = "," )
                                                        # load Orig Training data labels
Classification <- origTrainingDataLabels$Class
origTrainingDataWithLabels <- cbind(origTrainingData, Classification)

# instances to explain ----
inputFile <- "FeatureTable.tsv"
testData <- read.table( inputFile, sep = "\t", header = TRUE)
class(testData)

testDataPredictions <- predict(RF, testData, type="prob")
testDataPre
# randomForest
# RF <- readRDS("RF_classifier4sRNA.rds")
# pred <- predict(RF, data, type = "prob")

predict_model.randomForest <- function(x, newdata, type, ...) {
  res <- predict(x, newdata = newdata, ...)
  switch(
    type,
    raw = data.frame(Response = res$class, stringsAsFactors = FALSE),
    prob = as.data.frame(res["posterior"], check.names = FALSE)
  )
}

model_type.randomForest <- function(x, ...) 'classification'

?lime()
lime_explainer <- lime( origTrainingData,      # Original training data
                        RF,                    # The model to explain
                        bin_continuous = TRUE, # Should continuous variables be binned 
                                               # when making the explanation
                        n_bins = 5,           # The number of bins for continuous variables 
                                               # if bin_continuous = TRUE
                        quantile_bins = FALSE  # Should the bins be based on n_bins quantiles
                                               # or spread evenly over the range of the training data
                        )
lime_explanations <- explain( testData,           # Data to explain
                              lime_explainer,     # Explainer to use
                              n_labels = 7,
                              n_features = 7,
                              n_permutations = 10,
                              feature_select = "none"
                            )
lime_explanations

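In fairness, I am not the author of the original random forest model, which can be found here: github. The full documentation and all other related files can be found here: https://peerj.com/articles/6304/. I am only trying to apply LIME to that model.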

score 1 · Accepted Answer

In the end, my professor was able to help me :D

So, here is what actually works with LIME for my specific use case:

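predict_model.randomForest <- function(x, newdata, type, ...) {
  # Ask the forest for class probabilities, so that res is a matrix
  # with one column per class (res[,2] below relies on that)
  res <- predict(x, newdata = newdata, type = "prob", ...)
  # Debugging output; can be removed once the wrapper works
  print(class(res))
  print(dim(res))
  print(res)
  # switch() comes last so that its value is what the function returns
  switch(
    type,
    raw = data.frame(Response = ifelse(res[,2] > 0.5, "sRNA", "notSRNA"), 
                     stringsAsFactors = FALSE
    ),
    prob = as.data.frame(res, check.names = FALSE)
  )
}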

model_type.randomForest <- function(x, ...) 'classification'
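
For anyone who wants to see why the wrapper from the question misbehaves, here is a minimal base R sketch. It does not load lime or randomForest: the factor and the probability matrix below are made-up stand-ins for what predict() on a randomForest classifier returns with the default type = "response" and with type = "prob". The helper format_rf_output is a hypothetical name used only for illustration; it is not part of lime. My reading, which I have not confirmed against the lime source, is that the question's wrapper hands lime a single NA instead of per-row probabilities, which may be why the response looks constant.

```r
# Stand-in for predict(RF, newdata): with the default type = "response",
# a randomForest classifier returns a factor of class labels.
res_default <- factor(c("sRNA", "notSRNA", "sRNA"))

# The question's wrapper indexes that result by name. No element is named
# "posterior", so the lookup yields one NA regardless of the input rows.
bad <- res_default["posterior"]
print(length(bad))       # a single element, not one per row
print(all(is.na(bad)))   # and that element is NA

# Stand-in for predict(RF, newdata, type = "prob"): a numeric matrix with
# one column per class (values are invented for illustration).
res_prob <- matrix(
  c(0.9, 0.1,
    0.2, 0.8,
    0.4, 0.6),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("notSRNA", "sRNA"))
)

# Hypothetical helper: turn the probability matrix into the two shapes
# that a predict_model method is asked for ("raw" labels or "prob" table).
format_rf_output <- function(res, type) {
  switch(
    type,
    # Label each row with the class that has the highest probability
    raw  = data.frame(
      Response = colnames(res)[max.col(res, ties.method = "first")],
      stringsAsFactors = FALSE
    ),
    # Keep one probability column per class, with class names intact
    prob = as.data.frame(res, check.names = FALSE)
  )
}

prob_df <- format_rf_output(res_prob, "prob")
raw_df  <- format_rf_output(res_prob, "raw")
print(prob_df)
print(raw_df)
```

Using max.col instead of a fixed 0.5 threshold on the second column is a design choice of this sketch, not something from the answer above; it picks the most probable class without assuming the column order or that there are exactly two classes.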
