I'm trying to use a function that calls on the pROC package in R to calculate the area under the curve for a number of different outcomes.
# Function used to compute area under the curve
proc_auc <- function(outcome_var, predictor_var) {
pROC::auc(outcome_var, predictor_var)}
To do this, I am intending to refer to outcome names in a vector (much like below).
# Create a vector of outcome names
outcome <- c('outcome_1', 'outcome_2')
However, I am having problems defining variables to input into this function. When I do this, I generate the error: "Error in roc.default(response, predictor, auc = TRUE, ...): 'response' must have two levels". However, I can't work out why, as I reckon I only have two levels...
I would be so happy if anyone could help me!
Here is a reproducible code from the iris dataset in R.
library(pROC)
library(datasets)
library(dplyr)
# Use iris dataset to generate binary variables needed for function
df <- iris %>% dplyr::mutate(outcome_1 = as.numeric(ntile(Sepal.Length, 4)==4),
outcome_2 = as.numeric(ntile(Petal.Length, 4)==4))%>%
dplyr::rename(predictor_1 = Petal.Width)
# Inspect binary outcome variables
df %>% group_by(outcome_1) %>% summarise(n = n()) %>% mutate(Freq = n/sum(n))
df %>% group_by(outcome_2) %>% summarise(n = n()) %>% mutate(Freq = n/sum(n))
# Function used to compute area under the curve
proc_auc <- function(outcome_var, predictor_var) {
pROC::auc(outcome_var, predictor_var)}
# Create a vector of outcome names
outcome <- c('outcome_1', 'outcome_2')
# Define variables to go into function
outcome_var <- df %>% dplyr::select(outcome[[1]])
predictor_var <- df %>% dplyr::select(predictor_1)
# Use function - first line works but not last line!
proc_auc(df$outcome_1, df$predictor_1)
proc_auc(outcome_var, predictor_var)