I have two data files, each with 8 rows and 2151 columns. I want to run a regression between the two files, column by column, and pull out the slope, intercept, and r-squared values. Example: run a regression between File 1 Column 1 (all 8 rows) and File 2 Column 1 (all 8 rows), grab the three values of interest (intercept, slope, r-squared), and move on to the next pair of columns.

@thelatemail gave me a tremendous piece of code that does nearly everything.

mapply(function(x,y) coef(lm(y~x)), input1, input2)

I was hoping to tweak this a bit just so I can extract R2 values from the linear model. So I wrote a quick function just to see if I could replicate the success and go forward.

linear_calibration <- function(x,y) {
   co_values <- coef(lm(y~x))
   return(co_values)
}

test_output = mapply(linear_calibration(input1, input2))
write.table(test_output,file="dump.csv",sep=",")

Unfortunately when I write it this way, I get an error that states:

Error in model.frame.default(formula = y ~ x, drop.unused.levels = TRUE) : 
invalid type (list) for variable 'y'

I'm not really sure why I get an error when I write it this way, so I must be misunderstanding something. To me the long form seems identical to the original one-liner, but it evidently isn't, so I'm trying to figure out how to modify the code to make it work.

1 Answer

For your first idea, to get merge to work the way you want, you need to use its by argument. Create an ID column in each data frame; say you call it ID.

input_1$ID <- 1:8
input_2$ID <- 1:8

Then:

combined <- merge(input_1, input_2, by="ID", all.x=TRUE, all.y=TRUE)
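As a rough illustration of what that merge produces, here is a small sketch on made-up data (two columns instead of 2151; the values and the column names V1 and V2 are assumptions, not the real files). When both data frames share column names, merge disambiguates them with .x and .y suffixes.

```r
# Hypothetical stand-in data: 8 rows, 2 columns per data frame
set.seed(42)
input_1 <- as.data.frame(matrix(rnorm(16), nrow = 8))
input_2 <- as.data.frame(matrix(rnorm(16), nrow = 8))

# Add the shared key column to both data frames
input_1$ID <- 1:8
input_2$ID <- 1:8

# Full outer join on ID; shared column names get .x / .y suffixes
combined <- merge(input_1, input_2, by = "ID", all.x = TRUE, all.y = TRUE)
names(combined)
```

With this layout, the columns ending in .x come from input_1 and those ending in .y from input_2.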

As for your second idea, this is how you would take the same column from each data frame and run a regression on the pair.

df <- cbind(input_1[1], input_2[1])
model <- lm(df[,1] ~ df[,2])
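To cover every column pair and also get r-squared, one option is to keep the mapply approach from the question. The call mapply(linear_calibration(input1, input2)) fails because it runs linear_calibration once on the two whole data frames (which are lists) instead of handing the function to mapply. A minimal sketch, assuming both data frames are all-numeric and have matching column order; the stand-in data and the names of the returned values are made up for illustration:

```r
# Hypothetical stand-in data: 8 rows, 3 columns (the real files have 2151)
set.seed(1)
input1 <- as.data.frame(matrix(rnorm(24), nrow = 8))
input2 <- as.data.frame(matrix(rnorm(24), nrow = 8))

# Fit y ~ x for one pair of columns and return the three values of interest
linear_calibration <- function(x, y) {
  fit <- lm(y ~ x)
  co <- coef(fit)
  c(intercept = unname(co[1]),
    slope     = unname(co[2]),
    r_squared = summary(fit)$r.squared)
}

# Pass the function itself (no parentheses); mapply then calls it
# once per pair of columns, taking x from input1 and y from input2
test_output <- mapply(linear_calibration, input1, input2)
```

The result is a matrix with one column per column pair and the rows intercept, slope, and r_squared; t(test_output) gives one row per pair if that is easier to write out with write.table.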

Hope that helps.

Answered 2014-05-19T00:46:03.057