I am using information from a base data.table to pull data from other data.tables as in the following example:
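test <- function() {
    library(data.table)
    # Base table: one row per id, with the row of counts.dt to read for that id
    test.dt <- data.table(id=c("abc","xyz","ijk"),type=c("1","1","0"),line.position=1:3)
    # Lookup table: one column per id (plus two unrelated columns)
    counts.dt <- data.table(
        abc=c(10,NA,NA,NA),xyz=c(20,30,NA,NA),ijk=c(10,10,10,10),X2abc=NA,X3abc=1:4)
    print(test.dt)
    print(counts.dt)
    # Number of non-NA values in the counts.dt column named by id
    test.dt[,count:=sum(!is.na(counts.dt[[id]])),by=id]
    # Value of counts.dt at row line.position, column id
    test.dt[,count.value:=counts.dt[line.position,id,with=FALSE],by=id]
    print(test.dt)
}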
This works fine and returns the expected result: a new column that uses (line.position, id) from each row of test.dt to grab the value of counts.dt at (line.position, id).
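For reference, the same (row, column) lookup can also be written without with=FALSE. This is only a sketch on a trimmed copy of the tables above (the type, X2abc and X3abc columns are dropped for brevity), and it assumes id is a character column:

```r
library(data.table)

# Trimmed copies of the example tables above
test.dt <- data.table(id = c("abc", "xyz", "ijk"), line.position = 1:3)
counts.dt <- data.table(abc = c(10, NA, NA, NA),
                        xyz = c(20, 30, NA, NA),
                        ijk = c(10, 10, 10, 10))

# Within each by-group, id is a single string: pick that column of
# counts.dt with [[ ]], then pick the row given by line.position
test.dt[, count.value := counts.dt[[id]][line.position], by = id]

print(test.dt$count.value)  # 10 30 10
```

Here counts.dt[[id]] returns a plain vector rather than a one-column data.table, so the result is an ordinary numeric column.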
However, I cannot reproduce this with a more complex example that pulls its data from an Excel worksheet. Instead I get the error: Error in Math.factor(j) : abs not meaningful for factors. The error is thrown right before the last print statement, i.e. by the From.Data assignment.
test2 <- function(
    file.directory="C:/Users/csnyder/Desktop/BootMethod/",
    file.name="test.xlsx",
    resample.number=3
)
{
    require("PBSmapping")
    require("xlsx")
    library(data.table)
    #Load input sheets
    file.path<-sprintf("%s%s",file.directory,file.name)
    excel.data<-read.xlsx(file.path,sheetIndex=1,header=TRUE,stringsAsFactors=TRUE)
    data.DT<-data.table(excel.data)
    excel.data<-read.xlsx(file.path,sheetIndex=2,header=TRUE,stringsAsFactors=TRUE)
    base.DT<-data.table(excel.data)
    excel.data<-read.xlsx(file.path,sheetIndex=3,header=TRUE,stringsAsFactors=TRUE)
    related.DT<-data.table(excel.data)
    excel.data<-NULL
    #add max rows to each ID type. with=TRUE, colnames used as variables.
    #get.text<-function(x){return(as.character(x))}
    base.DT<-base.DT[,Max.Sample:= sum(!is.na(data.DT[[ID]]),na.rm=TRUE),by=ID]
    base.length<-nrow(base.DT)
    base.DT[,Sub.Number:=1:base.length]
    base.DT[,Resample:=1]
    resample.base.DT<-base.DT
    #Add line numbers to data tables.
    data.DT[,Line:=1:nrow(data.DT)]
    related.DT[,Line:=1:nrow(related.DT)]
    #resample number added to base DT, then will make a for loop by resample numbers and loop it.
    for(counter in 1:resample.number){
        base.DT<-rbindlist(list(base.DT,resample.base.DT[,Resample:=counter]))
    }
    #remove loop initiator
    base.DT<-base.DT[-(1:base.length)]
    #number rows
    base.DT[,Row.Number:=Resample*base.length+Sub.Number-base.length]
    #pick line to sample
    pick.row<-function(x){return(runif(1,1,x))}
    base.DT[,"Line":=runif(1,1,Max.Sample),with=FALSE]
    base.DT[,"Line":=round(runif(1,1,Max.Sample),digits=0),by=Row.Number]
    #Pull cell from data.DT (and related.DT) that has position corresponding to the matching Row.Number and ID in base.DT
    base.DT[,From.Data:=data.DT[Line,ID,with=FALSE],by=ID]
    print(base.DT)
}
The sheets in my Excel workbook import as what looks (to me at least) like the following:
Sheet1:
data.DT<-data.table(item1=c("AAAA","2XXX",780,684,614,39),item2=c("AAAA","XXX",10,314,NA,NA))
Sheet2:
base.DT<-data.table(ID=c("item1","item2"),Level=c("X2XXX","XXX"),Type=c("AAAA","AAAA"),P=c(1000,1000 ),Cat=c("AAAA","AAAA"),Day=c(NA,1))
Sheet3:
related.DT<-data.table(item1=c("AAAA","2XXX",1,1,1,NA),item2=c("AAAA","XXX",1,1,NA,NA))
At my current location, I can't upload a workbook. Replacing the Excel imports with the direct calls above seems to fix the problem. At the risk of not having a reproducible question, I have to ask: has anyone run into this problem, or does anyone have an idea of how to resolve it? Or perhaps I'm going about this in a convoluted way; workarounds are equally welcome! If an Excel workbook is needed to fully understand my question, let me know and I'll try my best to upload one.
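One difference between the two setups that may be relevant: read.xlsx is called with stringsAsFactors=TRUE, which turns text columns into factors, whereas data.table() keeps them as character. The sketch below only shows how to check for that and convert; the factor-typed table is a hypothetical stand-in for the imported sheet, and whether this is the actual cause of the error is an assumption I have not verified against the workbook:

```r
library(data.table)

# Hypothetical stand-in for Sheet2 as imported with stringsAsFactors=TRUE
imported.DT <- data.table(ID = factor(c("item1", "item2")),
                          P  = c(1000, 1000))

# The same table built directly: ID stays character
direct.DT <- data.table(ID = c("item1", "item2"),
                        P  = c(1000, 1000))

# Compare column classes between the two versions
print(sapply(imported.DT, class))
print(sapply(direct.DT, class))

# Convert every factor column of the imported table to character, in place
factor.cols <- names(imported.DT)[sapply(imported.DT, is.factor)]
imported.DT[, (factor.cols) := lapply(.SD, as.character), .SDcols = factor.cols]
```

After the conversion the two tables have the same column classes, so any remaining difference in behaviour would have to come from somewhere else.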