My specific question is why a matrix built with 'model.matrix()' does not work as expected with 'bigglm()'.
The same model matrix works fine in glm:
temp <- model.matrix(~ ORGN_FLOW_INT + work_attract + MINS, data=HTWAF_sample)
sim2 <- glm(FLOW_INT ~ temp, family=poisson(link="log"), data=HTWAF_sample)
This works fine.
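To make the glm pattern above reproducible, here is a minimal sketch on simulated data. The data frame `toy` and its columns `x1`, `x2`, `g`, and `y` are made up for illustration and are not HTWAF_sample. It assumes only base R. It also shows one thing worth knowing about this pattern: the matrix from 'model.matrix()' already contains an intercept column, so passing it as a single term makes glm add a second intercept and report NA for the redundant one.

```r
# Hypothetical toy data standing in for HTWAF_sample (not the real data).
set.seed(1)
toy <- data.frame(
  x1 = rnorm(200),
  x2 = rnorm(200),
  g  = factor(sample(letters[1:4], 200, replace = TRUE))
)
toy$y <- rpois(200, lambda = exp(0.3 + 0.5 * toy$x1 - 0.2 * toy$x2))

# Build the design matrix by hand; it already contains an "(Intercept)" column.
X <- model.matrix(~ x1 + x2 + g, data = toy)

# Pattern from the question: the whole matrix is used as a single term.
# glm() adds its own intercept, so the matrix's intercept column is aliased (NA).
fit_matrix <- glm(y ~ X, family = poisson(link = "log"), data = toy)

# Same matrix, but with glm()'s own intercept suppressed, so nothing is aliased.
fit_noint <- glm(y ~ X - 1, family = poisson(link = "log"), data = toy)

# Ordinary formula interface; glm() builds the same design matrix internally.
fit_formula <- glm(y ~ x1 + x2 + g, family = poisson(link = "log"), data = toy)

# Compare the coefficient vectors of the three fits.
print(coef(fit_matrix))
print(coef(fit_noint))
print(coef(fit_formula))
```

In this sketch `fit_noint` and `fit_formula` should give the same estimates, which suggests the hand-built matrix is mainly useful when the matrix itself is needed for something else.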
But with 'bigglm()', the same commands produce an error:
temp <- model.matrix(~ ORGN_FLOW_INT + work_attract + MINS, data=HTWAF_sample)
sim2 <- bigglm(FLOW_INT ~ temp, family=poisson(link="log"), data=HTWAF_sample)
Error in model.frame.default(tt, chunk): variable lengths differ (found for 'temp')
More generally, I would appreciate examples of how to use 'model.matrix()' alongside lm/glm in a conceptually correct way. I have searched the web for examples and read the documentation for 'model.matrix()', but I cannot find good examples of how 'model.matrix()' should be used in conjunction with lm/glm. My own approach above came from trial and error.
By the way, in case you are wondering why I am using 'model.matrix()', it is because I have a factor variable with 1900 distinct values.