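Application background
I have a model with random slopes and intercepts. The random effects have many levels, and the new data (to be predicted) may or may not contain all of them.
To make this concrete: I am working with music revenue at the album level (title). Each album may come in several formats, format2 (CD, vinyl, e-audio, etc.). I have a revenue measurement for each format of each album. The model is specified as:
lmer(physical ~ format2 + (0 + format2 | title))
The problem is that future data may not contain all levels of either title or format2. For random intercepts this is easily handled with predict(..., allow.new.levels = TRUE), but it is problematic for the fixed effects and random slopes. I am therefore trying to write a function that makes flexible predictions from a merMod object, similar to lme4::predict.merMod, but one that handles differences between the training data and the prediction data. This question arises as much from my ignorance of the exact details of lme4::predict.merMod as from anything else.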
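The fixed-effect side of this can be illustrated in base R. The sketch below uses small made-up data frames (train, new and train_levels are hypothetical names, not my real data). It shows that when the prediction data lack a format, model.matrix() silently drops that column unless the factor is re-declared with the training levels.

```r
# Toy stand-ins for the training and prediction data (hypothetical values)
train <- data.frame(format2 = c("CD", "Vinyl", "E Audio", "CD"),
                    stringsAsFactors = FALSE)
new   <- data.frame(format2 = c("CD", "Vinyl"),
                    stringsAsFactors = FALSE)

# Levels seen when the model was fitted
train_levels <- sort(unique(train$format2))

# Naive design matrix: the "E Audio" column is missing
X_naive <- model.matrix(~ format2, data = new)

# Re-declare the factor with the training levels so the columns line up
new$format2 <- factor(new$format2, levels = train_levels)
X_aligned <- model.matrix(~ format2, data = new)

colnames(X_naive)
colnames(X_aligned)
```

With aligned columns, the matrix can be multiplied by the full vector of fixed-effect estimates; the column for the absent format is simply all zeros.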
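Problem description
The crux of the problem is getting the correct model.matrix() for both the fixed and the random effects, so that predictions and SEs can be computed. The S3 method for class merMod returns only the fixed effects.
The documentation for the base stats::model.matrix() function is very limited. Unfortunately, I own neither Statistical Models in S nor Software for Data Analysis, which appear to contain the details behind these functions.
model.matrix() is supposed to take a model formula and a new data frame and produce a design matrix, but I am getting an error. Any help you can provide would be greatly appreciated.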
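For a model that has already been fitted, the two design matrices can be extracted and combined by hand, which shows what a prediction function would have to reproduce for new data. The sketch below uses lme4's built-in sleepstudy data as a stand-in for the revenue data (an assumption for illustration only). model.matrix(fm) returns the fixed-effects matrix X, and getME(fm, "Z") returns the random-effects matrix Z.

```r
library(lme4)

# Stand-in model: random intercept and slope per Subject
fm <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

X    <- model.matrix(fm)   # fixed-effects design matrix only
Z    <- getME(fm, "Z")     # sparse random-effects design matrix
beta <- fixef(fm)          # fixed-effect estimates
b    <- getME(fm, "b")     # conditional modes of the random effects

# Conditional fitted values are X %*% beta + Z %*% b
fe_part <- drop(X %*% beta)
re_part <- drop(as.matrix(Z %*% b))
manual  <- fe_part + re_part

all.equal(as.numeric(manual), as.numeric(fitted(fm)))
```

This only covers point predictions on the training data; it does not address standard errors or levels missing from new data.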
Example data
dat1 <- structure(list(dt_scale = c(16, 16, 16, 16, 16, 16, 16, 16, 16,
16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16), title = c("Bahia",
"Jazz Moods: Brazilian Romance", "Quintessence", "Amadeus: The Complete Soundtrack Recording (Bicentennial Edition)",
"Live In Europe", "We'll Play The Blues For You", "The Complete Village Vanguard Recordings, 1961",
"The Isaac Hayes Movement", "Jazz Moods: Jazz At Week's End",
"Blue In Green: The Concert In Canada", "The English Patient - Original Motion Picture Soundtrack",
"The Unique Thelonious Monk", "Since We Met", "You're Gonna Hear From Me",
"The Colors Of Latin Jazz: Cubop!", "The Colors Of Latin Jazz: Samba!",
"Homecoming", "Consecration: The Final Recordings Part 2 - Live At Keystone Korner, September 1980", "More Creedence Gold", "The Stardust Session"), format2 = c("CD", "CD",
"CD", "CD", "CD", "CD", "CD", "SuperAudio", "SuperAudio", "CD", "E Audio", "CD",
"Vinyl", "CD", "E Audio", "CD", "CD", "CD", "CD", "CD"), mf_day = c(TRUE,
TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,
TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE), xmas = c(FALSE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE), vday = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE, FALSE, FALSE, FALSE), yr_since_rel = c(16.9050969937038,
8.41815617876864, 9.2991404674865, 25.0870296783559, 39.1267038232812,
27.9156764326061, 9.11596751812513, 23.3052837112449, 14.3123922258974,
30.5208152866414, 5.83025071417496, 21.3090003877291, 7.75022155568392,
11.3601605287827, 0.849006673421519, 31.9918631305662, 13.8861905547041,
12.8342695062012, 29.6916671402534, 13.5912612705038), physical = c(1327.17849171096,
-110.2265302258, -795.37376268564, 355.06192702004, -1357.3492884345,
-1254.93442612023, -816.713683621225, 881.201935773452, -3092.02845691036,
-2268.6304275652, 907.347941142021, -699.130275178185, 377.867849132077,
-1047.50531157311, 1460.25978951805, 1376.84579069304, 3619.03629114089,
962.888173535704, 2514.77880599199, 2539.14958588771)), .Names = c("dt_scale",
"title", "format2", "mf_day", "xmas", "vday", "yr_since_rel",
"physical"), row.names = c(1L, 2L, 5L, 6L, 7L, 8L, 9L, 11L, 12L,
13L, 14L, 15L, 20L, 22L, 23L, 25L, 27L, 32L, 35L, 36L), class = "data.frame")
Formula:
f1 <- as.formula(~1 + dt_scale + yr_since_rel + format2 + (0 + format2 + mf_day +
xmas + vday | title))
Execution / error
library(lme4)
model.matrix(f1, data= dat1)
Error in 0 + format2 : non-numeric argument to binary operator
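The error message is consistent with model.matrix() treating | as the ordinary logical-OR operator, so that it tries to evaluate 0 + format2 on a character column. One possible way around this, sketched below on a small made-up data frame (d and f are hypothetical), is to split the formula with nobars() and findbars(), build the fixed-effects matrix with model.matrix(), and build the random-effects matrix with mkReTrms(). These helpers are exported by the lme4 versions I am aware of; treat their availability in your installed version as an assumption.

```r
library(lme4)

# Toy data: 3 titles x 2 formats (hypothetical values)
d <- expand.grid(format2 = c("CD", "Vinyl"),
                 title   = c("A", "B", "C"),
                 stringsAsFactors = FALSE)

f <- ~ 1 + format2 + (0 + format2 | title)

# Calling model.matrix() on the full formula fails, as in the question
err <- tryCatch(model.matrix(f, data = d),
                error = function(e) conditionMessage(e))

# Fixed-effects part: drop the random-effects terms first
X <- model.matrix(nobars(f), data = d)

# Random-effects part: build the transposed Z matrix from the bar terms
re <- mkReTrms(findbars(f), d)
Z  <- t(re$Zt)   # observations in rows, one column per title-by-format pair

dim(X)
dim(Z)
```

Note that mkReTrms() builds Z from the levels present in the data it is given, so on its own it does not map new data onto the levels of a fitted model.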
Note
I also tried this with the Orthodont data; however, I get a different error.
library(lme4)
data("Orthodont",package="MEMSS")
fm1 <- lmer(formula = distance ~ age*Sex + (1+age|Subject), data = Orthodont)
newdat <- expand.grid(
age=c(8,10,12,14)
, Sex=c("Male","Female")
, distance = 0
, Subject= c("F01", "F02")
)
f1 <- formula(fm1)[-2] # simpler code via Ben Bolker below
mm <- model.matrix(f1, newdat) # attempt to use model.matrix
Warning message:
In Ops.factor(1 + age, Subject) : | not meaningful for factors
# use lme4:::mkNewReTrms as suggested in comments
mm <- lme4:::mkNewReTrms(f1, newdat)
Error in lme4:::mkNewReTrms(f1, newdat) : object 'ReTrms' not found
In addition: Warning message:
In Ops.factor(1 + age, Subject) : | not meaningful for factors
# check if different syntax would fix this
mm <- lme4::mkNewReTrms(f1, newdat)
Error: 'mkNewReTrms' is not an exported object from 'namespace:lme4'
mm <- mkNewReTrms(f1, newdat)
Error: could not find function "mkNewReTrms"
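For comparison, the exported predict() method already covers part of what these attempts are after. The sketch below again uses sleepstudy as a stand-in (an assumption, chosen because it ships with lme4), with a made-up Subject level "999" that is not in the training data. It builds the fixed-effects matrix for the new data from the bar-free formula, checks it against predict() with re.form = NA, and then uses allow.new.levels = TRUE for the unseen subject.

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

# New data: subject "308" was in the training data, "999" was not
newdat <- expand.grid(Days = c(0, 5, 9), Subject = c("308", "999"))

# Fixed-effects design matrix from the one-sided, bar-free formula
f_fixed <- nobars(formula(fm)[-2])
X_new   <- model.matrix(f_fixed, newdat)

# Population-level predictions, by hand and via predict()
pred_manual <- as.numeric(X_new %*% fixef(fm))
pred_pop    <- as.numeric(predict(fm, newdata = newdat, re.form = NA))

# Conditional predictions; the unseen level falls back to the population level
pred_all <- as.numeric(predict(fm, newdata = newdat, allow.new.levels = TRUE))

all.equal(pred_manual, pred_pop)
all.equal(pred_all[4:6], pred_pop[4:6])
```

This handles new levels of the grouping factor, but not fixed-effect factor levels that are absent from the new data, which is the remaining part of the question.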