Statistical inference for the marginal model is discussed and illustrated in this document. Since some things are different with multi-level models and are not readily available in R, I wrote a number of functions that we can use. These are all in a single file called “All_functions.txt” which is available on the course web-site.
library(lme4)
library(lmerTest)
## Warning: replacing previous import 'vctrs::data_frame' by 'tibble::data_frame'
## when loading 'dplyr'
library(texreg)
library(optimx)
library(xtable) # for Latex output of data frame
In R-studio, you can do Code –> Source file and then click on file names “All_functions.txt”. Alternatively you can do this from the console by entering
source('D:/Dropbox/edps587/lectures/6 inference1/All_functions.txt', encoding = 'UTF-8')
This 2nd method is better if you are re-running your code.
Read in the hsb data sets and merge
hsb1 <- read.table("D:\\Dropbox\\edps587\\lectures\\6 inference1\\HSB1.txt", header=TRUE)
hsb2 <- read.table("D:\\Dropbox\\edps587\\lectures\\6 inference1\\HSB2.txt", header=TRUE)
hsb <- merge(hsb1,hsb2, by=c('id'))
Now create variables that we’ll use in our models
# Center ses
hsb$cses <- hsb$ses - hsb$meanses
# This is probably better to use to avoid "warning different scale"
hsb$zsize <- (hsb$size)/sd(hsb$size)
The models starting with simple to more complex.
summary(model1 <- lmer(mathach ~ 1 + cses + female + minority + meanses + zsize + sector + (1 |id), data=hsb, REML=FALSE))
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: mathach ~ 1 + cses + female + minority + meanses + zsize + sector +
## (1 | id)
## Data: hsb
##
## AIC BIC logLik deviance df.resid
## 46307.3 46369.3 -23144.7 46289.3 7176
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.2641 -0.7225 0.0410 0.7599 2.9191
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 1.61 1.269
## Residual 35.89 5.991
## Number of obs: 7185, groups: id, 160
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 12.4000 0.3576 172.2298 34.672 < 2e-16 ***
## cses 1.9127 0.1084 7076.1669 17.644 < 2e-16 ***
## female -1.2402 0.1586 5173.8657 -7.821 6.29e-15 ***
## minority -2.8871 0.2009 3610.0411 -14.373 < 2e-16 ***
## meanses 3.9634 0.3324 170.3640 11.924 < 2e-16 ***
## zsize 0.4154 0.1355 159.0317 3.065 0.00256 **
## sector 2.1179 0.2979 153.0838 7.109 4.16e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) cses female minrty meanss zsize
## cses -0.016
## female -0.245 0.050
## minority -0.041 0.153 0.016
## meanses 0.122 0.041 0.048 0.256
## zsize -0.837 -0.013 0.019 -0.089 -0.056
## sector -0.653 -0.023 -0.006 -0.148 -0.360 0.437
summary(model2 <- lmer(mathach ~ 1 + cses + female + minority + meanses + zsize + sector + cses*zsize + (1 |id), data=hsb, REML=FALSE))
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: mathach ~ 1 + cses + female + minority + meanses + zsize + sector +
## cses * zsize + (1 | id)
## Data: hsb
##
## AIC BIC logLik deviance df.resid
## 46301.6 46370.4 -23140.8 46281.6 7175
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.3048 -0.7227 0.0349 0.7618 2.9425
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 1.61 1.269
## Residual 35.85 5.987
## Number of obs: 7185, groups: id, 160
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 12.3967 0.3576 172.2227 34.667 < 2e-16 ***
## cses 1.3867 0.2176 7027.2675 6.373 1.97e-10 ***
## female -1.2422 0.1585 5176.8006 -7.838 5.53e-15 ***
## minority -2.8687 0.2009 3608.6721 -14.280 < 2e-16 ***
## meanses 3.9711 0.3324 170.3710 11.948 < 2e-16 ***
## zsize 0.4161 0.1355 159.0309 3.070 0.00252 **
## sector 2.1139 0.2979 153.0911 7.097 4.46e-11 ***
## cses:zsize 0.2909 0.1044 7023.0769 2.787 0.00533 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) cses female minrty meanss zsize sector
## cses -0.005
## female -0.245 0.029
## minority -0.041 0.048 0.015
## meanses 0.122 0.013 0.048 0.256
## zsize -0.837 -0.008 0.019 -0.089 -0.056
## sector -0.653 -0.007 -0.006 -0.148 -0.360 0.437
## cses:zsize -0.003 -0.867 -0.005 0.033 0.008 0.002 -0.005
summary(model3 <- lmer(mathach ~ 1 + cses + female + minority + meanses + zsize + sector
+ cses*zsize + cses*sector + female*meanses + female*zsize + female*sector
+ minority*meanses + minority*zsize + minority*sector
+ (1 + cses|id), data=hsb, REML=FALSE))
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: mathach ~ 1 + cses + female + minority + meanses + zsize + sector +
## cses * zsize + cses * sector + female * meanses + female *
## zsize + female * sector + minority * meanses + minority *
## zsize + minority * sector + (1 + cses | id)
## Data: hsb
##
## AIC BIC logLik deviance df.resid
## 46268.7 46399.4 -23115.4 46230.7 7166
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.1998 -0.7180 0.0320 0.7585 3.0068
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 1.6336 1.2781
## cses 0.1567 0.3959 0.08
## Residual 35.5190 5.9598
## Number of obs: 7185, groups: id, 160
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 12.01798 0.43268 318.08997 27.776 < 2e-16 ***
## cses 2.26181 0.31214 151.43779 7.246 2.04e-11 ***
## female -0.35821 0.42171 6761.19516 -0.849 0.395678
## minority -4.08707 0.61827 3074.11945 -6.610 4.50e-11 ***
## meanses 4.23444 0.46291 404.78301 9.147 < 2e-16 ***
## zsize 0.69323 0.17582 359.44252 3.943 9.68e-05 ***
## sector 1.73886 0.38075 308.07750 4.567 7.16e-06 ***
## cses:zsize 0.03215 0.12291 157.93699 0.262 0.794000
## cses:sector -0.99850 0.25375 151.90533 -3.935 0.000126 ***
## female:meanses -0.05433 0.44329 2820.95011 -0.123 0.902460
## female:zsize -0.41608 0.16767 6582.42896 -2.481 0.013109 *
## female:sector -0.23677 0.39242 2664.68437 -0.603 0.546322
## minority:meanses -0.80456 0.49179 2167.15439 -1.636 0.101991
## minority:zsize 0.06961 0.22129 2386.42217 0.315 0.753109
## minority:sector 2.06971 0.49486 3273.08184 4.182 2.96e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation matrix not shown by default, as p = 15 > 12.
## Use print(x, correlation=TRUE) or
## vcov(x) if you need it
This required some fineshing to get is to fit (well sort of)
summary(model4 <- lmer(mathach ~ 1 + female + minority + cses + meanses + zsize
+ sector + cses*zsize + cses*sector + female*meanses
+ female*zsize + female*sector + minority*meanses
+ minority*zsize + minority*sector
+ (1 + female + minority + cses |id), data=hsb, REML=FALSE,
control= lmerControl(
check.conv.singular= .makeCC(action = "message", tol = 1e-4),
check.conv.hess = .makeCC(action = "warning", tol = 1e-6),
check.conv.grad = .makeCC("warning", tol = .002, relTol = NULL))))
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: mathach ~ 1 + female + minority + cses + meanses + zsize + sector +
## cses * zsize + cses * sector + female * meanses + female *
## zsize + female * sector + minority * meanses + minority *
## zsize + minority * sector + (1 + female + minority + cses | id)
## Data: hsb
## Control: lmerControl(check.conv.singular = .makeCC(action = "message",
## tol = 1e-04), check.conv.hess = .makeCC(action = "warning",
## tol = 1e-06), check.conv.grad = .makeCC("warning", tol = 0.002,
## relTol = NULL))
##
## AIC BIC logLik deviance df.resid
## 46272.8 46451.7 -23110.4 46220.8 7159
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.2037 -0.7149 0.0334 0.7609 2.9991
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 2.2268 1.4923
## female 0.7022 0.8380 -0.76
## minority 0.8756 0.9358 -0.12 0.24
## cses 0.1446 0.3803 0.26 -0.32 -0.35
## Residual 35.3183 5.9429
## Number of obs: 7185, groups: id, 160
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 12.02587 0.46923 139.24621 25.629 < 2e-16 ***
## female -0.34012 0.46377 123.25042 -0.733 0.464723
## minority -4.20661 0.67149 157.13987 -6.265 3.43e-09 ***
## cses 2.28237 0.31118 148.15982 7.335 1.35e-11 ***
## meanses 4.22063 0.50040 181.84030 8.435 1.01e-14 ***
## zsize 0.67978 0.18960 156.33377 3.585 0.000449 ***
## sector 1.73592 0.41738 134.29183 4.159 5.67e-05 ***
## cses:zsize 0.01961 0.12264 154.94416 0.160 0.873183
## cses:sector -1.00330 0.25288 147.66573 -3.967 0.000113 ***
## female:meanses -0.03191 0.48383 165.52237 -0.066 0.947498
## female:zsize -0.42547 0.18382 132.57553 -2.315 0.022174 *
## female:sector -0.30052 0.42847 142.90967 -0.701 0.484214
## minority:meanses -0.77917 0.53918 120.09086 -1.445 0.151030
## minority:zsize 0.11075 0.24035 141.49886 0.461 0.645657
## minority:sector 2.11884 0.54302 132.65245 3.902 0.000151 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation matrix not shown by default, as p = 15 > 12.
## Use print(x, correlation=TRUE) or
## vcov(x) if you need it
We can relax convergenece criteria on grad to make it look converged; for example set tol = .002 to tol=.007.
There are other things we do such as check.conv.singular which will check whether hat(T) is a good covariance matrix, i.e., “rules for checking for a singular fit, i.e. one where some parameters are on the boundary of the feasible space (for example, random effects variances equal to 0 or correlations between random effects equal to +/- 1.0)”
Other things that you can try:
* control= lmerControl(optimizer = “nloptwrap”,restart_edge = FALSE,boundary.tol = 0) * control= lmerControl(optimizer = “Nelder-Mead”) * check to make sure model is not misspecified * Try package brms and use Bayesian estimation
To me this model looks fine:
relgrad <- with(model4@optinfo$derivs,solve(Hessian,gradient))
max(abs(relgrad))
## [1] 0.0001473563
isSingular(model4)
## [1] FALSE
This pulls the t-statistic out from summary and squares it to get wald statistic, which is chi-square distribution
s4 <- summary(model4)
s4 <- as.data.frame(s4[10])
names(s4)<- c("Estimate","StdError","df t","t","Pr(>|t|)")
s4$df.Wald <- rep(1,nrow(s4))
s4$Wald4 <- s4$t**2
For nicer output
options(scipen = 999) # turn scientific notation off
print(s4, type = "html", digits=2)
## Estimate StdError df t t
## (Intercept) 12.026 0.47 139 25.629
## female -0.340 0.46 123 -0.733
## minority -4.207 0.67 157 -6.265
## cses 2.282 0.31 148 7.335
## meanses 4.221 0.50 182 8.435
## zsize 0.680 0.19 156 3.585
## sector 1.736 0.42 134 4.159
## cses:zsize 0.020 0.12 155 0.160
## cses:sector -1.003 0.25 148 -3.967
## female:meanses -0.032 0.48 166 -0.066
## female:zsize -0.425 0.18 133 -2.315
## female:sector -0.301 0.43 143 -0.701
## minority:meanses -0.779 0.54 120 -1.445
## minority:zsize 0.111 0.24 141 0.461
## minority:sector 2.119 0.54 133 3.902
## Pr(>|t|)
## (Intercept) 0.0000000000000000000000000000000000000000000000000000014
## female 0.4647226089426657047454227722482755780220031738281250000
## minority 0.0000000034287565763701115684886078227577854704577475786
## cses 0.0000000000134818010041399007721205083321081019676057622
## meanses 0.0000000000000100772051822234072157059847629767546095536
## zsize 0.0004494250134958362175030543994580511935055255889892578
## sector 0.0000566599902953020270572614958126678175176493823528290
## cses:zsize 0.8731830626393966010567737612291239202022552490234375000
## cses:sector 0.0001129667428675000062730099381624881971220020204782486
## female:meanses 0.9474975169778295791545019710611086338758468627929687500
## female:zsize 0.0221735294176353507633603356907769921235740184783935547
## female:sector 0.4842140231989182641569868792430497705936431884765625000
## minority:meanses 0.1510295365356642827148192509412183426320552825927734375
## minority:zsize 0.6456569852918154772680736641632393002510070800781250000
## minority:sector 0.0001510460284242881817771270158701213404128793627023697
## df.Wald Wald4
## (Intercept) 1 656.8352
## female 1 0.5378
## minority 1 39.2452
## cses 1 53.7963
## meanses 1 71.1410
## zsize 1 12.8550
## sector 1 17.2977
## cses:zsize 1 0.0256
## cses:sector 1 15.7404
## female:meanses 1 0.0043
## female:zsize 1 5.3572
## female:sector 1 0.4919
## minority:meanses 1 2.0884
## minority:zsize 1 0.2123
## minority:sector 1 15.2250
options(scipen = 0) # turn scientific notation on
Three different methods are given here: model based, bootstrap and profile.
Recall that s4 is the summary from model 4. To get the names of what you can extract from s3, type {r, namess3] names(s4)
We will compute confidence intervales based on model estimated standarad errors,
s4$upper <- s4$Estimate - qnorm(.025)*s4$StdError
s4$lower <- s4$Estimate - qnorm(.975)*s4$StdError
s4
## Estimate StdError df t t Pr(>|t|)
## (Intercept) 12.02586560 0.4692326 139.2462 25.62879656 1.424157e-54
## female -0.34011821 0.4637714 123.2504 -0.73337465 4.647226e-01
## minority -4.20661089 0.6714890 157.1399 -6.26460152 3.428757e-09
## cses 2.28236851 0.3111786 148.1598 7.33459427 1.348180e-11
## meanses 4.22062606 0.5003994 181.8403 8.43451430 1.007721e-14
## zsize 0.67977608 0.1895961 156.3338 3.58539124 4.494250e-04
## sector 1.73592150 0.4173835 134.2918 4.15905625 5.665999e-05
## cses:zsize 0.01960739 0.1226374 154.9442 0.15988098 8.731831e-01
## cses:sector -1.00329861 0.2528843 147.6657 -3.96742213 1.129667e-04
## female:meanses -0.03190838 0.4838296 165.5224 -0.06594963 9.474975e-01
## female:zsize -0.42546982 0.1838225 132.5755 -2.31456919 2.217353e-02
## female:sector -0.30051640 0.4284719 142.9097 -0.70136785 4.842140e-01
## minority:meanses -0.77917164 0.5391757 120.0909 -1.44511649 1.510295e-01
## minority:zsize 0.11074875 0.2403456 141.4989 0.46078961 6.456570e-01
## minority:sector 2.11883785 0.5430243 132.6525 3.90192082 1.510460e-04
## df.Wald Wald4 upper lower
## (Intercept) 1 6.568352e+02 12.9455445 11.1061867
## female 1 5.378384e-01 0.5688571 -1.2490935
## minority 1 3.924523e+01 -2.8905167 -5.5227051
## cses 1 5.379627e+01 2.8922673 1.6724697
## meanses 1 7.114103e+01 5.2013909 3.2398612
## zsize 1 1.285503e+01 1.0513775 0.3081746
## sector 1 1.729775e+01 2.5539781 0.9178648
## cses:zsize 1 2.556193e-02 0.2599723 -0.2207576
## cses:sector 1 1.574044e+01 -0.5076546 -1.4989426
## female:meanses 1 4.349353e-03 0.9163802 -0.9801969
## female:zsize 1 5.357231e+00 -0.0651844 -0.7857552
## female:sector 1 4.919169e-01 0.5392731 -1.1403059
## minority:meanses 1 2.088362e+00 0.2775932 -1.8359365
## minority:zsize 1 2.123271e-01 0.5818175 -0.3603200
## minority:sector 1 1.522499e+01 3.1831459 1.0545298
round(s4[,8:9],digits=2)
## upper lower
## (Intercept) 12.95 11.11
## female 0.57 -1.25
## minority -2.89 -5.52
## cses 2.89 1.67
## meanses 5.20 3.24
## zsize 1.05 0.31
## sector 2.55 0.92
## cses:zsize 0.26 -0.22
## cses:sector -0.51 -1.50
## female:meanses 0.92 -0.98
## female:zsize -0.07 -0.79
## female:sector 0.54 -1.14
## minority:meanses 0.28 -1.84
## minority:zsize 0.58 -0.36
## minority:sector 3.18 1.05
This can take a long time….
boot.ci <- confint(model1, method="boot", seed=1125, nsim=1000, level=0.95)
## Computing bootstrap confidence intervals ...
round(boot.ci,digits=4)
## 2.5 % 97.5 %
## .sig01 1.0155 1.4559
## .sigma 5.8930 6.0834
## (Intercept) 11.7390 13.1435
## cses 1.7063 2.1080
## female -1.5455 -0.9280
## minority -3.2955 -2.4970
## meanses 3.2806 4.6013
## zsize 0.1380 0.6793
## sector 1.5392 2.7345
This doesn’t take as much time as bootstrap,
profile.ci<- confint(model1, nsim=1000, level=0.95)
## Computing profile confidence intervals ...
round(profile.ci,digits=4)
## 2.5 % 97.5 %
## .sig01 1.0662 1.4994
## .sigma 5.8930 6.0912
## (Intercept) 11.6924 13.1029
## cses 1.7002 2.1252
## female -1.5513 -0.9288
## minority -3.2812 -2.4934
## meanses 3.3093 4.6198
## zsize 0.1489 0.6836
## sector 1.5308 2.7060
summary(model5 <- lmer(mathach ~ 1 + female + minority + cses + meanses + zsize + sector
+ cses*zsize + cses*sector + female*meanses + female*zsize + female*sector
+ minority*meanses + minority*zsize + minority*sector
+ (1 + female + cses |id), data=hsb, REML=FALSE) )
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: mathach ~ 1 + female + minority + cses + meanses + zsize + sector +
## cses * zsize + cses * sector + female * meanses + female *
## zsize + female * sector + minority * meanses + minority *
## zsize + minority * sector + (1 + female + cses | id)
## Data: hsb
##
## AIC BIC logLik deviance df.resid
## 46268.5 46419.8 -23112.2 46224.5 7163
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.1742 -0.7150 0.0297 0.7602 2.9993
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 2.2491 1.4997
## female 0.6621 0.8137 -0.69
## cses 0.1436 0.3789 0.22 -0.49
## Residual 35.4175 5.9513
## Number of obs: 7185, groups: id, 160
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 12.00065 0.47017 166.45639 25.524 < 2e-16 ***
## female -0.33177 0.46182 123.22654 -0.718 0.473866
## minority -4.13739 0.61408 2714.13003 -6.738 1.96e-11 ***
## cses 2.27126 0.31098 149.76016 7.304 1.54e-11 ***
## meanses 4.20243 0.49905 206.80067 8.421 6.23e-15 ***
## zsize 0.69898 0.18989 188.84073 3.681 0.000303 ***
## sector 1.74109 0.41675 152.12986 4.178 4.94e-05 ***
## cses:zsize 0.02701 0.12246 156.09767 0.221 0.825712
## cses:sector -0.99632 0.25269 149.68479 -3.943 0.000123 ***
## female:meanses -0.02628 0.47965 167.01772 -0.055 0.956372
## female:zsize -0.42767 0.18305 132.50884 -2.336 0.020976 *
## female:sector -0.28671 0.42591 143.17928 -0.673 0.501926
## minority:meanses -0.83179 0.48772 1890.00909 -1.705 0.088273 .
## minority:zsize 0.09301 0.21930 2043.69480 0.424 0.671511
## minority:sector 2.10585 0.49283 3073.13947 4.273 1.99e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation matrix not shown by default, as p = 15 > 12.
## Use print(x, correlation=TRUE) or
## vcov(x) if you need it
We might want to take a step back, and summarize what we’re done so far,
## Summary of models fit so far
screenreg(list(model1,model2,model3,model4,model5),
custom.model.names=c("Model 1", "Model 2","Model 3","Model 4", "Model 5") )
##
## =======================================================================================================
## Model 1 Model 2 Model 3 Model 4 Model 5
## -------------------------------------------------------------------------------------------------------
## (Intercept) 12.40 *** 12.40 *** 12.02 *** 12.03 *** 12.00 ***
## (0.36) (0.36) (0.43) (0.47) (0.47)
## cses 1.91 *** 1.39 *** 2.26 *** 2.28 *** 2.27 ***
## (0.11) (0.22) (0.31) (0.31) (0.31)
## female -1.24 *** -1.24 *** -0.36 -0.34 -0.33
## (0.16) (0.16) (0.42) (0.46) (0.46)
## minority -2.89 *** -2.87 *** -4.09 *** -4.21 *** -4.14 ***
## (0.20) (0.20) (0.62) (0.67) (0.61)
## meanses 3.96 *** 3.97 *** 4.23 *** 4.22 *** 4.20 ***
## (0.33) (0.33) (0.46) (0.50) (0.50)
## zsize 0.42 ** 0.42 ** 0.69 *** 0.68 *** 0.70 ***
## (0.14) (0.14) (0.18) (0.19) (0.19)
## sector 2.12 *** 2.11 *** 1.74 *** 1.74 *** 1.74 ***
## (0.30) (0.30) (0.38) (0.42) (0.42)
## cses:zsize 0.29 ** 0.03 0.02 0.03
## (0.10) (0.12) (0.12) (0.12)
## cses:sector -1.00 *** -1.00 *** -1.00 ***
## (0.25) (0.25) (0.25)
## female:meanses -0.05 -0.03 -0.03
## (0.44) (0.48) (0.48)
## female:zsize -0.42 * -0.43 * -0.43 *
## (0.17) (0.18) (0.18)
## female:sector -0.24 -0.30 -0.29
## (0.39) (0.43) (0.43)
## minority:meanses -0.80 -0.78 -0.83
## (0.49) (0.54) (0.49)
## minority:zsize 0.07 0.11 0.09
## (0.22) (0.24) (0.22)
## minority:sector 2.07 *** 2.12 *** 2.11 ***
## (0.49) (0.54) (0.49)
## -------------------------------------------------------------------------------------------------------
## AIC 46307.34 46301.58 46268.70 46272.84 46268.49
## BIC 46369.26 46370.37 46399.42 46451.72 46419.84
## Log Likelihood -23144.67 -23140.79 -23115.35 -23110.42 -23112.24
## Num. obs. 7185 7185 7185 7185 7185
## Num. groups: id 160 160 160 160 160
## Var: id (Intercept) 1.61 1.61 1.63 2.23 2.25
## Var: Residual 35.89 35.85 35.52 35.32 35.42
## Var: id cses 0.16 0.14 0.14
## Cov: id (Intercept) cses 0.04 0.15 0.13
## Var: id female 0.70 0.66
## Var: id minority 0.88
## Cov: id (Intercept) female -0.95 -0.84
## Cov: id (Intercept) minority -0.17
## Cov: id female minority 0.19
## Cov: id female cses -0.10 -0.15
## Cov: id minority cses -0.13
## =======================================================================================================
## *** p < 0.001; ** p < 0.01; * p < 0.05
anova(model1,model2)
## Data: hsb
## Models:
## model1: mathach ~ 1 + cses + female + minority + meanses + zsize + sector +
## model1: (1 | id)
## model2: mathach ~ 1 + cses + female + minority + meanses + zsize + sector +
## model2: cses * zsize + (1 | id)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## model1 9 46307 46369 -23145 46289
## model2 10 46302 46370 -23141 46282 7.7638 1 0.00533 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The function computes test statistic for
Ho: LGamma = 0 Ha: LGamma <> 0
Where L is an (c X p) matirx of c contrast on the p fixed effects Gamma is a (p X 1) vector of fixed effects
Input:
Output: * F statistics * Numerator df * Guess at denominator df * p-value for F * X2 (chi-square test statistic) * df * p-value for X2
Requirements: Model should be fit by lmer with lmerTest (the latter is used as a guess at den df for F).
Note that the order of elements in contrast matrix (columns) is the same as the order in the model that you input
cmodel <- lmer(mathach ~ 1 + cses + female + minority + meanses + zsize + sector
+ cses*zsize + cses*sector + female*meanses + female*zsize
+ female*sector + minority*meanses + minority*zsize
+ minority*sector
+ (1 + cses + female | id), data=hsb, REML=FALSE)
summary(cmodel)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: mathach ~ 1 + cses + female + minority + meanses + zsize + sector +
## cses * zsize + cses * sector + female * meanses + female *
## zsize + female * sector + minority * meanses + minority *
## zsize + minority * sector + (1 + cses + female | id)
## Data: hsb
##
## AIC BIC logLik deviance df.resid
## 46268.5 46419.8 -23112.2 46224.5 7163
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.1742 -0.7150 0.0297 0.7602 2.9993
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 2.2497 1.4999
## cses 0.1437 0.3791 0.22
## female 0.6628 0.8141 -0.69 -0.49
## Residual 35.4173 5.9512
## Number of obs: 7185, groups: id, 160
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 12.00063 0.47021 166.42528 25.522 < 2e-16 ***
## cses 2.27127 0.31099 149.77952 7.303 1.54e-11 ***
## female -0.33175 0.46186 123.20631 -0.718 0.473932
## minority -4.13740 0.61408 2714.18953 -6.738 1.96e-11 ***
## meanses 4.20244 0.49909 206.75627 8.420 6.26e-15 ***
## zsize 0.69899 0.18991 188.80267 3.681 0.000303 ***
## sector 1.74107 0.41679 152.10249 4.177 4.95e-05 ***
## cses:zsize 0.02701 0.12246 156.11671 0.221 0.825740
## cses:sector -0.99632 0.25270 149.70315 -3.943 0.000123 ***
## female:meanses -0.02626 0.47969 166.98281 -0.055 0.956409
## female:zsize -0.42768 0.18307 132.48531 -2.336 0.020984 *
## female:sector -0.28671 0.42594 143.15476 -0.673 0.501952
## minority:meanses -0.83175 0.48772 1889.99079 -1.705 0.088288 .
## minority:zsize 0.09301 0.21930 2043.69457 0.424 0.671509
## minority:sector 2.10585 0.49283 3073.21670 4.273 1.99e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation matrix not shown by default, as p = 15 > 12.
## Use print(x, correlation=TRUE) or
## vcov(x) if you need it
The constrast function takes as input the model object and a matrix of constrast coefficients of size (number of effects x number of effects). Note that the order of elements in contrast matrix (columns) is the same as the order in the model that you input. The function returns, F statistics, numerator degrees of freedom, a guess at the denominator degrees of freedom, a Wald chi-square statistics, degrees of freedom and p-values.
To get tests for example H_o: gamma_10=0,, the constrast matrix “L” is a (1 x 15) matrix with all elementes equal to 0 except the 2nd one. This means that each fixed effect will be tested (these are the 1’s on the diagonal)
L1 <- matrix(0,nrow=1,ncol=15)
L1[1,2] <- 1
contrast(cmodel,L1)
## F num df den df guess p-value Wald df
## 5.334023e+01 1.000000e+00 1.497795e+02 1.542572e-11 5.334023e+01 1.000000e+00
## p-value
## 2.805026e-13
To test whether the coeffieicent for cses and meanses are equal,
L2 <- matrix(0,nrow=1, ncol=15)
L2[1,2] <- 1
L2[1,5]
## [1] 0
L2
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
## [1,] 0 1 0 0 0 0 0 0 0 0 0 0 0 0
## [,15]
## [1,] 0
round(contrast(cmodel,L2),digits=2)
## F num df den df guess p-value Wald df
## 53.34 1.00 149.78 0.00 53.34 1.00
## p-value
## 0.00
For a test of all the crossed-level effects, need to redefine L,
L <- matrix(0, nrow=5, ncol=15)
L[5,14] <- 1
L[4,13] <- 1
L[3,11] <- 1
L[2,10] <- 1
L[1, 8] <- 1
L
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
## [1,] 0 0 0 0 0 0 0 1 0 0 0 0 0 0
## [2,] 0 0 0 0 0 0 0 0 0 1 0 0 0 0
## [3,] 0 0 0 0 0 0 0 0 0 0 1 0 0 0
## [4,] 0 0 0 0 0 0 0 0 0 0 0 0 1 0
## [5,] 0 0 0 0 0 0 0 0 0 0 0 0 0 1
## [,15]
## [1,] 0
## [2,] 0
## [3,] 0
## [4,] 0
## [5,] 0
round(contrast(cmodel,L),digits=2)
## F num df den df guess p-value Wald df
## 1.73 5.00 156.12 0.13 8.66 5.00
## p-value
## 0.12
After a model has been fit, sandwich/robust/emprical standard errors for fixed effects and be computed usin the function ‘robust’. The function reports both the model and sandwich se as well as t-tests for them. If the model is good, the sandwich and model based should be very similar (and results of t-test should be the same).
The basic syntax is
robust(inmodel,response,idvar,dftype)
where
r2 <- robust(model2, hsb$mathach, hsb$id, "satterthwaite")
round(r2, digits=4)
## Fixed Est. satterthwaite Model se. Model t p-value Robust se
## (Intercept) 12.3967 172.2227 0.3576 34.6672 0.0000 0.3725
## cses 1.3867 7027.2675 0.2176 6.3727 0.0000 0.2459
## female -1.2422 5176.8006 0.1585 -7.8378 0.0000 0.1726
## minority -2.8687 3608.6721 0.2009 -14.2796 0.0000 0.2378
## meanses 3.9711 170.3710 0.3324 11.9480 0.0000 0.3470
## zsize 0.4161 159.0309 0.1355 3.0702 0.0025 0.1420
## sector 2.1139 153.0911 0.2979 7.0966 0.0000 0.3068
## cses:zsize 0.2909 7023.0769 0.1044 2.7871 0.0053 0.1254
## Robust t p-value
## (Intercept) 33.2755 0.0000
## cses 5.6391 0.0000
## female -7.1993 0.0000
## minority -12.0624 0.0000
## meanses 11.4437 0.0000
## zsize 2.9296 0.0039
## sector 6.8907 0.0000
## cses:zsize 2.3205 0.0203
r3 <- robust(model3, hsb$mathach, hsb$id, "satterthwaite")
round(r3, digits=4)
## Fixed Est. satterthwaite Model se. Model t p-value Robust se
## (Intercept) 12.0180 318.0900 0.4327 27.7758 0.0000 0.4209
## cses 2.2618 151.4378 0.3121 7.2461 0.0000 0.3201
## female -0.3582 6761.1952 0.4217 -0.8494 0.3957 0.4061
## minority -4.0871 3074.1195 0.6183 -6.6105 0.0000 0.6494
## meanses 4.2344 404.7830 0.4629 9.1474 0.0000 0.4938
## zsize 0.6932 359.4425 0.1758 3.9429 0.0001 0.1785
## sector 1.7389 308.0775 0.3807 4.5670 0.0000 0.3905
## cses:zsize 0.0321 157.9370 0.1229 0.2616 0.7940 0.1349
## cses:sector -0.9985 151.9053 0.2538 -3.9349 0.0001 0.2496
## female:meanses -0.0543 2820.9501 0.4433 -0.1226 0.9025 0.4240
## female:zsize -0.4161 6582.4290 0.1677 -2.4815 0.0131 0.1511
## female:sector -0.2368 2664.6844 0.3924 -0.6034 0.5463 0.4078
## minority:meanses -0.8046 2167.1544 0.4918 -1.6360 0.1020 0.4895
## minority:zsize 0.0696 2386.4222 0.2213 0.3146 0.7531 0.2348
## minority:sector 2.0697 3273.0818 0.4949 4.1824 0.0000 0.5400
## Robust t p-value
## (Intercept) 28.5541 0.0000
## cses 7.0659 0.0000
## female -0.8820 0.3778
## minority -6.2938 0.0000
## meanses 8.5759 0.0000
## zsize 3.8829 0.0001
## sector 4.4526 0.0000
## cses:zsize 0.2383 0.8119
## cses:sector -4.0005 0.0001
## female:meanses -0.1281 0.8981
## female:zsize -2.7540 0.0059
## female:sector -0.5807 0.5615
## minority:meanses -1.6436 0.1004
## minority:zsize 0.2965 0.7669
## minority:sector 3.8329 0.0001
And one more example
r4 <- robust(model4, hsb$mathach, hsb$id, "satterthwaite")
round(r4, digits=4)
The is based off of results in Delattre, M., Lavielle, M., Poursat, M.A. (2014). A note on BIC in mixed-effects models. Electronic Journal of Statistics, 8, 456–475. doi: 10.1214/140EJS890, url= https://projecteuclid.org/download/pdfview_1/euclid.ejs/1399035845
This functions gives an approximation for BIC for HLM that used 2 sample sizes: The number of groups/clusters and the total number of observations. The derivation used balanced design
This should be a good approximation for large number of groups andlarge group sizes
I transformed their result such that smaller is better. The function computes 2 varients: one based on total sample size and the other used the harmonical mean of n_j time number of clusters.
Syntax:
bic.hlm(model.in,cluster.id)
where
A bunch of examples,
bic.hlm(model1, hsb$id)
## n groups total obs haroninc.mean.nj n random n fixed deviance bic.new
## id 160 7185 41.05874 2 7 46289.34 46361.65
## bic.harm bic.ngrps bic.ntot aic
## id 46361.02 46335.02 46369.26 46307.34
bic.hlm(model2, hsb$id)
## n groups total obs haroninc.mean.nj n random n fixed deviance bic.new
## id 160 7185 41.05874 2 8 46281.58 46362.76
## bic.harm bic.ngrps bic.ntot aic
## id 46362.05 46332.33 46370.37 46301.58
bic.hlm(model3, hsb$id)
## n groups total obs haroninc.mean.nj n random n fixed deviance bic.new
## id 160 7185 41.05874 4 15 46230.7 46384.2
## bic.harm bic.ngrps bic.ntot aic
## id 46382.85 46327.13 46399.42 46268.7
bic.hlm(model4, hsb$id)
## n groups total obs haroninc.mean.nj n random n fixed deviance bic.new
## id 160 7185 41.05874 11 15 46220.84 46409.87
## bic.harm bic.ngrps bic.ntot aic
## id 46408.52 46352.8 46451.72 46272.84
bic.hlm(model5, hsb$id)
## n groups total obs haroninc.mean.nj n random n fixed deviance bic.new
## id 160 7185 41.05874 7 15 46224.49 46393.21
## bic.harm bic.ngrps bic.ntot aic
## id 46391.86 46336.14 46419.84 46268.49
Thie function computes Rsq values that equal the proportional reduction of prediction errorvariance accounted for the level 1 and Level 2 Models.
In population these will always be positive. If they are negative, this is indicates model miss-specfication. If they get smaller as add variables, this also could be model miss-specification, in particular if implicit restrictions are places on parameters.
There were proposed by Snijders, TAB, Bosker, RJ (1994). “Modeled variance in two-level models.” Sociological Methods & Research, 22(3), 342-363.
Syntax:
hlmRsq function(dataset,model1)
where
dataset is the data
model is the model for which you want R1sq and R2sq #
Requirements:
variable identifying group/cluster should be named "id"
first (set of fixed) variable(s) in formula are the random ones
REQUIRES that group/cluster/etc id is actually called “id”
id <- hsb$id
Some random intercept models:
model.b <- lmer(mathach ~ 1 + cses + (1 | id),data=hsb,REML=FALSE)
hlmRsq(hsb,model.b)
## harmonic.mean R1sq R2sq
## [1,] 41.05874 0.04491355 0.005671609
model.d <- lmer(mathach ~ 1 + cses + minority + female + (1 | id),data=hsb,REML=FALSE)
hlmRsq(hsb,model.d)
## harmonic.mean R1sq R2sq
## [1,] 41.05874 0.1176997 0.2539772
Random intercept and slope models
summary(model.bis<- lmer(mathach ~ 1 + cses + meanses + (1 + cses| id),data=hsb,REML=FALSE))
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: mathach ~ 1 + cses + meanses + (1 + cses | id)
## Data: hsb
##
## AIC BIC logLik deviance df.resid
## 46567.2 46615.3 -23276.6 46553.2 7178
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.1778 -0.7260 0.0158 0.7565 2.9403
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 2.6446 1.6262
## cses 0.6701 0.8186 -0.19
## Residual 36.7132 6.0591
## Number of obs: 7185, groups: id, 160
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 12.6588 0.1482 155.6389 85.41 <2e-16 ***
## cses 2.1912 0.1276 156.3840 17.17 <2e-16 ***
## meanses 5.8959 0.3577 154.8624 16.48 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) cses
## cses -0.083
## meanses -0.004 0.004
hlmRsq(hsb,model.bis)
## harmonic.mean R1sq R2sq
## [1,] 41.05874 0.1697779 0.6290588
summary(model.ris2 <- lmer(mathach ~ 1 + cses + meanses + minority + female + minority + pracad + sector +(1+cses + minority | id),data=hsb,REML=FALSE))
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: mathach ~ 1 + cses + meanses + minority + female + minority +
## pracad + sector + (1 + cses + minority | id)
## Data: hsb
##
## AIC BIC logLik deviance df.resid
## 46292.5 46388.8 -23132.3 46264.5 7171
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.2765 -0.7210 0.0317 0.7604 2.9219
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 1.3833 1.1761
## cses 0.4069 0.6379 0.37
## minority 1.4721 1.2133 -0.04 -0.92
## Residual 35.4840 5.9568
## Number of obs: 7185, groups: id, 160
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 11.9953 0.3782 177.0312 31.718 < 2e-16 ***
## cses 1.8927 0.1203 152.3099 15.737 < 2e-16 ***
## meanses 3.1944 0.4124 166.6987 7.745 8.82e-13 ***
## minority -2.9835 0.2318 130.6876 -12.869 < 2e-16 ***
## female -1.2296 0.1582 5170.4769 -7.775 9.06e-15 ***
## pracad 3.2474 0.8136 158.0909 3.991 0.0001 ***
## sector 0.8111 0.3338 149.7558 2.430 0.0163 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) cses meanss minrty female pracad
## cses 0.021
## meanses 0.559 0.024
## minority -0.050 -0.055 0.210
## female -0.248 0.046 0.017 0.013
## pracad -0.865 -0.007 -0.592 -0.058 0.037
## sector 0.309 -0.001 0.133 -0.036 -0.036 -0.623
hlmRsq(hsb,model.ris2)
## harmonic.mean R1sq R2sq
## [1,] 41.05874 0.2192013 0.7387274
This is used for testing whether H_o: tau_0^2 = tau_01 =….= tau_0N =0
e.g.,
(p <- .5*(pchisq(5.40, 1, lower.tail=FALSE) + pchisq(5.40, 2, lower.tail=FALSE)))
## [1] 0.04367113
(p <- .5*(pchisq(1.55, 3, lower.tail=FALSE) + pchisq(1.55, 4, lower.tail=FALSE)) )
## [1] 0.7442643