Compute multilevel linear models for complex cluster designs with multiply imputed variables, based on the jackknife (JK1, JK2) procedure. Conceptually, the function combines replication methods with methods for multiply imputed data. Technically, it is a wrapper for the BIFIE.twolevelreg function of the BIFIEsurvey package; repLmer only adds functionality for trend estimation. Please note that the function is not suitable for logistic (logit/probit) models.
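
To illustrate what combining replication methods with methods for multiply imputed data can mean, the following sketch pools one coefficient across imputations with Rubin's rules. All numbers are invented, and it is an assumption that the pooling done internally takes exactly this form; the sketch only shows the general idea.

```r
# Invented per-imputation results for one coefficient (M = 3 imputations)
est <- c(24.8, 26.1, 25.3)   # point estimates, one per imputation
se  <- c(3.0, 3.2, 2.9)      # replication-based standard errors, one per imputation

M <- length(est)
pooled_est  <- mean(est)             # pooled point estimate
within_var  <- mean(se^2)            # average sampling variance
between_var <- var(est)              # variance of estimates across imputations
total_var   <- within_var + (1 + 1 / M) * between_var
pooled_se   <- sqrt(total_var)

c(est = pooled_est, se = pooled_se)
```

The pooled standard error is never smaller than the square root of the average sampling variance, because the between-imputation component adds uncertainty due to missing data.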

repLmer(datL, ID, wgt = NULL, L1wgt=NULL, L2wgt=NULL, type = c("JK2", "JK1"),
            PSU = NULL, repInd = NULL, jkfac = NULL, rho = NULL, imp=NULL,
            group = NULL, trend = NULL, dependent, formula.fixed, formula.random,
            doCheck = TRUE, na.rm = FALSE, clusters, verbose = TRUE)

Arguments

datL

Data frame in the long format (i.e. each line represents one ID unit in one imputation of one nest) containing all variables for analysis.
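
As a rough illustration of the long format, the sketch below builds a toy data frame with one row per ID unit, imputation, and nest. The column names follow the example data used further down, but the values are made up.

```r
# Toy long-format data: 2 students x 3 imputations x 1 nest (all values invented)
datL <- expand.grid(idstud = c("S1", "S2"), imp = 1:3, nest = 1,
                    stringsAsFactors = FALSE)
datL$score <- c(498, 512, 503, 509, 495, 515)  # one imputed value per row
datL <- datL[order(datL$idstud, datL$imp), ]

# Each ID unit should appear exactly once per imputation within a nest
rows_per_cell <- table(datL$idstud, datL$imp)
all(rows_per_cell == 1)
```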

ID

Variable name or column number of student identifier (ID) variable. ID variable must not contain any missing values.

wgt

Optional: Variable name or column number of case weighting variable. If no weighting variable is specified, all cases will be equally weighted.

L1wgt

Name of Level 1 weight variable. This is optional. If it is not provided, L1wgt is calculated from the total weight (i.e., wgt) and L2wgt.

L2wgt

Name of Level 2 weight variable
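
How L1wgt is derived from wgt and L2wgt is not spelled out here. A common convention, assumed in this sketch, is that the total weight is the product of the level 1 and level 2 weights, so the level 1 weight is the ratio of the two. The data are invented.

```r
# Toy data; column names mirror the arguments, values are invented
dat <- data.frame(idstud  = 1:4,
                  idclass = c("A", "A", "B", "B"),
                  wgt     = c(12, 12, 30, 20),  # total case weight
                  L2wgt   = c(4, 4, 10, 10))    # class-level weight, constant within class

# Assumption: wgt = L1wgt * L2wgt, hence L1wgt = wgt / L2wgt
dat$L1wgt <- dat$wgt / dat$L2wgt
dat
```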

type

Defines the replication method for cluster replicates that is to be applied. Depending on type, additional arguments must be specified (e.g., PSU and/or repInd).

PSU

Variable name or column number of variable indicating the primary sampling unit (PSU). When a jackknife procedure is applied, the PSU is the jackknife zone variable. If NULL, no cluster structure is assumed and standard errors are computed according to a random sample.

repInd

Variable name or column number of variable indicating replicate ID. In a jackknife procedure, this is the jackknife replicate variable. If NULL, no cluster structure is assumed and standard errors are computed according to a random sample.
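
To make the roles of the jackknife zone (PSU) and the jackknife replicate indicator (repInd) more concrete, the sketch below builds JK2-style replicate weights by hand. It shows the general paired-jackknife idea with a multiplication factor of 2; it is not necessarily how BIFIE.data.jack constructs its weights internally, and the data are invented.

```r
# Toy sample: 3 jackknife zones, each with two units coded 0/1 in 'jkrep'
dat <- data.frame(idstud = 1:6,
                  jkzone = c(1, 1, 2, 2, 3, 3),
                  jkrep  = c(0, 1, 0, 1, 0, 1),
                  wgt    = c(10, 10, 20, 20, 15, 15))

zones <- sort(unique(dat$jkzone))

# One replicate weight per zone: inside the zone the weight is multiplied by
# 2 * jkrep (i.e. by 0 or 2); outside the zone it is left unchanged
repW <- sapply(zones, function(z) {
  ifelse(dat$jkzone == z, dat$wgt * 2 * dat$jkrep, dat$wgt)
})
colnames(repW) <- paste0("w_zone", zones)
repW
```

In this balanced toy example every replicate weight column sums to the same total as the original weights, since doubling one half of a zone offsets dropping the other half.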

jkfac

Argument is passed to BIFIE.data.jack and specifies the factor for multiplying jackknife replicate weights.

rho

Fay factor for statistical inference. The argument is passed to the fayfac argument of the BIFIE.data.jack function from the BIFIEsurvey package. See the corresponding help page for further details. For convenience, if rho = NULL (the default) and type = "JK1", BIFIE.data.jack is called with jktype="JK_GROUP" and fayfac = rho, where \(\rho = (N_{cluster} - 1) \times N_{cluster}^{-1}\).
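
The default value of rho for type = "JK1" follows directly from the number of clusters. A minimal worked example with hypothetical cluster identifiers:

```r
# Hypothetical PSU (cluster) identifiers for eight cases
psu <- c("c1", "c1", "c2", "c3", "c3", "c3", "c4", "c5")

n_cluster <- length(unique(psu))      # 5 distinct clusters
rho <- (n_cluster - 1) / n_cluster    # (5 - 1) / 5
rho
```

With five clusters the factor is 0.8; it approaches 1 as the number of clusters grows.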

imp

Name or column number of the imputation variable.

group

Optional: column number or name of one grouping variable. Note: in contrast to repMean, only one grouping variable can be specified.

trend

Optional: name or column number of the trend variable which contains the measurement time of the survey. repLmer computes differences for all pairwise contrasts defined by the levels of the trend variable. For three measurement occasions, e.g. 2010, 2015, and 2020, contrasts (i.e. trends) are computed for 2010 vs. 2015, 2010 vs. 2020, and 2015 vs. 2020.
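
The set of pairwise contrasts for a given trend variable can be enumerated with base R. This sketch only lists the pairs; it does not reproduce how repLmer computes the trend estimates.

```r
# Levels of a hypothetical trend variable
years <- c(2010, 2015, 2020)

# All pairwise combinations, one contrast per row
pairs <- t(utils::combn(years, 2))
colnames(pairs) <- c("from", "to")
pairs
```

Three measurement occasions give three contrasts; in general, k occasions give k * (k - 1) / 2.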

dependent

Name or column number of the dependent variable

formula.fixed

An R formula for fixed effects

formula.random

An R formula for random effects

doCheck

Logical: Check the data for consistency before analysis? If TRUE, groups with insufficient data are excluded from the analysis to prevent subsequent functions from crashing.

na.rm

Logical: Should cases with missing values be dropped?

clusters

Variable name or column number of cluster variable.

verbose

Logical: Show analysis information on console?

Value

A list of data frames in the long format. The output can be summarized using the report function. The first element of the list is a list with either one (no trend analyses) or two (trend analyses) data frames with at least six columns each. For each subpopulation denoted by the group argument, each dependent variable, each parameter, and each coefficient, the corresponding value is given.

group

Denotes the group an analysis belongs to. If no groups were specified and/or an analysis for the whole sample was requested, the value of ‘group’ is ‘wholeGroup’.

depVar

Denotes the name of the dependent variable in the analysis.

modus

Denotes the mode of the analysis. For example, if a JK2 analysis without sampling weights was conducted, ‘modus’ takes the value ‘jk2.unweighted’. If an analysis without any replicates but with sampling weights was conducted, ‘modus’ takes the value ‘weighted’.

parameter

Denotes the parameter of the regression model to which the value in the ‘value’ column refers. For instance, the ‘parameter’ column takes, amongst others, the values ‘(Intercept)’ and ‘gendermale’ if ‘gender’ was an independent variable. See example 1 for further details.

coefficient

Denotes the coefficient to which the value in the ‘value’ column refers. Takes the values ‘est’ (estimate) and ‘se’ (standard error of the estimate).

value

The value of the parameter estimate in the corresponding group.

If groups were specified, further columns which are denoted by the group names are added to the data frame.
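
To show how a result in this long layout can be read, the sketch below builds a small data frame with the columns described above and reshapes it so that each parameter has its estimate and standard error side by side. The values are invented and are not output of repLmer.

```r
# Invented results in the documented long layout (not real output)
res <- data.frame(
  group       = "wholeGroup",
  depVar      = "score",
  modus       = "jk2.unweighted",
  parameter   = rep(c("(Intercept)", "mig"), each = 2),
  coefficient = rep(c("est", "se"), times = 2),
  value       = c(510.2, 4.1, -25.7, 6.3),
  stringsAsFactors = FALSE
)

# One row per parameter, one column per coefficient
wide <- reshape(res[, c("parameter", "coefficient", "value")],
                idvar = "parameter", timevar = "coefficient",
                direction = "wide")
wide
```

The reshaped data frame has the columns parameter, value.est, and value.se.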

Examples

### load example data (long format)
data(lsa)
### use only the first nest, use only reading
btRead <- subset(lsa, nest==1 & domain=="reading")

# \donttest{
### random intercept model with groups
mod1 <- repLmer(datL = btRead, ID = "idstud", wgt = "wgt", L1wgt="L1wgt", L2wgt="L2wgt",
        type = "jk2", PSU = "jkzone", repInd = "jkrep", imp = "imp",trend="year",
        group="country", dependent="score", formula.fixed = ~as.factor(sex)+mig,
        formula.random=~1, clusters="idclass")
#> Logical variable 'mig' will be transformed into numeric.
#> 
#> Trend group: '2010'
#> 1 analyse(s) overall according to: 'group.splits = 1'.
#> Assume unnested structure with 3 imputations.
#> 
#> `BIFIEsurvey::BIFIE.data.jack`(data = "datL", wgt = "wgt", jktype = "JK_TIMSS", 
#>     jkzone = "jkzone", jkrep = "jkrep", jkfac = NULL, fayfac = NULL, 
#>     cdata = FALSE)
#> MI data with 3 datasets || 92 replication weights with fayfac=1  || 3079 cases and 14 variables 
#>  
#> Imputation 1 | Group 1 |---------- 
#> Imputation 1 | Group 2 |---------- 
#> Imputation 1 | Group 3 |---------- 
#> Imputation 2 | Group 1 |---------- 
#> Imputation 2 | Group 2 |---------- 
#> Imputation 2 | Group 3 |---------- 
#> Imputation 3 | Group 1 |---------- 
#> Imputation 3 | Group 2 |---------- 
#> Imputation 3 | Group 3 |---------- 
#> 
#> 
#> Trend group: '2015'
#> 1 analyse(s) overall according to: 'group.splits = 1'.
#> Assume unnested structure with 3 imputations.
#> 
#> `BIFIEsurvey::BIFIE.data.jack`(data = "datL", wgt = "wgt", jktype = "JK_TIMSS", 
#>     jkzone = "jkzone", jkrep = "jkrep", jkfac = NULL, fayfac = NULL, 
#>     cdata = FALSE)
#> MI data with 3 datasets || 73 replication weights with fayfac=1  || 2928 cases and 14 variables 
#>  
#> Imputation 1 | Group 1 |-------- 
#> Imputation 1 | Group 2 |-------- 
#> Imputation 1 | Group 3 |-------- 
#> Imputation 2 | Group 1 |-------- 
#> Imputation 2 | Group 2 |-------- 
#> Imputation 2 | Group 3 |-------- 
#> Imputation 3 | Group 1 |-------- 
#> Imputation 3 | Group 2 |-------- 
#> Imputation 3 | Group 3 |-------- 
#> 
#> Note: No linking error was defined. Linking error will be defaulted to '0'.
res1 <- report(mod1)

### random slope without groups and without trend
mod2 <- repLmer(datL = subset(btRead, country=="countryA" & year== 2010),
        ID = "idstud", wgt = "wgt", L1wgt="L1wgt", L2wgt="L2wgt", type = "jk2",
        PSU = "jkzone", repInd = "jkrep", imp = "imp", dependent="score",
        formula.fixed = ~as.factor(sex)*mig, formula.random=~mig, clusters="idclass")
#> Logical variable 'mig' will be transformed into numeric.
#> 1 analyse(s) overall according to: 'group.splits = 0'.
#> Assume unnested structure with 3 imputations.
#> 
#> `BIFIEsurvey::BIFIE.data.jack`(data = "datL", wgt = "wgt", jktype = "JK_TIMSS", 
#>     jkzone = "jkzone", jkrep = "jkrep", jkfac = NULL, fayfac = NULL, 
#>     cdata = FALSE)
#> MI data with 3 datasets || 32 replication weights with fayfac=1  || 1034 cases and 14 variables 
#>  
#> Imputation 1 | Group 1 |--- 
#> Imputation 2 | Group 1 |--- 
#> Imputation 3 | Group 1 |--- 
#> 
res2 <- report(mod2)
# }