LiMMCov

Welcome to the Linear Mixed Model research tool

This tool guides you through the process of selecting residual covariance structures in linear mixed models, using insights from time series analysis to identify complex covariance structures. LiMMCov offers visualisations to assess the residuals of fixed effects models and suggests a suitable structure based on the data. The application is built with the Shiny package, and its user-friendly interface guides you through the workflow. A report with the results can be downloaded.

This app was developed by the Support for Quantitative and Qualitative Research (SQUARE) to provide the research community with complimentary support in statistics. The developers are part of the Biostatistics and Medical Informatics research group (BISI) at the Vrije Universiteit Brussel (VUB).

Terms of use

LiMMCov is not designed to be exhaustive and is therefore appropriate only for the use cases described in the app.

Uploaded data and outputs from analyses are not kept on our servers, so download the reports and any modified data you want to keep. The research tool is supplied WITHOUT ANY WARRANTY; you must therefore refrain from uploading any sensitive information. If you submit any data to this application, you are solely responsible for its confidentiality, availability, security, loss, abuse, and misappropriation.

Feedback

We would love to hear your thoughts, suggestions, concerns or problems you encountered while using LiMMCov so that we can improve. To do this, kindly evaluate the web-application via this link. For other questions and comments please email Perseverence.Savieri@vub.be

Contribute your dataset

Have a real-world dataset you'd like to share with the LiMMCov community? We welcome user-submitted datasets that are publicly available (e.g., openly licensed), as they help us improve LiMMCov by covering a wider range of study designs and boundary cases. If you are interested in contributing your dataset, please contact the development team via email so we can discuss adding it to our online repository.

Citation

If you use LiMMCov for your research, teaching, or presentations, please cite the following publication:

Savieri, P., Stas, L., & Barbé, K. (2025). LiMMCov: An interactive research tool for efficiently selecting covariance structures in linear mixed models using insights from time series analysis. PLoS One, 20(6), e0325834.
https://doi.org/10.1371/journal.pone.0325834

BibTeX citation

@article{savieri2025limmcov,
  title={LiMMCov: An interactive research tool for efficiently selecting covariance structures in linear mixed models using insights from time series analysis},
  author={Savieri, Perseverence and Stas, Lara and Barb{\'e}, Kurt},
  journal={PLoS One},
  volume={20},
  number={6},
  pages={e0325834},
  year={2025},
  doi={10.1371/journal.pone.0325834},
  publisher={Public Library of Science San Francisco, CA USA}
}

What is a General Linear Model?

A General Linear Model (GLM) explains the relationship between a continuous dependent variable and one or more independent variables through a linear equation. GLMs assume a linear relationship between the dependent variable and predictors. If interactions are relevant, you can include them to assess whether the effect of one predictor depends on another.

Why use GLMs for covariance analysis?

Fitting a GLM yields residuals whose correlation pattern helps uncover the covariance structure in longitudinal data. By specifying the outcome variable and mean structure (fixed effects), you remove the systematic trend, so the remaining residual pattern reflects the dependence between repeated measurements.
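A minimal sketch of this idea, assuming a mean structure that contains only a linear time effect. The toy data and the helper name `fit_line` are made up for illustration and are not part of LiMMCov.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Toy long-format data (made up): (subject, time, outcome)
rows = [
    ("s1", 0, 5.1), ("s1", 1, 5.9), ("s1", 2, 7.2), ("s1", 3, 7.8),
    ("s2", 0, 3.9), ("s2", 1, 5.2), ("s2", 2, 5.8), ("s2", 3, 7.1),
]
times = [t for _, t, _ in rows]
outcomes = [y for _, _, y in rows]
b0, b1 = fit_line(times, outcomes)

# Residuals grouped by subject, in time order; their within-subject
# pattern is what the covariance analysis inspects.
residuals = {}
for subject, t, y in rows:
    residuals.setdefault(subject, []).append(y - (b0 + b1 * t))
```

The residuals are kept per subject because the covariance structure describes dependence within a subject, not across subjects.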

Understanding covariance structures

In longitudinal data analysis, covariance structures are used to account for correlations between repeated measurements. The choice of covariance structure directly affects the interpretation of residuals, providing insights into the dependency between observations across time points or spatial dimensions. Residual plots, alongside the partial autocorrelation function (PACF), allow users to visualise the underlying covariance structure that adequately captures these dependencies.

In this tab, we explore how different covariance structures influence the pattern of residuals and their autocorrelations. For instance, a compound symmetry (CS) structure suggests equal correlation across all observations, leading to a flat residual pattern. Autoregressive structures like AR1 or AR2, on the other hand, reflect time-dependent correlations, with residuals decaying or oscillating based on the lag between observations. The PACF helps further clarify these patterns, highlighting how residuals behave at various time lags, which can guide the selection of a suitable covariance structure for your data.
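The patterns described above can be reproduced from the theoretical correlations. The sketch below assumes stationary processes and uses illustrative function names; it returns the correlation at lags 1 to `max_lag` for CS, AR(1) and AR(2).

```python
def cs_correlation(rho, max_lag):
    """Compound symmetry: the same correlation at every lag."""
    return [rho] * max_lag


def ar1_correlation(phi, max_lag):
    """AR(1): the correlation at lag k is phi ** k."""
    return [phi ** k for k in range(1, max_lag + 1)]


def ar2_correlation(phi1, phi2, max_lag):
    """AR(2) via the Yule-Walker recursion r_k = phi1*r_{k-1} + phi2*r_{k-2}."""
    r = [1.0, phi1 / (1.0 - phi2)]
    for _ in range(2, max_lag + 1):
        r.append(phi1 * r[-1] + phi2 * r[-2])
    return r[1:max_lag + 1]


flat = cs_correlation(0.5, 4)               # constant line
decaying = ar1_correlation(0.5, 4)          # geometric decay
oscillating = ar2_correlation(0.5, -0.6, 4)  # changes sign across lags
```

With a negative second coefficient, as chosen here, the AR(2) correlations alternate around zero, which is the oscillating pattern mentioned in the text.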


Ideal residual plots of common correlation structures

Unstructured: No specific pattern. The correlation values vary widely across lags, and there is no clear trend. CAUTION: not of practical use because it is hard to estimate.

Compound symmetry (CS): A flat line at the constant correlation (rho). In a compound symmetry structure, the correlation is the same for all pairs of measurements, regardless of the lag.

AR(1): A declining line. As the lag increases, the correlation decreases exponentially, so the drop is largest over the first lags and levels off as the correlation approaches zero.

AR(2): Oscillations in the average autocorrelation reflect the interplay between lag terms, with values alternating around zero, indicating no systematic bias.

Ideal PACF plots of common correlation structures

Unstructured: No clear pattern or systematic decay in the partial autocorrelations. Significant spikes may appear at various lags, but without a consistent or interpretable pattern. The partial autocorrelations may fluctuate randomly around zero, with some lags potentially exceeding the significance bounds.

Compound symmetry (CS): A single significant spike at lag 1, indicating a constant correlation across all time points, followed by a rapid decay towards 0 for the subsequent lags.

AR(1): A significant spike at lag 1, with subsequent lags showing values close to zero, reflecting the first-order decay of correlation over time.

For an AR(p) process, the PACF will have significant spikes up to lag p, after which the values drop to zero, highlighting the true order of the autoregressive model. Here, there are 2 significant spikes indicating an AR(2) structure.
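The cut-off property can be checked numerically. The sketch below uses the Durbin-Levinson recursion to turn theoretical autocorrelations into partial autocorrelations; the function name and the AR(2) coefficients are illustrative assumptions, not values taken from the app.

```python
def pacf_from_acf(acf):
    """Durbin-Levinson recursion.

    acf[k - 1] is the autocorrelation at lag k; the result holds the
    partial autocorrelations at lags 1..len(acf).
    """
    r = [1.0] + list(acf)
    pacf, phi_prev = [], []
    for k in range(1, len(r)):
        if k == 1:
            phi_kk = r[1]
            phi = [phi_kk]
        else:
            num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


# Theoretical autocorrelations of an AR(2) process (illustrative coefficients)
phi1, phi2 = 0.5, -0.6
acf = [phi1 / (1 - phi2)]
acf.append(phi1 * acf[0] + phi2)
acf.append(phi1 * acf[1] + phi2 * acf[0])

pacf = pacf_from_acf(acf)
# The lag-2 value equals phi2 and the lag-3 value is zero up to rounding.
```

In estimated PACFs the values beyond lag p are only approximately zero, which is why the plots are read against significance bounds.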

What are Linear Mixed Models?

Linear Mixed Models (LMMs) are an extension of linear regression that can model longitudinal data by accounting for the correlation in repeated measurements.

The model is often represented as: \(y_i = X_i β + Z_i b_i + ε_i\), where \(b_i \sim N(0, D)\), \(ε_i \sim N(0, Σ_i)\)

Here, \(X_i\) and \(Z_i\) are the fixed and random design matrices, respectively. The vector \(β\) represents the fixed effects, which describe the mean response in the population, while \(b_i\) represents the random effects, capturing individual-specific deviations. The term \(ε_i\) refers to the random error, representing unexplained variability in the data. \(D\) is the covariance matrix for the random effects, while \(Σ_i\) specifies the covariance structure for the residuals.

The LMM accommodates both fixed and random effects to model the within-subject and between-subject variability. The model can be formulated hierarchically or marginally to capture the correlation. In the hierarchical formulation, given the random effects, the measurements of each subject are independent (conditional independence assumption), meaning the more random effects we include, the more flexibly we capture the correlations. In the marginal formulation, the measurements of each subject are correlated and this correlation is estimated by the marginal covariance matrix.
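As a worked illustration of how the two formulations connect, assume a random intercept only, \(b_i \sim N(0, \sigma_b^2)\), and independent errors with common variance \(\sigma^2\). Both symbols are introduced here for illustration only. For two different measurements \(j \neq k\) of subject \(i\):

```latex
\operatorname{Cov}(y_{ij}, y_{ik}) = \sigma_b^2, \qquad
\operatorname{Var}(y_{ij}) = \sigma_b^2 + \sigma^2, \qquad
\operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}
```

The correlation does not depend on the lag between \(j\) and \(k\), so a hierarchical random-intercept model implies a compound symmetry structure in the marginal formulation.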

This app focuses on the marginal formulation.

Variance Components

In the marginal formulation, the covariance matrix of the outcome variable \(y_i\) is given by: \(V_i = Z_i D Z_i' + Σ_i\)

We need an appropriate choice for \(V_i\) in order to appropriately describe the correlations between the repeated measurements.
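The following sketch assembles \(V_i\) for one subject with three measurements, assuming a random intercept and an AR(1) residual structure. The numerical values and function names are illustrative.

```python
def ar1_covariance(sigma2, phi, n):
    """Residual covariance Sigma with entries sigma2 * phi ** |j - k|."""
    return [[sigma2 * phi ** abs(j - k) for k in range(n)] for j in range(n)]


def marginal_covariance(Z, D, Sigma):
    """Return V = Z D Z' + Sigma using plain nested lists."""
    n, q = len(Z), len(D)
    ZD = [[sum(Z[i][a] * D[a][b] for a in range(q)) for b in range(q)]
          for i in range(n)]
    return [[sum(ZD[i][b] * Z[j][b] for b in range(q)) + Sigma[i][j]
             for j in range(n)] for i in range(n)]


Z = [[1.0], [1.0], [1.0]]           # random intercept: a column of ones
D = [[2.0]]                         # variance of the random intercept
Sigma = ar1_covariance(1.0, 0.5, 3)  # AR(1) residual covariance
V = marginal_covariance(Z, D, Sigma)
```

The random intercept adds the same constant to every entry of \(V_i\), while the AR(1) part makes the off-diagonal entries shrink with the lag.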

Application

In practice, valid and efficient inferences for the fixed effects require an appropriately specified marginal covariance structure. This ensures that the correlation between repeated measurements is accounted for, reducing bias in the estimation of fixed effects.

This app allows users to focus on modeling the marginal covariance structure, ensuring robust model fitting. The choice of covariance structure influences both the variance component estimates and the accuracy of hypothesis testing for fixed effects.

This dynamic report includes the following sections:

GLM Analysis

Summary and plots from the General Linear Model (GLM) analysis.

Covariance Analysis

Details of the covariance structures used and their impact on the model.

LMM Analysis

Results and plots from the Linear Mixed Model (LMM) analysis.

Documentation

Steps on how to use the LiMMCov app

Step 1: Load data

Begin by selecting an example dataset or uploading your data file. You can do this in the Data Input sidebar tab. Choose the correct file extension so that the data load properly, and make sure your dataset is in long format (i.e., one row per subject per time point). Wide-format data must be reshaped prior to use.
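If your data are in wide format, a reshaping step along these lines can be run before uploading. This is a sketch only: the column names (`id`, `week0`, and so on) and the function name are hypothetical, and the time index is simply the position of the column in the list.

```python
def wide_to_long(wide_rows, id_key, time_columns):
    """Turn one row per subject into one row per subject per time point."""
    long_rows = []
    for row in wide_rows:
        for time, column in enumerate(time_columns):
            long_rows.append(
                {"id": row[id_key], "time": time, "outcome": row[column]}
            )
    return long_rows


wide = [
    {"id": "s1", "week0": 5.1, "week1": 5.9, "week2": 7.2},
    {"id": "s2", "week0": 3.9, "week1": 5.2, "week2": 5.8},
]
long_data = wide_to_long(wide, "id", ["week0", "week1", "week2"])
```

Each subject now contributes three rows, one per measurement occasion.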

If you need to change the data type of a variable, select the variable in question and then choose the New data type. Apply changes by clicking the Change data type button.

To verify that the data has been loaded correctly, check the View Data tab, which displays a preview (10 rows by default). You can adjust this setting through the Show entries dropdown. A summary of your dataset, including descriptive statistics and variable distributions, can be found in the Data Summary tab.

Note on Missing Data: LiMMCov requires complete datasets, so we recommend handling missing values before uploading. While multiple imputation is one possible strategy, other approaches (e.g., full information maximum likelihood, inverse probability weighting) may be more appropriate depending on factors like sample size or missing not at random (MNAR) data. Because LiMMCov focuses on choosing the covariance structure, data preprocessing and imputation lie outside the app’s scope. We encourage users to consult the references below for best practices in addressing missingness, ensuring the validity of subsequent analyses:

Molenberghs G, Fitzmaurice G, Kenward MG, Tsiatis A, Verbeke G. Handbook of missing data methodology. CRC Press; 2014.
Nooraee N, Molenberghs G, Ormel J, Van den Heuvel ER. Strategies for handling missing data in longitudinal studies with questionnaires. J Stat Comput Simul. 2018;88(17):3415–36. https://doi.org/10.1080/00949655.2018.1520854
Ji L, Chow SM, Schermerhorn AC, Jacobson NC, Cummings EM. Handling Missing Data in the Modeling of Intensive Longitudinal Data. Struct Equ Model. 2018;25(5):715–36. https://doi.org/10.1080/10705511.2017.1417046
Heymans MW, Twisk JWR. Handling missing data in clinical research. J Clin Epidemiol. 2022;151:185–8. https://doi.org/10.1016/j.jclinepi.2022.08.01
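Before uploading, a quick completeness check such as the one below can confirm that no missing values remain. Which values count as missing (here `None`, empty strings and NaN) is an assumption that depends on how your file encodes them.

```python
import math


def missing_counts(rows):
    """Count missing values (None, empty string, NaN) per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            is_missing = (
                value is None
                or value == ""
                or (isinstance(value, float) and math.isnan(value))
            )
            counts[column] = counts.get(column, 0) + int(is_missing)
    return counts


rows = [
    {"id": "s1", "time": 0, "outcome": 5.1},
    {"id": "s1", "time": 1, "outcome": None},
    {"id": "s2", "time": 0, "outcome": float("nan")},
]
counts = missing_counts(rows)
is_complete = all(n == 0 for n in counts.values())
```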

Step 2: Fit linear model (GLM)

Next, specify the variables for the model. Start by selecting the subject/id, outcome variable, and the desired mean structure in the General Linear Model (GLM) tab.

After specifying the model, click the Run GLM button to fit the model. You can view the model results under the Model Summary section.

Step 3: Analyse the covariance

Use the Covariance Analysis tab to explore the underlying covariance structure of the residuals. It provides insights into the correlation structure of residuals from the GLM. Residual plots help identify specific patterns related to different correlation structures, while PACF plots provide further insight.
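One way to summarise residual correlation by lag is sketched below. How LiMMCov computes the plotted values is not described here, so the pooling of residual pairs across subjects is an assumption, as are the function names and the toy residuals.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lagged_correlations(residuals_by_subject, max_lag):
    """Correlation between residuals k occasions apart, pooled over subjects."""
    result = []
    for k in range(1, max_lag + 1):
        earlier, later = [], []
        for res in residuals_by_subject.values():
            earlier.extend(res[:-k])
            later.extend(res[k:])
        result.append(pearson(earlier, later))
    return result


# Toy residuals per subject, in time order (made up)
residuals = {
    "s1": [0.4, 0.1, -0.2, -0.3],
    "s2": [-0.5, -0.1, 0.2, 0.4],
}
by_lag = lagged_correlations(residuals, 2)
```

Pairs are only formed within a subject, so residuals from different subjects are never correlated with each other.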

Interpreting Residual Plots and PACF:

Use this quick reference guide to determine the appropriate covariance structure based on your residual plots and PACF patterns:

Step 1: Examine the residual plot. If the correlation is constant (flat line across all lags), use Compound Symmetry (CS).
Step 2: If residuals oscillate, use AR(2).
Step 3: If there is a gradual decay in correlation, check the PACF.

PACF: Single spike at lag 1 → Use AR(1).
PACF: Spikes at lags 1 and 2 → Use AR(2).
PACF: Multiple spikes up to lag p → Use AR(p).
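The quick reference guide can be written down as a small decision function. This is a hypothetical encoding, not the app's internal logic: `bound` stands in for the significance bound read from the plots, "flat" is taken to mean that all lag correlations lie within `bound` of each other, and "oscillate" is taken to mean a change of sign.

```python
def suggest_structure(lag_correlations, pacf, bound):
    """Map residual correlations and PACF values to a suggested structure."""
    # Step 1: flat line across all lags -> compound symmetry
    if max(lag_correlations) - min(lag_correlations) <= bound:
        return "CS"
    # Step 2: correlations that change sign across lags -> AR(2)
    signs = [value > 0 for value in lag_correlations]
    if any(a != b for a, b in zip(signs, signs[1:])):
        return "AR(2)"
    # Step 3: gradual decay -> count the leading PACF spikes
    order = 0
    for value in pacf:
        if abs(value) <= bound:
            break
        order += 1
    return f"AR({order})" if order else "unclear"


suggestion = suggest_structure([0.5, 0.25, 0.125], [0.5, 0.0, 0.0], 0.1)
```

A rule like this is only a starting point; the suggested structure should still be compared with alternatives when fitting the mixed model.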

The following covariance structures are available in the nlme package, each with different applications:

corAR1 (Autoregressive order 1):

Models a correlation where observations closer in time/space are more correlated.
The correlation decays exponentially as time/distance increases.
In residual plots, covariances decay steadily towards zero with increasing lags.
Use for equally spaced measurements.

corARMA with q = 0 (Autoregressive order p):

Allows for complex autoregressive patterns beyond AR1.
Graphical patterns depend on specific AR orders.
Suitable for equally spaced time points.

corCAR1 (Continuous AR1):

Similar to AR1, but for a continuous time covariate.
Correlation decays exponentially with increasing time/distance.
Useful for unequally spaced time points.
Ideal for continuous-time and unbalanced data.
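For a continuous-time AR(1) structure, the correlation between two measurements is the parameter raised to the absolute time difference. The sketch below builds the implied correlation matrix for unequally spaced times; the function name and the example times are illustrative.

```python
def car1_correlation(times, phi):
    """Correlation matrix with entries phi ** |s - t| for observation times."""
    return [[phi ** abs(s - t) for t in times] for s in times]


# Measurements at times 0, 1 and 3: the gap between the last two is twice as long
R = car1_correlation([0, 1, 3], 0.5)
```

Because the exponent is the actual time difference rather than the number of occasions apart, the longer gap between times 1 and 3 yields a smaller correlation than the gap between times 0 and 1.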

corCompSymm (Compound Symmetry):

Assumes constant correlation across all observation pairs.
Equal variance parameters (homogeneous).
Covariances at all lags in residual plots will be similar.
For equally spaced time points and exchangeable structures.

Spatial Correlation (corExp, corGaus, etc.):

Models spatial correlation based on distance between observations.
Useful for irregularly spaced observations in 2D/3D space.
Non-linear covariance patterns based on spatial locations.
Ideal for continuous-time and unbalanced data.
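As an illustration of a distance-based structure, the sketch below computes an exponential spatial correlation, exp(-d / range), from Euclidean distances between locations. It assumes no nugget effect; the coordinates, the range value and the function name are made up.

```python
import math


def exponential_correlation(points, range_):
    """Correlation exp(-d / range_) with d the Euclidean distance."""
    return [[math.exp(-math.dist(p, q) / range_) for q in points]
            for p in points]


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
R = exponential_correlation(points, 5.0)
```

Other spatial structures differ only in the function applied to the distance.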

corSymm (General Correlation):

Unstructured correlation matrix.
No pattern assumptions.
No specific graphical pattern.
Useful for complex, irregular correlation structures but requires more parameters.
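The cost of each structure can be compared by counting its correlation parameters. The sketch below is a simple tally under the assumption that spatial structures are left out and that `order` is the AR order p; the function name is illustrative.

```python
def correlation_parameter_count(structure, n_times, order=1):
    """Number of correlation parameters for n_times measurement occasions."""
    if structure in ("corCompSymm", "corAR1", "corCAR1"):
        return 1
    if structure == "corARMA":  # autoregressive part only (q = 0)
        return order
    if structure == "corSymm":  # one parameter per pair of occasions
        return n_times * (n_times - 1) // 2
    raise ValueError(f"unknown structure: {structure}")


counts = {
    name: correlation_parameter_count(name, n_times=6, order=2)
    for name in ("corCompSymm", "corAR1", "corARMA", "corSymm")
}
```

With six occasions the unstructured matrix already needs 15 correlation parameters, which is why it is hard to estimate in practice.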

Step 4: Fit linear mixed model (LMM)

To fit a linear mixed model, select the subject/id, outcome, time variables, and fixed effects. If your model includes interaction terms, define them using the Select interaction term(s) field.

You can select the covariance structure identified in Step 3 from the dropdown. For higher AR terms, use the AR(p) option and specify the appropriate order.

Fit the model by clicking the Run LMM button. The results will appear under the Model Summary section.

To compare different models, refer to the AIC table. The model with the lowest AIC is generally considered the best-fitting model. If convergence issues occur, they will be flagged in the AIC column.
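The comparison in the AIC table follows the usual definition, AIC = 2k - 2 logLik, where k is the number of estimated parameters. The log-likelihoods and parameter counts below are made-up numbers used only to show the calculation.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 * logLik."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical fits: structure -> (log-likelihood, number of parameters)
fits = {
    "CS": (-512.4, 6),
    "AR(1)": (-498.7, 6),
    "AR(2)": (-498.1, 7),
}
table = {name: aic(loglik, k) for name, (loglik, k) in fits.items()}
best = min(table, key=table.get)
```

In this made-up example the AR(2) fit has a slightly higher log-likelihood than AR(1), but the extra parameter outweighs the gain, so AR(1) has the lowest AIC.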

Step 5: Generate a report

After your analysis, you can generate a report by selecting the desired format (HTML, PDF, or Word). This report includes a summary of the linear model, covariance analysis, and the best-fitting linear mixed model, along with plots.

Note

At any time, you can refresh the session and start over by clicking the Power button.

Developers

Perseverence Savieri is a doctoral researcher in the Biostatistics and Medical Informatics research group (BISI) at the Vrije Universiteit Brussel medical campus Jette. He is also a principal statistical consultant for the humanities and social sciences at campus Etterbeek through the Support for Quantitative and Qualitative Research (SQUARE) core facility. Here, he offers statistical and methodological quantitative support in the form of consultations, statistical coaching, data analyses and workshops.