An Epidemiological Guide to Interrupted Time Series Analysis

Author

Fan Xiong

Published

February 16, 2024

This is a Quarto book that provides an overview of how to conduct an interrupted time series analysis.

Prerequisites

Readers interested in conducting an interrupted time series analysis should be familiar with time series analysis. The key concepts are autocorrelation, differencing, moving averages, and forecasting. Readers who want to learn more about these concepts may find the Stat 510: Applied Time Series online notes from Pennsylvania State University useful. The tutorial example in this book relies on automated ARIMA model selection and the methods described in [1]. This book uses R and the following R packages.

R Packages

Core Time Series Packages

  • tsibble: The foundation for structured time series data in the “Tidyverse” ecosystem. Provides a consistent data format and tools for wrangling time-based data.

  • feasts: Time series feature engineering. Extracts statistical characteristics (trend, seasonality, volatility, etc.) for analysis and modeling.

  • fable: Forecasting powerhouse. Offers an array of forecasting models compatible with ‘tsibble’ objects, making prediction tasks streamlined and powerful.

  • astsa: Functions and datasets focused on the practice of time series analysis. Good for specific methods and examples from a traditional statistics standpoint.

  • forecast: Established platform for classic time series forecasting (ARIMA, exponential smoothing). Useful for benchmarking or for its specialized auto-selection features.

  • zoo: Handles irregularly spaced time series data along with regular ones. Useful if your data has gaps or uneven intervals.

General Data Science

  • tidyverse: Core “meta-package” containing dplyr, ggplot2, and other foundational data science tools for a cohesive workflow.

  • tidymodels: Unified framework for a wide range of modeling techniques. Offers some time series support (for example, time-based resampling in rsample) and integrates well with the packages above. Includes yardstick for measuring model performance.

Supportive Packages

  • lmtest: Primarily for diagnosing potential issues in linear regression, but sometimes relevant to time series regression models.

  • car: Focuses on regression diagnostics and visualization; has use cases when exploring specific aspects of time series models.

  • stats: Base R statistical functions. Necessary building blocks, often implicitly used or incorporated within other packages.

  • extraDistr: Offers access to additional statistical distributions for specialized modeling situations.

  • ggtext: Expands text formatting in ggplot2, allowing for rich labeling and annotations suitable for complex time series visuals.

  • patchwork & gridExtra: Simplify arranging multiple plots for creating dashboards or combining visual analysis results.
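As a convenience, the packages listed above can be checked and attached in one step. The following is a minimal sketch, not the author's own setup script: the object names `pkgs`, `is_installed`, and `missing_pkgs` are illustrative, and the commented-out installation line assumes that every package is available from a configured CRAN mirror.

```r
# Packages named in this chapter; `stats` ships with base R.
pkgs <- c(
  "tsibble", "feasts", "fable", "astsa", "forecast", "zoo",
  "tidyverse", "tidymodels",
  "lmtest", "car", "stats", "extraDistr", "ggtext", "patchwork", "gridExtra"
)

# Check which packages are installed, without attaching anything yet.
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install (assumption: a CRAN mirror is configured):
  # install.packages(missing_pkgs)
}

# Attach the packages that are available, hiding startup messages.
for (p in pkgs[is_installed]) {
  suppressPackageStartupMessages(library(p, character.only = TRUE))
}
```

Checking with `requireNamespace()` before calling `library()` lets the script report every missing package at once instead of stopping at the first failure. Attaching all of the packages together can produce function-name conflicts between them, so qualifying calls with `package::function()` may be safer in longer scripts.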

Acknowledgements

The author would like to acknowledge several individuals for their contributions, feedback, and mentorship.

Chris Delcher, University of Kentucky, for providing the model, interpretation, and analysis for the tutorial example in this book.

Crystal Yu, University of Washington, for providing assistance with the data visualization and LaTeX math equations used in this book.

The author also acknowledges the articles listed in the References section, which were consulted to learn more about interrupted time series analysis.