API Reference

Functions

OLSPlots.diagnostic_plots - Function
diagnostic_plots(model; which=[1,2,3,5], r_style=true)

Generate standard diagnostic plots for an ordinary least squares (OLS) regression model, styled to match R's default diagnostic plot presentation.

Arguments

  • model: A fitted regression model from GLM.jl (e.g., created with lm())
  • which: Vector of integers specifying which plots to show (default: [1,2,3,5], matching R's default)
    1. Residuals vs Fitted Values
    2. Normal Q-Q Plot
    3. Scale-Location Plot
    4. Cook's Distance Plot
    5. Residuals vs Leverage (with Cook's distance contours)
    6. Cook's Distance vs Leverage h/(1-h)
  • r_style: Boolean; if true, applies R-like styling (default: true)

Returns

  • A CairoMakie Figure object containing the diagnostic plots

Examples

```julia
using GLM, DataFrames, OLSPlots

# Create sample data
df = DataFrame(x1 = rand(100), x2 = rand(100), y = rand(100) .+ 2 .* rand(100))

# Fit an OLS model
ols_model = lm(@formula(y ~ x1 + x2), df)

# Generate default diagnostic plots (1,2,3,5) - same as R's default
fig = diagnostic_plots(ols_model)

# Show the first four plots (1,2,3,4) - without Residuals vs Leverage
fig = diagnostic_plots(ols_model, which=[1,2,3,4])

# Or generate all six plots
fig = diagnostic_plots(ols_model, which=1:6)
```


Implementation Details

OLSPlots.jl builds on the following packages:

  • GLM.jl - For working with linear models
  • CairoMakie.jl - For creating the visualizations
  • Distributions.jl - For statistical distributions and quantiles
  • Loess.jl - For smoothing curves in the diagnostic plots

Diagnostic Plot Types

The package can generate six different diagnostic plots:

  1. Residuals vs Fitted Values: Shows if residuals have non-linear patterns, which would indicate a non-linear relationship that is not captured by the model.

  2. Normal Q-Q Plot: Plots the quantiles of the standardized residuals against the theoretical quantiles of a normal distribution to check whether the residuals are normally distributed.

  3. Scale-Location Plot: Plots the square root of the absolute standardized residuals against the fitted values to show whether the residual spread is constant, which checks the homoscedasticity assumption.

  4. Cook's Distance Plot: Identifies influential observations that might have a large effect on the regression.

  5. Residuals vs Leverage: Shows the relationship between standardized residuals and leverage, with Cook's distance contours, to help identify influential observations.

  6. Cook's Distance vs Leverage h/(1-h): An alternative view of Cook's distance against leverage transformed by h/(1-h).
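The Cook's distance contours in plot 5 and the x-axis of plot 6 both follow from the standard identity D = (r²/p) · h/(1 - h), where r is the standardized residual, h the leverage, and p the number of estimated coefficients. Solving for r gives the curve on which every point has the same Cook's distance. The sketch below illustrates that relation only; it is not taken from the package's source. The name `cooks_contour` is hypothetical, and the levels 0.5 and 1.0 are assumed because they are the defaults in R's `plot.lm`.

```julia
# Hypothetical helper (not part of OLSPlots.jl): the standardized residual at
# which an observation with leverage `h` reaches Cook's distance `level`,
# for a model with `p` estimated coefficients.
cooks_contour(level, p, h) = sqrt(level * p * (1 - h) / h)

p = 2                                  # e.g. intercept + one predictor
hs = range(0.05, 0.6; length = 12)     # leverage grid for the x-axis

# Upper branch of each contour; the lower branch is its negation.
contours = Dict(level => [cooks_contour(level, p, h) for h in hs]
                for level in (0.5, 1.0))

# Leverage transform used on the x-axis of plot 6
leverage_odds = [h / (1 - h) for h in hs]
```

Points lying outside a contour (larger |r| at the same leverage) have a Cook's distance above that contour's level, which is why high-leverage observations need only a modest residual to be flagged.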

Calculation Methods

The package calculates the following metrics:

  • Leverage values: Diagonal elements of the hat matrix H = X(X'X)⁻¹X'
  • Standardized residuals: Residuals divided by their standard deviation
  • Cook's distances: Measures of how much the regression would change if an observation were removed

These metrics are used to create the diagnostic plots that help assess model fit, detect outliers, and identify influential observations.
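As a rough illustration of how these three metrics relate, the sketch below computes them directly from a design matrix using only the standard library. It is not the package's implementation: the function name `influence_measures` is hypothetical, and it assumes the residuals are standardized by σ̂·√(1 - h) (internal studentization, as R's `rstandard` does), which the text above does not state explicitly. The hat matrix is formed explicitly for readability; for large or ill-conditioned problems a QR-based approach would be preferable.

```julia
using LinearAlgebra

# Hypothetical helper, not part of OLSPlots.jl's documented API.
# X is the n×p design matrix (including the intercept column), y the response.
function influence_measures(X::AbstractMatrix, y::AbstractVector)
    n, p = size(X)
    H = X * inv(X' * X) * X'            # hat matrix H = X(X'X)⁻¹X'
    h = diag(H)                         # leverage values
    e = y - H * y                       # raw residuals
    s2 = sum(abs2, e) / (n - p)         # residual variance estimate
    r = e ./ sqrt.(s2 .* (1 .- h))      # standardized residuals (assumed form)
    d = (r .^ 2 ./ p) .* h ./ (1 .- h)  # Cook's distances
    return (leverage = h, std_residuals = r, cooks_distance = d)
end

# Small example: intercept plus one predictor, with one far-out x value
X = [ones(5) [1.0, 2.0, 3.0, 4.0, 10.0]]
y = [1.1, 1.9, 3.2, 3.9, 12.0]
m = influence_measures(X, y)
```

In this example the fifth observation, whose predictor value lies far from the others, receives the largest leverage, and the leverages sum to the number of coefficients.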