API Reference
Functions
OLSPlots.diagnostic_plots — Function
diagnostic_plots(model; which=[1,2,3,5], r_style=true)
Generate standard diagnostic plots for an ordinary least squares (OLS) regression model, styled to match R's default diagnostic plot presentation.
Arguments
- model: A fitted regression model from GLM.jl (e.g., created with lm())
- which: Vector of integers specifying which plots to show (default: [1,2,3,5], matching R's default)
  1. Residuals vs Fitted Values
  2. Normal Q-Q Plot
  3. Scale-Location Plot
  4. Cook's Distance Plot
  5. Residuals vs Leverage (with Cook's distance contours)
  6. Cook's Distance vs Leverage h/(1-h)
- r_style: Boolean; if true, uses R-like styling (default: true)
Returns
- A CairoMakie Figure object containing the diagnostic plots
Examples
```julia
using GLM, DataFrames, OLSPlots

# Create sample data
df = DataFrame(x1 = rand(100), x2 = rand(100), y = rand(100) .+ 2 .* rand(100))

# Fit an OLS model
ols_model = lm(@formula(y ~ x1 + x2), df)

# Generate default diagnostic plots (1,2,3,5) - same as R's default
fig = diagnostic_plots(ols_model)

# Show the first four plots (1,2,3,4) - without Residuals vs Leverage
fig = diagnostic_plots(ols_model, which=[1,2,3,4])

# Or generate all six plots
fig = diagnostic_plots(ols_model, which=1:6)
```
Implementation Details
OLSPlots.jl builds on the following packages:
- GLM.jl - For working with linear models
- CairoMakie.jl - For creating the visualizations
- Distributions.jl - For statistical distributions and quantiles
- Loess.jl - For smoothing curves in the diagnostic plots
Diagnostic Plot Types
The package can generate six different diagnostic plots:
1. Residuals vs Fitted Values: Shows whether the residuals follow a non-linear pattern, which would indicate a relationship that the model does not capture.
2. Normal Q-Q Plot: Plots the quantiles of the standardized residuals against theoretical normal quantiles to check whether the residuals are normally distributed.
3. Scale-Location Plot: Shows whether the spread of the residuals is constant across the range of fitted values; used to check the homoscedasticity assumption.
4. Cook's Distance Plot: Identifies influential observations that might have a large effect on the regression.
5. Residuals vs Leverage: Shows the relationship between standardized residuals and leverage, with Cook's distance contours, to help identify influential observations.
6. Cook's Distance vs Leverage h/(1-h): An alternative view of Cook's distance against leverage transformed by h/(1-h).
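The Cook's distance contours in the Residuals vs Leverage plot follow from the textbook identity D = (r² / p) · h / (1 - h), where r is the standardized residual, h the leverage, and p the number of model coefficients. The sketch below solves that identity for r at a fixed contour level. The helper name `cook_contour`, the leverage grid, and the levels 0.5 and 1 (R's defaults) are assumptions for illustration; the package may draw its contours differently.

```julia
# Standardized residual at which Cook's distance equals `level` for leverage `h`,
# given `p` model coefficients. Obtained by solving D = (r^2 / p) * h / (1 - h) for r.
# `cook_contour` is a hypothetical helper, not part of OLSPlots.jl.
cook_contour(h, level, p) = sqrt(level * p * (1 - h) / h)

p = 3                                  # number of coefficients (assumed)
hs = range(0.05, 0.6; length = 5)      # leverage grid (assumed)

# Upper branches of the contours; the lower branches are the negatives.
upper_half = cook_contour.(hs, 0.5, p)
upper_one  = cook_contour.(hs, 1.0, p)
```

Points lying outside a contour have a Cook's distance above that contour's level.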
Calculation Methods
The package calculates the following metrics:
- Leverage values: Diagonal elements of the hat matrix H = X(X'X)⁻¹X'
- Standardized residuals: Residuals divided by an estimate of their standard deviation
- Cook's distances: Measures of how much the fitted regression would change if a given observation were removed
These metrics are used to create the diagnostic plots that help assess model fit, detect outliers, and identify influential observations.
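As a minimal sketch, the three metrics can be computed from a design matrix with the standard library alone. It uses the textbook definitions (internally standardized residuals, and Cook's distance written in terms of leverage); whether OLSPlots.jl computes them exactly this way is an assumption. The data and all variable names are illustrative.

```julia
using LinearAlgebra, Random

# Synthetic data (illustrative only): intercept plus two predictors.
Random.seed!(1)
n = 50
X = hcat(ones(n), rand(n), rand(n))            # design matrix, n × p
y = X * [1.0, 2.0, -1.0] .+ 0.1 .* randn(n)
p = size(X, 2)

# OLS fit and raw residuals.
beta = X \ y
e = y .- X * beta

# Leverage values: diagonal of the hat matrix H = X(X'X)⁻¹X'.
H = X * inv(X' * X) * X'
h = diag(H)

# Residual variance estimate and standardized residuals.
sigma2 = sum(abs2, e) / (n - p)
r = e ./ sqrt.(sigma2 .* (1 .- h))

# Cook's distance for each observation.
D = (r .^ 2 ./ p) .* (h ./ (1 .- h))
```

Forming the full hat matrix is fine for small examples like this one; for large n, a QR-based computation of the diagonal avoids building an n × n matrix.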