Safe predictions from a multiple linear model object

# S3 method for mlm
safe_predict(object, new_data, type = c("response"), ...)

Arguments

object

An mlm object returned from a call to stats::lm().

new_data

Required. A data frame or matrix containing the necessary predictors.

type

What kind of predictions to return. Options are:

  • "response" (default): Standard predictions from multiple regression.

...

Unused. safe_predict() uses the ellipsis package to check that all arguments in ... are evaluated. The idea is to prevent silent errors when arguments are misspelled. This feature is experimental and feedback is welcome.

Value

A tibble::tibble() with one row for each row of new_data. Predictions for observations with missing data will be NA. The returned tibble has different columns depending on type:

  • "response":

    • univariate outcome: .pred (numeric)

    • multivariate outcomes: .pred_{outcome name} (numeric) for each outcome

  • "class": .pred_class (factor)

  • "prob": .pred_{level} columns (numerics between 0 and 1)

  • "link": .pred (numeric)

  • "conf_int": .pred, .pred_lower, .pred_upper (all numeric)

  • "pred_int": .pred, .pred_lower, .pred_upper (all numeric)

If you request standard errors with std_error = TRUE, the tibble has an additional column .std_error.

For interval predictions, the tibble has additional attributes level and interval. The level is the same as the level argument and is between 0 and 1. interval is either "confidence" or "prediction". Some models may also set a method attribute to detail the method used to calculate the intervals.

Estimating uncertainty

stats::predict.mlm() provides neither confidence nor prediction intervals, although there is no theoretical issue with calculating these.

At some point in the future we may implement these intervals within safepredict. If you are interested in this, you can move intervals for mlm objects up the priority list by opening an issue on GitHub.
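In the meantime, one possible workaround is sketched below. It is not part of safepredict and is not how the package computes anything. It relies on the fact that an mlm fit shares one design matrix across outcomes, so refitting each outcome as its own univariate lm() reproduces the same point predictions and lets stats::predict.lm() supply per-outcome intervals. The names outcomes, predictors, and intervals are illustrative only, and the intervals are marginal (one outcome at a time), not joint regions across outcomes.

```r
# Sketch: per-outcome confidence intervals for a multivariate linear model,
# obtained by refitting each outcome as a separate univariate lm().
fit <- lm(cbind(hp, mpg) ~ ., data = mtcars)

outcomes   <- colnames(coef(fit))              # outcome names taken from the mlm fit
predictors <- setdiff(names(mtcars), outcomes) # every remaining column is a predictor

intervals <- lapply(outcomes, function(y) {
  # Same predictors as the mlm fit, but a single outcome on the left-hand side
  uni_fit <- lm(reformulate(predictors, response = y), data = mtcars)
  # Returns a matrix with columns fit, lwr, upr
  predict(uni_fit, newdata = mtcars, interval = "confidence", level = 0.95)
})
names(intervals) <- outcomes

# Sanity check: the univariate point predictions agree with the mlm predictions
stopifnot(isTRUE(all.equal(
  unname(intervals$mpg[, "fit"]),
  unname(predict(fit, newdata = mtcars)[, "mpg"])
)))

head(intervals$mpg)
```

Passing interval = "prediction" instead gives per-outcome prediction intervals in the same layout.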

Examples

fit <- lm(cbind(hp, mpg) ~ ., mtcars)
safe_predict(fit, mtcars)
#> # A tibble: 32 x 2
#>    .pred_.pred_hp .pred_.pred_mpg
#>             <dbl>           <dbl>
#>  1          147.             21.8
#>  2          140.             21.5
#>  3           71.9            26.7
#>  4          127.             20.9
#>  5          186.             17.5
#>  6          107.             20.3
#>  7          227.             14.8
#>  8           81.9            22.1
#>  9           68.2            25.0
#> 10          148.             18.2
#> # ... with 22 more rows

mt2 <- mtcars
diag(mt2) <- NA
safe_predict(fit, mt2)
#> # A tibble: 32 x 2
#>    .pred_.pred_hp .pred_.pred_mpg
#>             <dbl>           <dbl>
#>  1           147.            21.8
#>  2            NA             NA
#>  3            NA             NA
#>  4           127.            20.9
#>  5            NA             NA
#>  6            NA             NA
#>  7            NA             NA
#>  8            NA             NA
#>  9            NA             NA
#> 10            NA             NA
#> # ... with 22 more rows