Get model performance metrics
```r
evaluate(x, ...)

# S3 method for predicted_df
evaluate(x, na.rm = FALSE, ...)

# S3 method for model_list
evaluate(x, all_models = FALSE, ...)
```
Argument | Description
---|---
x | Object to be evaluated
... | Not used
na.rm | Logical. If FALSE (default), performance metrics will be NA if any rows are missing an outcome value. If TRUE, performance will be evaluated on the rows that have an outcome value. Only used when evaluating a prediction data frame; see the sketch below the table.
all_models | Logical. If FALSE (default), a numeric vector giving performance metrics for the best-performing model is returned. If TRUE, a data frame with performance metrics for all trained models is returned. Only used when evaluating a model_list.
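For instance, na.rm matters when some rows of new data lack the observed outcome. A minimal sketch, assuming a models object trained on pima_diabetes as in the examples at the bottom of this page:

```r
# Illustrative only: blank out two outcome values in the new data
incomplete <- pima_diabetes[41:50, ]
incomplete$diabetes[1:2] <- NA
predictions <- predict(models, newdata = incomplete)

evaluate(predictions)               # metrics are NA: some rows lack an outcome
evaluate(predictions, na.rm = TRUE) # metrics computed on the complete rows only
```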
Either a numeric vector or a data frame, depending on the value of all_models.
This function gets model performance from a model_list object that comes from machine_learn, tune_models, or flash_models, or from a data frame of predictions from predict.model_list. For the latter, the data passed to predict.model_list must contain observed outcomes. If you have predictions and outcomes in a different format, see evaluate_classification or evaluate_regression instead.
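For example, with plain vectors of predictions and outcomes, something like the following should work. This is a sketch, not confirmed against the package: it assumes evaluate_classification and evaluate_regression take predicted values and observed outcomes as their first two arguments, named predicted and actual.

```r
# Sketch: predicted class probabilities and observed classes as plain vectors
# (argument names are assumptions)
evaluate_classification(predicted = c(0.8, 0.3, 0.6, 0.1),
                        actual = c("Y", "N", "Y", "N"))

# Likewise for a continuous outcome
evaluate_regression(predicted = c(110, 95, 132),
                    actual = c(118, 90, 140))
```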
You may notice that evaluate(models) and evaluate(predict(models)) return slightly different performance metrics, even though they are calculated on the same (out-of-fold) predictions. This is because metrics in training (returned from evaluate(models)) are calculated within each cross-validation fold and then averaged, while metrics calculated on the prediction data frame (evaluate(predict(models))) are calculated once on all observations.
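The distinction is easy to see with a toy computation. The base-R sketch below (not healthcareai code) contrasts a metric averaged over folds with the same metric computed once on the pooled predictions; the made-up data and RMSE stand in for the out-of-fold predictions and AUPR/AUROC above, and the two results typically differ:

```r
set.seed(42)
obs  <- rnorm(30)
pred <- obs + rnorm(30)
fold <- rep(1:3, each = 10)
rmse <- function(o, p) sqrt(mean((o - p)^2))

# Training-style: metric per fold, then averaged (as in evaluate(models))
mean(tapply(seq_along(obs), fold, function(i) rmse(obs[i], pred[i])))

# Prediction-style: metric once on all observations (as in evaluate(predict(models)))
rmse(obs, pred)
```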
```r
models <- machine_learn(pima_diabetes[1:40, ], patient_id, outcome = diabetes,
                        models = c("XGB", "RF"), tune = FALSE, n_folds = 3)

# By default, evaluate returns performance of only the best model
evaluate(models)
#>      AUPR     AUROC 
#> 0.5556152 0.6299603

# Set all_models = TRUE to see the performance of all trained models
evaluate(models, all_models = TRUE)
#> # A tibble: 2 x 3
#>   model                      AUPR AUROC
#>   <chr>                     <dbl> <dbl>
#> 1 Random Forest             0.556 0.630
#> 2 eXtreme Gradient Boosting 0.452 0.403

# Can also get performance on a test dataset
predictions <- predict(models, newdata = pima_diabetes[41:50, ])
evaluate(predictions)
#>      AUPR     AUROC 
#> 0.3444444 0.8095238
```