Model Training

machine_learn()

Machine learning made easy

tune_models()

Tune multiple machine learning models using cross validation to optimize performance

flash_models()

Train models without tuning for performance

Prediction

predict(<model_list>)

Make predictions using the best-performing model
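
The training and prediction entries above can be combined into a short workflow. The sketch below assumes the package is loaded under the name `healthcareai` and that the bundled `pima_diabetes` data has a `patient_id` identifier column and a `diabetes` outcome column; treat those names as assumptions if your installed version differs.

```r
library(healthcareai)  # assumed package name

# Train and tune models in one call. patient_id is passed as a column to
# leave out of training; diabetes is the outcome to predict (assumed names).
models <- machine_learn(pima_diabetes, patient_id, outcome = diabetes)

# With no new data supplied, predict() uses the best-performing model to
# return predictions for the rows the models were trained on.
predictions <- predict(models)
head(predictions)
```

Training several tuned models can take a few minutes; flash_models() is the listed alternative when tuning is not needed.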

Model Interpretation

get_variable_importance()

Get variable importances

interpret()

Interpret a model via regularized coefficient estimates

evaluate()

Get model performance metrics

Visualization

plot(<model_list>)

Plot performance of models

plot(<predicted_df>) plot_regression_predictions() plot_classification_predictions()

Plot model predictions vs observed outcomes

plot(<variable_importance>)

Plot variable importance

plot(<missingness>)

Plot missingness

control_chart()

Create a control chart

Data Preparation

add_best_levels() get_best_levels()

Build efficient features from high-cardinality, multiple-membership factors

prep_data()

Prepare data for machine learning

impute()

Impute data and return a reusable recipe

split_train_test()

Split data into training and test data frames
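
As an illustration of how the preparation functions fit together, the sketch below splits the example data and then prepares the training portion. It assumes that split_train_test() returns a list with `train` and `test` elements and that `percent_train` sets the training fraction; the column names `patient_id` and `diabetes` are likewise assumptions about the bundled `pima_diabetes` data.

```r
library(healthcareai)  # assumed package name

# Hold out 20% of rows for testing, splitting with respect to the outcome.
split <- split_train_test(d = pima_diabetes,
                          outcome = diabetes,
                          percent_train = 0.8)

# Every row lands in exactly one of the two data frames.
nrow(split$train)
nrow(split$test)

# Prepare only the training rows for machine learning; patient_id is
# passed as a column to leave untransformed.
prepped_train <- prep_data(split$train, patient_id, outcome = diabetes)
```

Preparing the training rows separately from the test rows keeps information from the held-out data out of the preparation steps.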

Data Manipulation

pivot()

Pivot multiple rows per observation to one row with multiple columns

separate_drgs()

Convert MSDRGs into a "base DRG" and complication level

add_SAM_utility_cols()

Add SAM utility columns to table

Data Profiling

missingness()

Find missingness in each column and search for strings that might represent missing values
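
A minimal profiling sketch, assuming the bundled `pima_diabetes` data and that missingness() returns a data frame with one row per column of the input:

```r
library(healthcareai)  # assumed package name

# Summarise how much of each column is missing.
missing_summary <- missingness(pima_diabetes)
missing_summary

# The result can be passed to the plot(<missingness>) method listed
# under Visualization.
plot(missing_summary)
```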

Connect to Databases

build_connection_string()

Build a connection string for use with MSSQL and dbConnect

db_read()

Read from a SQL Server database table

Save and Load Models

save_models() load_models()

Save models to disk and load models from disk

Example Data

pima_diabetes

Patient diabetes dataset

pima_meds

Patient medications dataset

Model Details

get_supported_models()

Supported models and their hyperparameters

get_hyperparameter_defaults() get_random_hyperparameters()

Get hyperparameter values