Random Forest: "rf". Regression and classification. Implemented via ranger.

  • mtry: Number of variables to consider for each split.

  • splitrule: Splitting rule. For classification either "gini" or "extratrees". For regression either "variance" or "extratrees".

  • min.node.size: Minimal node size.
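
As a minimal sketch of how these three hyperparameters combine, the code below builds a small candidate grid in base R and, if ranger is installed, fits one candidate by calling ranger directly. The candidate values and the use of the built-in iris data are illustrative assumptions, not defaults of this package.

```r
# Candidate values are illustrative assumptions, not package defaults.
rf_grid <- expand.grid(
  mtry = c(1, 2, 3),                    # variables considered at each split
  splitrule = c("gini", "extratrees"),  # classification splitting rules
  min.node.size = c(1, 5),              # minimal node size
  stringsAsFactors = FALSE
)
nrow(rf_grid)  # 3 * 2 * 2 = 12 candidate combinations

# Fit the first candidate directly with ranger, if it is available.
if (requireNamespace("ranger", quietly = TRUE)) {
  candidate <- rf_grid[1, ]
  fit <- ranger::ranger(
    Species ~ ., data = iris,
    mtry = candidate$mtry,
    splitrule = candidate$splitrule,
    min.node.size = candidate$min.node.size
  )
  print(fit$prediction.error)  # out-of-bag error for this candidate
}
```

For a regression outcome, the splitrule column would hold "variance" and "extratrees" instead.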

XGBoost: "xgb". eXtreme Gradient Boosting. Implemented via xgboost. Note that XGB has many more hyperparameters than the other models, so it may require a greater tune_depth to optimize performance.

  • eta: Learning rate, [0, 1].

  • gamma: Minimum loss reduction required to split a leaf further, [0, Inf]. Larger is more conservative.

  • max_depth: Maximum tree depth, [0, Inf]. Larger means more complex models and so greater likelihood of overfitting. 0 produces no limit on depth.

  • subsample: Fraction of training instances (rows) sampled in each boosting round, (0, 1].

  • colsample_bytree: Fraction of features to use in each tree, (0, 1].

  • min_child_weight: Minimum sum of instance weights needed to keep partitioning, [0, Inf]. Larger values mean more conservative models.

  • nrounds: Number of rounds of boosting, [0, Inf). Larger values produce a greater likelihood of overfitting.
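
Because there are seven hyperparameters here, a full grid grows quickly, and randomly sampling candidates is one common alternative. The sketch below draws a few candidates from within the ranges listed above and, if xgboost is installed, fits the first one with xgb.train. The sampling bounds, the n_candidates name, and the mtcars data are assumptions chosen for illustration; they do not describe how this package searches.

```r
set.seed(42)
n_candidates <- 5  # hypothetical number of random candidates to draw

# Sampling bounds are illustrative and sit inside the documented ranges.
xgb_candidates <- data.frame(
  eta = runif(n_candidates, 0.01, 0.3),
  gamma = runif(n_candidates, 0, 5),
  max_depth = sample(2:8, n_candidates, replace = TRUE),
  subsample = runif(n_candidates, 0.5, 1),
  colsample_bytree = runif(n_candidates, 0.5, 1),
  min_child_weight = runif(n_candidates, 0, 10),
  nrounds = sample(50:200, n_candidates, replace = TRUE)
)

# Fit the first candidate on a small regression problem, if xgboost is available.
if (requireNamespace("xgboost", quietly = TRUE)) {
  cand <- xgb_candidates[1, ]
  dtrain <- xgboost::xgb.DMatrix(
    data = as.matrix(mtcars[, -1]),  # predictors
    label = mtcars$mpg               # numeric outcome
  )
  fit <- xgboost::xgb.train(
    params = list(
      objective = "reg:squarederror",
      eta = cand$eta,
      gamma = cand$gamma,
      max_depth = cand$max_depth,
      subsample = cand$subsample,
      colsample_bytree = cand$colsample_bytree,
      min_child_weight = cand$min_child_weight
    ),
    data = dtrain,
    nrounds = cand$nrounds
  )
}
```

In practice each candidate would be scored on held-out data and the best one kept; that step is omitted here.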

Regularized regression: "glm". Regression and classification. Implemented via glmnet.

  • alpha: Elasticnet mixing parameter, in [0, 1]. 0 = ridge regression; 1 = lasso.

  • lambda: Regularization parameter, > 0. Larger values make for stronger regularization.
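
To make the roles of alpha and lambda concrete, the sketch below computes the elastic net penalty in the form given in the glmnet documentation, lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta))). The helper name elastic_net_penalty and the example coefficients are hypothetical and used only for illustration.

```r
# Elastic net penalty for a coefficient vector (intercept excluded).
elastic_net_penalty <- function(beta, alpha, lambda) {
  stopifnot(alpha >= 0, alpha <= 1, lambda > 0)
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}

beta <- c(2, -1, 0)  # hypothetical coefficients

ridge <- elastic_net_penalty(beta, alpha = 0, lambda = 0.5)    # 1.25, pure L2
lasso <- elastic_net_penalty(beta, alpha = 1, lambda = 0.5)    # 1.5, pure L1
mixed <- elastic_net_penalty(beta, alpha = 0.5, lambda = 0.5)  # 1.375, blend

# Doubling lambda doubles the penalty, which is the sense in which
# larger values give stronger regularization.
stronger <- elastic_net_penalty(beta, alpha = 0.5, lambda = 1)  # 2.75
```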



Returns a vector of currently supported algorithms.

See also

hyperparameters for more detail on hyperparameter defaults and specifications.