0.2.0

Latest

donishadsmith released this 15 Sep 20:58

[0.2.0]

Non-Development version of 0.2.0.
Minor internal code changes.
Additional testthat tests to assess outputs.
Removes explicit roxygen2 export of internal private functions.
Documents the contr.dummy function from the kknn package since train.kknn requires this function to be in the
current namespace to work.
Allows additional arguments for multiple algorithms to be used.

[0.2.0.9000] [Development]

♻ Changed

Refactored package internally to make code more reusable and maintainable. This version is still in development but has
passed the previous testthat tests, and all functions can be used.
Some parameters for classCV and genFolds have been grouped together. For instance, split, n_folds, standardize,
remove_obs, random_seed, stratified, etc. are no longer separate input parameters. They are now elements of the new
train_params parameter (e.g., train_params = list(split = 0.8, n_folds = 5, standardize = TRUE)). Additionally,
model_params, save, and parallel_configs were created to group similar parameters from the former classCV
input parameters. For all functions, model_type is now models, and for print, parameters is now configs.
classCV output object is more organized and includes "configs" for user specified arguments and model-specific arguments,
"class_summary" for information pertaining to classes such as the names of the classes, indices, proportions,
"data_partitions" to include the indices, class proportions in each split/fold, and dataframes if requested, "imputation"
for imputation information, "models" if models are requested, and "metrics" for metrics.
Can request the final model without having to specify n_folds or split.
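
As a rough illustration of the grouped-parameter style described above, the sketch below builds a train_params list in base R and overrides one element. The default values and the defaults/overrides variable names are assumptions made for this example, not the package's actual defaults; only the element names split, n_folds, and standardize come from these notes.

```r
# Hypothetical defaults, for illustration only (not taken from the package).
defaults <- list(split = 0.8, n_folds = 5, standardize = TRUE)

# The user overrides just the elements they care about.
overrides <- list(n_folds = 10)

# modifyList() (from utils, attached by default) merges the two lists,
# replacing matching elements and keeping the rest.
train_params <- modifyList(defaults, overrides)

str(train_params)
```

The resulting list would then be passed as the train_params argument of classCV or genFolds in place of the former separate arguments.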

🐛 Fixes

Previously, observations that were missing the target were excluded. In addition to this,
when imputation is requested, the target variable is now excluded from being a predictor for imputation.
Prior to imputation, all numerical columns are standardized, regardless of whether standardization is requested.
Fixed an error when saving plots in RStudio.
Metrics for the latest GBM version should no longer produce NAs.
[RE-UPLOAD]: Version that allows the final model to run without specifying split or n_folds;
it also includes additional tests. It still has the same version number, 0.2.0.9000. The re-upload also includes a fix for
logistic regression: model_params$threshold was called instead of model_params$logistic_threshold, which left
the logistic threshold NULL and the prediction vector empty.
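
The logistic regression bug described above follows from how base R handles a missing list element. The sketch below is not the package's code; the probabilities and variable names are made up for illustration, and only the element names threshold and logistic_threshold come from these notes.

```r
# Illustrative values only; not taken from the package.
model_params <- list(logistic_threshold = 0.5)
probs <- c(0.2, 0.7, 0.9)

# Accessing a name that is not in the list returns NULL rather than an error.
wrong_threshold <- model_params$threshold

# Comparing against NULL yields a zero-length result, so the
# prediction vector comes out empty.
bad_pred <- as.integer(probs > wrong_threshold)

# Using the correct element name gives one prediction per observation.
good_pred <- as.integer(probs > model_params$logistic_threshold)

print(length(bad_pred))
print(good_pred)
```

Because R raises no error in this case, a check such as stopifnot(!is.null(threshold)) before the comparison is one way to catch this kind of mistake early.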
