Refactored Package
- Grouped related parameters in `classCV` and `genFolds` into `train_params`, `model_params`, and `parallel_configs`.
- Renamed `model_type` to `models` and `parameters` to `configs` in all functions.
- Improved `classCV` output structure: consists of `configs`, `class_summary`, `data_partitions`, `imputation`, `models`, and `metrics`.
- Final model can be requested without specifying `n_folds` or `split`.
- Ensure target variable is excluded from imputation predictors.
- Standardize all numerical columns before imputation, regardless of standardization request.
- Fixed error when saving plots in RStudio.
- Fixed metrics producing NAs with latest XGBoost
donishadsmith committed Sep 9, 2024
1 parent 5772820 commit ee466f3
Showing 28 changed files with 2,141 additions and 2,202 deletions.
22 changes: 22 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,28 @@ All notable future changes to vswift will be documented in this file.
noted in the changelog (i.e., new functions or parameters, changes in parameter defaults or function names, etc.).
- *.patch* : Contains no new features, simply fixes any identified bugs.


## [0.2.0.9000] - 2024-09-08 [Development Version]
### ♻ Changed
- Refactored the package internally to make the code more reusable and maintainable. This version is still in development
but has passed the previous testthat tests, and all functions can be used.
- Some parameters for `classCV` and `genFolds` have been grouped together. For instance, `split`, `n_folds`, `standardize`,
`remove_obs`, `random_seed`, `stratified`, etc. are no longer separate input parameters. They are now elements of the new
`train_params` parameter (e.g., `train_params = list(split = 0.8, n_folds = 5, standardize = TRUE)`). Additionally,
`model_params`, `save`, and `parallel_configs` were created to group similar parameters among the former `classCV`
input parameters. For all functions, `model_type` is now `models`, and for `print`, `parameters` is now `configs`.
- The `classCV` output object is better organized. It includes "configs" for user-specified and model-specific arguments,
"class_summary" for class information such as the class names, indices, and proportions, "data_partitions" for the
indices and class proportions in each split/fold (plus dataframes if requested), "imputation" for imputation
information, "models" if models are requested, and "metrics" for metrics.
- The final model can be requested without having to specify `n_folds` or `split`.
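
As a rough illustration of the grouping described above, the sketch below builds `train_params` as a plain R list. The `split`, `n_folds`, and `standardize` values come from the changelog example. `stratified` and `random_seed` are element names mentioned in this entry, but the values shown are assumptions. The commented-out `classCV` call is hypothetical: the `data` and `target` arguments and the `"lda"` model name are not confirmed by this entry.

```r
# Grouped training arguments passed as a single list.
train_params <- list(
  split = 0.8,        # value from the changelog example
  n_folds = 5,        # value from the changelog example
  standardize = TRUE, # value from the changelog example
  stratified = TRUE,  # assumed value; element name is from the changelog
  random_seed = 123   # assumed value; element name is from the changelog
)

# Inspect the structure of the grouped arguments.
str(train_params)

# Hypothetical call shape. Argument names other than `models` and
# `train_params` are assumptions, so the call is left commented out:
# result <- classCV(data = iris, target = "Species", models = "lda",
#                   train_params = train_params)
```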
### 🐛 Fixes
- Previously, observations missing the target were excluded. In addition to this, when imputation is requested,
the target variable is now excluded from being a predictor for imputation.
- Prior to imputation, all numerical columns are standardized, regardless of whether standardization is requested.
- Fixed an error when saving plots in RStudio.
- Metrics for the latest GBM version should no longer produce NAs.

## [0.1.4] - 2024-08-08 [RE-UPLOAD]
### 🐛 Fixes
- Added `plan(sequential)` so background workers don't stay up and continue to consume RAM.