Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

include_reference = TRUE erroneously works with datawizard::contr.deviation() #966

Merged
merged 8 commits into from
Oct 15, 2024
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Type: Package
Package: parameters
Title: Processing of Model Parameters
Version: 0.22.2.19
Version: 0.22.2.20
Authors@R:
c(person(given = "Daniel",
family = "Lüdecke",
Expand Down
4 changes: 4 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,10 @@
* `print()` for `compare_parameters()` now also puts factor levels into square
brackets, like the `print()` method for `model_parameters()`.

* `include_reference` now only adds the reference category of factors to the
parameters table when those factors have appropriate contrasts (treatment or
SAS contrasts).

## Bug fixes

* Arguments like `digits` etc. were ignored in `model_parameters() for objects
Expand Down
23 changes: 23 additions & 0 deletions R/utils.R
Original file line number Diff line number Diff line change
Expand Up @@ -177,6 +177,29 @@
}


# This functions finds contrasts for those factors in a model, where including
# a reference level makes sense. This is the case when there are contrasts
# that are all zeros, which means that the reference level is not included in
# the model matrix.
.remove_reference_contrasts <- function(model) {
cons <- .safe(model$contrasts)
if (is.null(cons)) {
return(NULL)
}
out <- vapply(cons, function(mat) {
if (is.matrix(mat) && nrow(mat) > 0) {
any(rowSums(mat) == 0)
} else if (is.character(mat)) {
mat %in% c("contr.treatment", "contr.SAS")
} else {
FALSE
}
}, logical(1))
# only return those factors that need to be removed
names(out)[!out]
}


# Almost identical to dynGet(). The difference is that we deparse the expression
# because get0() allows symbol only since R 4.1.0
.dynGet <- function(x,
Expand Down
8 changes: 8 additions & 0 deletions R/utils_format.R
Original file line number Diff line number Diff line change
Expand Up @@ -389,6 +389,14 @@
return(params)
}
}
# next, check contrasts of factors. including the reference level makes
# only sense if there are contrasts that are all zeros, which means that
# the reference level is not included in the model matrix
remove_contrasts <- .remove_reference_contrasts(model)
# keep only factors with valid contrasts
if (!is.null(remove_contrasts) && length(remove_contrasts)) {
factors <- factors[setdiff(names(factors), remove_contrasts)]
}

# we need some more information about prettified labels etc.
pretty_names <- attributes(params)$pretty_names
Expand Down Expand Up @@ -447,11 +455,11 @@
if (all(grepl(pattern_cut_right, pretty_level))) {
lower_bounds <- gsub(pattern_cut_right, "\\2", pretty_level)
upper_bounds <- gsub(pattern_cut_right, "\\3", pretty_level)
pretty_level <- gsub(pattern_cut_right, paste0("\\1>", as.numeric(lower_bounds), "-", upper_bounds, "]"), pretty_level)

Check warning on line 458 in R/utils_format.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=R/utils_format.R,line=458,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 127 characters.
} else if (all(grepl(pattern_cut_left, pretty_level))) {
lower_bounds <- gsub(pattern_cut_left, "\\2", pretty_level)
upper_bounds <- gsub(pattern_cut_left, "\\3", pretty_level)
pretty_level <- gsub(pattern_cut_left, paste0("\\1", as.numeric(lower_bounds), "-<", upper_bounds, "]"), pretty_level)

Check warning on line 462 in R/utils_format.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=R/utils_format.R,line=462,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 126 characters.
}
# insert new pretty level at the correct position in "pretty_names"
pretty_names <- .insert_element_at(
Expand Down Expand Up @@ -549,7 +557,7 @@

# helper to format the header / subheader of different model components --------------

.format_model_component_header <- function(x,

Check warning on line 560 in R/utils_format.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=R/utils_format.R,line=560,col=1,[cyclocomp_linter] Reduce the cyclomatic complexity of this function from 42 to at most 40.
type,
split_column,
is_zero_inflated,
Expand Down Expand Up @@ -918,7 +926,7 @@
# or edge cases...

#' @keywords internal
.format_columns_multiple_components <- function(x,

Check warning on line 929 in R/utils_format.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=R/utils_format.R,line=929,col=1,[cyclocomp_linter] Reduce the cyclomatic complexity of this function from 101 to at most 40.
pretty_names,
split_column = "Component",
digits = 2,
Expand Down
91 changes: 91 additions & 0 deletions tests/testthat/_snaps/include_reference.md
Original file line number Diff line number Diff line change
Expand Up @@ -64,3 +64,94 @@
| Observations | 32 | 32 |
+--------------+----------------------+----------------------+

# include_reference, different contrasts

Code
print(out)
Output
Parameter | Coefficient | SE | 95% CI | t(27) | p
-------------------------------------------------------------------
(Intercept) | 19.70 | 1.18 | [ 17.28, 22.11] | 16.71 | < .001
cyl [6] | -6.66 | 1.63 | [-10.00, -3.31] | -4.09 | < .001
cyl [8] | -10.54 | 1.96 | [-14.56, -6.52] | -5.38 | < .001
gear [3] | 0.00 | | | |
gear [4] | 1.32 | 1.93 | [ -2.63, 5.28] | 0.69 | 0.498
gear [5] | 1.50 | 1.85 | [ -2.31, 5.31] | 0.81 | 0.426
Message

Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
using a Wald t-distribution approximation.

---

Code
print(out)
Output
Parameter | Coefficient | SE | 95% CI | t(27) | p
-------------------------------------------------------------------
(Intercept) | 25.43 | 1.88 | [ 21.57, 29.29] | 13.52 | < .001
cyl [4] | 0.00 | | | |
cyl [6] | -6.66 | 1.63 | [-10.00, -3.31] | -4.09 | < .001
cyl [8] | -10.54 | 1.96 | [-14.56, -6.52] | -5.38 | < .001
gear [3] | 0.00 | | | |
gear [4] | 1.32 | 1.93 | [ -2.63, 5.28] | 0.69 | 0.498
gear [5] | 1.50 | 1.85 | [ -2.31, 5.31] | 0.81 | 0.426
Message

Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
using a Wald t-distribution approximation.

---

Code
print(out)
Output
Parameter | Coefficient | SE | 95% CI | t(27) | p
-------------------------------------------------------------------
(Intercept) | 20.64 | 0.67 | [ 19.26, 22.01] | 30.76 | < .001
cyl [6] | -6.66 | 1.63 | [-10.00, -3.31] | -4.09 | < .001
cyl [8] | -10.54 | 1.96 | [-14.56, -6.52] | -5.38 | < .001
gear [1] | -0.94 | 1.09 | [ -3.18, 1.30] | -0.86 | 0.396
gear [2] | 0.38 | 1.11 | [ -1.90, 2.67] | 0.34 | 0.734
Message

Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
using a Wald t-distribution approximation.

---

Code
print(out)
Output
Parameter | Coefficient | SE | 95% CI | t(27) | p
------------------------------------------------------------------
(Intercept) | 15.83 | 1.24 | [13.28, 18.37] | 12.75 | < .001
cyl [8] | 0.00 | | | |
cyl [4] | 10.54 | 1.96 | [ 6.52, 14.56] | 5.38 | < .001
cyl [6] | 3.89 | 1.88 | [ 0.03, 7.75] | 2.07 | 0.049
gear [1] | -0.94 | 1.09 | [-3.18, 1.30] | -0.86 | 0.396
gear [2] | 0.38 | 1.11 | [-1.90, 2.67] | 0.34 | 0.734
Message

Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
using a Wald t-distribution approximation.

---

Code
print(out)
Output
Parameter | Coefficient | SE | 95% CI | t(27) | p
------------------------------------------------------------------
(Intercept) | 14.89 | 0.92 | [13.00, 16.77] | 16.19 | < .001
cyl [8] | 0.00 | | | |
cyl [4] | 10.54 | 1.96 | [ 6.52, 14.56] | 5.38 | < .001
cyl [6] | 3.89 | 1.88 | [ 0.03, 7.75] | 2.07 | 0.049
gear [3] | 0.00 | | | |
gear [4] | 1.32 | 1.93 | [-2.63, 5.28] | 0.69 | 0.498
gear [5] | 1.50 | 1.85 | [-2.31, 5.31] | 0.81 | 0.426
Message

Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
using a Wald t-distribution approximation.

47 changes: 47 additions & 0 deletions tests/testthat/test-include_reference.R
Original file line number Diff line number Diff line change
Expand Up @@ -56,3 +56,50 @@ test_that("include_reference, with pretty formatted cut", {
)
)
})

test_that("include_reference, different contrasts", {
data("mtcars")
mtcars$cyl <- factor(mtcars$cyl)
mtcars$gear <- factor(mtcars$gear)

m <- lm(mpg ~ cyl + gear, data = mtcars, contrasts = list(cyl = datawizard::contr.deviation))
out <- model_parameters(m, include_reference = TRUE)
expect_snapshot(print(out))

m <- lm(mpg ~ cyl + gear, data = mtcars)
out <- model_parameters(m, include_reference = TRUE)
expect_snapshot(print(out))

m <- lm(
mpg ~ cyl + gear,
data = mtcars,
contrasts = list(
cyl = datawizard::contr.deviation,
gear = contr.sum
)
)
out <- model_parameters(m, include_reference = TRUE)
expect_snapshot(print(out))

m <- lm(
mpg ~ cyl + gear,
data = mtcars,
contrasts = list(
cyl = contr.SAS,
gear = contr.sum
)
)
out <- model_parameters(m, include_reference = TRUE)
expect_snapshot(print(out))

m <- lm(
mpg ~ cyl + gear,
data = mtcars,
contrasts = list(
cyl = contr.SAS,
gear = contr.treatment
)
)
out <- model_parameters(m, include_reference = TRUE)
expect_snapshot(print(out))
})
Loading