
Stepwise backward for logistic Firth regression with automated dummy variables conversion (with forced terms)
Source:R/step_bw_firth.R
step_bw_firth.RdExtension code to perform stepwise backward on a logistf model with categorical variables. Automatically transforms predictors that are factors into dummy variables.
Usage
step_bw_firth(
reg_model,
s_lower = "~1",
s_upper = "all",
trace = TRUE,
steps = NULL,
p_threshold = 0.05,
data = NULL,
forced = NULL
)Arguments
- reg_model
A logistf model object.
- s_lower
Lower step. Not used; included for compatibility. Default = "~1".
- s_upper
Upper step. Not used; included for compatibility. Default = "all".
- trace
Logical. Print each step. Default TRUE.
- steps
Maximum number of elimination steps. If NULL, set to number of predictors.
- p_threshold
P-value threshold for elimination. Default = 0.05.
- data
Data frame. If NULL, extracted from reg_model.
- forced
Character vector of term names to always keep. Default NULL.
Value
An object of class "step_bw" with components:
final_model: the fitted logistf model
steps: a data.frame of each elimination step
References
Heinze G, Ploner M, Jiricka L, Steiner G. logistf: Firth’s Bias-Reduced Logistic Regression. 2023.
Efroymson MA. Multiple regression analysis. In: Ralston A, Wilf HS, editors. Mathematical methods for digital computers. New York: Wiley; 1960.
Ullmann T, Heinze G, Hafermann L, Schilhart-Wallisch C, Dunkler D, et al. (2024) Evaluating variable selection methods for multivariable regression models: A simulation study protocol. PLOS ONE 19(8): e0308543
Examples
if (requireNamespace("logistf", quietly = TRUE)) {
library(logistf)
data <- mtcars
data$am <- as.factor(data$am)
regression_model <- logistf(am ~ mpg + cyl + disp, data = data)
# Perform backward stepwise, forcing 'cyl' to remain
stepwise <- step_bw_firth(
regression_model,
forced = c("cyl"),
p_threshold = 0.05,
trace = FALSE
)
final_model <- stepwise$final_model
# Show steps and summary
stepwise$steps
summary(final_model)
}
#> logistf::logistf(formula = new_formula, data = dataprov)
#>
#> Model fitted by Penalized ML
#> Coefficients:
#> coef se(coef) lower 0.95 upper 0.95 Chisq p
#> (Intercept) 0.8201147 1.8232397 -2.8921363 4.6412094266 0.1928272 0.66057392
#> cyl 0.5694572 0.6299805 -0.6794237 2.0101418441 0.7491163 0.38675617
#> disp -0.0221055 0.0120472 -0.0526668 -0.0004207133 4.0277557 0.04475744
#> method
#> (Intercept) 2
#> cyl 2
#> disp 2
#>
#> Method: 1-Wald, 2-Profile penalized log-likelihood, 3-None
#>
#> Likelihood ratio test=12.26211 on 2 df, p=0.002174283, n=32
#> Wald test = 7.246462 on 2 df, p = 0.02669628