
Create AST recoding steps for categorical variables
Source: R/steps.R
step_recode.Rd
This function uses Abstract Syntax Tree (AST) evaluation as the engine for all recoding conditions. Every conditional expression is parsed into an AST, optimized, and evaluated with dependency validation.
Usage
step_recode(
  svy = survey_empty(),
  new_var,
  ...,
  .default = NA_character_,
  .name_step = NULL,
  ordered = FALSE,
  use_copy = use_copy_default(),
  comment = "AST Recode step",
  .to_factor = FALSE,
  .level = "auto",
  optimize_ast = TRUE,
  validate_deps = TRUE,
  cache_ast = TRUE
)
Arguments
- svy
A Survey or RotativePanelSurvey object. If NULL, creates a step that can be applied later using the pipe operator (%>%)
- new_var
Name of the new variable to create (unquoted)
- ...
Sequence of two-sided formulas defining the recoding rules. The left-hand side (LHS) is parsed as an AST conditional expression; the right-hand side (RHS) gives the replacement value. Format: ast_condition ~ value
- .default
Default value assigned when no AST condition is met. Defaults to NA_character_
- .name_step
Custom name for the step to identify it in the history. If not provided, a name is generated automatically with the prefix "AST Recode"
- ordered
Logical indicating whether the new variable should be an ordered factor. Defaults to FALSE
- use_copy
Logical indicating whether to create a copy of the object before applying transformations. Defaults to use_copy_default()
- comment
Descriptive text for the step for documentation and traceability. Compatible with Markdown syntax. Defaults to "AST Recode step"
- .to_factor
Logical indicating whether the new variable should be converted to a factor. Defaults to FALSE
- .level
For RotativePanelSurvey objects, specifies the level where recoding is applied: "implantation", "follow_up", "quarter", "month", or "auto"
- optimize_ast
Whether to apply AST optimizations to conditions. Default: TRUE
- validate_deps
Whether to validate that the variables each condition depends on exist. Default: TRUE
- cache_ast
Whether to cache compiled AST conditions for reuse. Default: TRUE
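The ast_condition ~ value format is an ordinary R two-sided formula, so its parts can be inspected with base R. The sketch below is illustrative only and does not show how step_recode stores rules internally; the names rule, lhs, rhs, and the variable age are made up for the example.

```r
# A recoding rule is a plain two-sided formula: condition ~ value
rule <- age >= 18 ~ "Adult"

# A two-sided formula is a call of length 3: `~`, the LHS, and the RHS
stopifnot(inherits(rule, "formula"), length(rule) == 3)

lhs <- rule[[2]]  # the condition, kept unevaluated as a call: age >= 18
rhs <- rule[[3]]  # the replacement value: "Adult"

# The condition is itself a small syntax tree: an operator and its operands
as.list(lhs)

# Variables the condition refers to
all.vars(lhs)
```

Because the LHS stays unevaluated until a data set is supplied, a rule can be defined before the survey object it will be applied to exists.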
Value
An object of the same type as the input (Survey or RotativePanelSurvey) with the new recoded variable and the step added to the history
Details
AST CORE ENGINE FOR RECODING:
1. AST Condition Parsing:
All LHS conditions converted to Abstract Syntax Trees
Static analysis of logical expressions
Dependency detection for all referenced variables
Optimization of conditional logic
2. Enhanced Condition Evaluation:
Conditions evaluated in order using AST engine
First matching AST condition determines assignment
Optimized short-circuit evaluation
Better error reporting with expression context
3. AST Optimization Features:
Constant folding in conditions (e.g., 5 + 3 > 7 → 8 > 7 → TRUE)
Dead code elimination for unreachable conditions
Expression simplification for faster evaluation
Dependency pre-validation prevents runtime errors
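The ordering and validation behaviour listed above can be approximated in base R. This is a minimal sketch, not the package's implementation: recode_first_match is a hypothetical helper, and it assumes every symbol reported by all.vars() must be a column of the data, which is stricter than a real engine needs to be.

```r
# Hypothetical helper: apply `condition ~ value` rules in order,
# letting the first matching rule win for each row.
recode_first_match <- function(data, rules, default = NA_character_) {
  # Dependency pre-validation: fail before evaluating anything
  for (i in seq_along(rules)) {
    deps <- all.vars(rules[[i]][[2]])
    missing_vars <- setdiff(deps, names(data))
    if (length(missing_vars) > 0) {
      stop("Rule ", i, " refers to unknown variable(s): ",
           paste(missing_vars, collapse = ", "))
    }
  }

  result <- rep(default, nrow(data))
  assigned <- rep(FALSE, nrow(data))

  for (i in seq_along(rules)) {
    rule <- rules[[i]]
    # Evaluate the LHS with the data columns in scope
    hit <- eval(rule[[2]], data, environment(rule))
    # NA conditions never match; rows already assigned are skipped
    hit <- !is.na(hit) & hit & !assigned
    result[hit] <- eval(rule[[3]], environment(rule))
    assigned <- assigned | hit
  }
  result
}

people <- data.frame(age = c(10, 30, 70, NA))
rules <- list(
  age < 18 ~ "Under 18",
  age < 65 ~ "Working age",  # also true for age 10, but rule 1 already matched
  age >= 65 ~ "Senior"
)
recode_first_match(people, rules, default = "Missing")
# "Under 18" "Working age" "Senior" "Missing"
```

Because the first match wins, later rules can be written more loosely (age < 65 instead of age >= 18 & age < 65), at the cost of making rule order significant.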
AST condition examples:
Simple: variable == 1 (parsed as AST, dependencies: variable)
Complex: age >= 18 & income > 1000 * 12 (optimized: 1000 * 12 is folded, giving income > 12000)
Vectorized: variable %in% c(1,2,3) (AST validates that variable exists)
Logical: !is.na(education) & education > mean(education, na.rm = TRUE)
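Constant folding, as in the 1000 * 12 example, can be sketched as a small recursive walk over the expression. fold_constants and pure_ops are hypothetical names used for illustration; the package's own optimizer may work differently. Only a short whitelist of side-effect-free operators is folded, so calls such as mean() are left untouched.

```r
# Operators assumed safe to evaluate ahead of time (illustrative whitelist)
pure_ops <- c("+", "-", "*", "/", "^", "(",
              ">", "<", ">=", "<=", "==", "!=", "&", "|", "!")

# Hypothetical helper: replace constant sub-expressions with their value
fold_constants <- function(expr) {
  if (!is.call(expr)) return(expr)  # symbols and literals stay as they are

  # Fold the arguments first, then rebuild the call around them
  args <- lapply(as.list(expr)[-1], fold_constants)
  folded <- as.call(c(list(expr[[1]]), args))

  head <- expr[[1]]
  is_const <- function(a) is.atomic(a) && length(a) == 1
  if (is.symbol(head) &&
      as.character(head) %in% pure_ops &&
      all(vapply(args, is_const, logical(1)))) {
    # Every operand is a literal, so the call can be evaluated now
    return(eval(folded, baseenv()))
  }
  folded
}

fold_constants(quote(5 + 3 > 7))
# TRUE

deparse(fold_constants(quote(age >= 18 & income > 1000 * 12)))
# "age >= 18 & income > 12000"
```

The comparison involving age survives unchanged because age is a symbol, not a literal; only the purely numeric product is evaluated early.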
See also
step_compute for more complex calculations
bake_steps to execute all pending steps
get_steps to view step history
Examples
if (FALSE) { # \dontrun{
# Create labor force status variable
ech <- ech |>
  step_recode(
    labor_status,
    POBPCOAC == 2 ~ "Employed",
    POBPCOAC %in% 3:5 ~ "Unemployed",
    POBPCOAC %in% 6:8 ~ "Inactive",
    .default = "Missing",
    comment = "Labor force status from ECH"
  )

# Create age groups
ech <- ech |>
  step_recode(
    age_group,
    e27 < 18 ~ "Under 18",
    e27 >= 18 & e27 < 65 ~ "Working age",
    e27 >= 65 ~ "Senior",
    .default = "Missing",
    .to_factor = TRUE,
    ordered = TRUE,
    comment = "Standard age groups"
  )

# Dummy variable
ech <- ech |>
  step_recode(
    household_head,
    e30 == 1 ~ 1,
    .default = 0,
    comment = "Household head indicator"
  )

# For rotative panel
panel <- panel |>
  step_recode(
    region_simple,
    REGION_4 == 1 ~ "Montevideo",
    REGION_4 != 1 ~ "Interior",
    .level = "implantation",
    comment = "Simplified region"
  )
} # }