Create AST recoding steps for categorical variables

CORE AST SYSTEM: This function now uses Abstract Syntax Tree (AST) evaluation as its fundamental engine for all recoding conditions. All conditional expressions are parsed as AST, optimized, and evaluated with dependency validation.

Usage

step_recode(
  svy = survey_empty(),
  new_var,
  ...,
  .default = NA_character_,
  .name_step = NULL,
  ordered = FALSE,
  use_copy = use_copy_default(),
  comment = "AST Recode step",
  .to_factor = FALSE,
  .level = "auto",
  optimize_ast = TRUE,
  validate_deps = TRUE,
  cache_ast = TRUE
)

Arguments

svy

A Survey or RotativePanelSurvey object. If NULL, creates a step that can be applied later using the pipe operator (%>%)

new_var

Name of the new variable to create (unquoted)

...

Sequence of two-sided formulas defining the recoding rules. The left-hand side (LHS) of each formula is parsed as an AST conditional expression; the right-hand side (RHS) gives the replacement value. Format: ast_condition ~ value
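To illustrate the condition ~ value format, the sketch below uses only base R to split a two-sided formula into its two halves. It is not taken from the package internals; `rule`, `toy`, and `matched` are hypothetical names chosen for the illustration.

```r
# A two-sided formula is a call to `~` with two arguments.
rule <- e27 >= 65 ~ "Senior"

lhs <- rule[[2]]  # the condition, kept as an unevaluated call: e27 >= 65
rhs <- rule[[3]]  # the replacement value: "Senior"

# The stored condition can be evaluated later against any data frame.
toy <- data.frame(e27 = c(12, 40, 70))
matched <- eval(lhs, toy)
```

Because the LHS stays unevaluated until it is needed, it can be inspected or rewritten before it is applied to the data.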

.default

Default value assigned when no AST condition is met. Defaults to NA_character_

.name_step

Custom name for the step to identify it in the history. If not provided, generated automatically with "AST Recode" prefix

ordered

Logical indicating whether the new variable should be an ordered factor. Defaults to FALSE

use_copy

Logical indicating whether to create a copy of the object before applying transformations. Defaults to use_copy_default()

comment

Descriptive text for the step for documentation and traceability. Compatible with Markdown syntax. Defaults to "AST Recode step"

.to_factor

Logical indicating whether the new variable should be converted to a factor. Defaults to FALSE

.level

For RotativePanelSurvey objects, specifies the level where recoding is applied: "implantation", "follow_up", "quarter", "month", or "auto"

optimize_ast

Whether to apply AST optimizations to conditions. Default: TRUE

validate_deps

Whether to validate that the variables referenced in the conditions exist. Default: TRUE

cache_ast

Whether to cache compiled AST conditions for reuse. Default: TRUE

Value

An object of the same type as the input (Survey or RotativePanelSurvey), with the new recoded variable and the step added to the history

Details

AST CORE ENGINE FOR RECODING:

1. AST Condition Parsing:

  • All LHS conditions converted to Abstract Syntax Trees

  • Static analysis of logical expressions

  • Dependency detection for all referenced variables

  • Optimization of conditional logic
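Dependency detection of the kind listed above can be approximated in base R with all.vars(), which walks an unevaluated expression and returns the variable names it references. This is a sketch under that assumption, not the package's own implementation; `toy` and `missing_vars` are hypothetical names.

```r
# Capture a condition without evaluating it.
cond <- quote(age >= 18 & income > 1000 * 12)

# all.vars() returns the variables referenced anywhere in the expression tree.
deps <- all.vars(cond)

# Pre-validation: compare the dependencies with the columns actually available.
toy <- data.frame(age = c(15, 30), education = c(6, 12))
missing_vars <- setdiff(deps, names(toy))
```

Here `missing_vars` contains "income", so a validation step could stop with a clear message before any row is evaluated.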

2. Enhanced Condition Evaluation:

  • Conditions evaluated in order using AST engine

  • First matching AST condition determines assignment

  • Optimized short-circuit evaluation

  • Better error reporting with expression context

3. AST Optimization Features:

  • Constant folding in conditions (e.g., 5 + 3 > 7 → 8 > 7 → TRUE)

  • Dead code elimination for unreachable conditions

  • Expression simplification for faster evaluation

  • Dependency pre-validation prevents runtime errors
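Constant folding can be illustrated with a small recursive function in base R. The helper `fold_constants` is hypothetical and is not the package's optimizer; it assumes that every variable-free sub-expression is pure (no side effects, no randomness) and can safely be evaluated ahead of time.

```r
# Hypothetical helper: replace variable-free sub-expressions by their value.
fold_constants <- function(expr) {
  if (!is.call(expr)) return(expr)
  # Fold the arguments first (bottom-up); element 1 is the function being called.
  for (i in seq_along(expr)[-1]) {
    expr[[i]] <- fold_constants(expr[[i]])
  }
  # A call that references no variables can be evaluated right away.
  if (length(all.vars(expr)) == 0) eval(expr) else expr
}

folded_a <- fold_constants(quote(5 + 3 > 7))
folded_b <- fold_constants(quote(age >= 18 & income > 1000 * 12))
```

In this sketch `folded_a` collapses to a single logical value, while `folded_b` keeps both comparisons and only replaces 1000 * 12 with its product.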

AST condition examples:

  • Simple: variable == 1 (parsed as AST, dependencies: variable)

  • Complex: age >= 18 & income > 1000 * 12 (constant folded to: age >= 18 & income > 12000)

  • Vectorized: variable %in% c(1,2,3) (AST validates variable exists)

  • Logical: !is.na(education) & education > mean(education, na.rm = TRUE)

See also

step_compute for more complex calculations

bake_steps to execute all pending steps

get_steps to view step history

Examples

if (FALSE) { # \dontrun{
# Create labor force status variable
ech <- ech |>
  step_recode(
    labor_status,
    POBPCOAC == 2 ~ "Employed",
    POBPCOAC %in% 3:5 ~ "Unemployed", 
    POBPCOAC %in% 6:8 ~ "Inactive",
    .default = "Missing",
    comment = "Labor force status from ECH"
  )

# Create age groups
ech <- ech |>
  step_recode(
    age_group,
    e27 < 18 ~ "Under 18",
    e27 >= 18 & e27 < 65 ~ "Working age",
    e27 >= 65 ~ "Senior",
    .default = "Missing",
    .to_factor = TRUE,
    ordered = TRUE,
    comment = "Standard age groups"
  )

# Dummy variable
ech <- ech |>
  step_recode(
    household_head,
    e30 == 1 ~ 1,
    .default = 0,
    comment = "Household head indicator"
  )

# For rotative panel
panel <- panel |>
  step_recode(
    region_simple,
    REGION_4 == 1 ~ "Montevideo",
    REGION_4 != 1 ~ "Interior",
    .level = "implantation",
    comment = "Simplified region"
  )
} # }