Introduction
The convey package by
Guilherme Jacob, Anthony Damico, and Djalma Pessoa implements poverty
and inequality indicators for complex survey data. It works with
survey::svydesign objects — the same objects that
metasurvey wraps inside Survey objects.
This vignette shows how to use convey functions inside
workflow() to compute Gini coefficients, at-risk-of-poverty
rates, FGT indices, and other distributional measures, all with proper
standard errors and CVs.
For the full reference on every measure, see the convey book.
Setup
We use the api dataset from the survey
package. The api00 variable (Academic Performance Index
score in 2000) serves as our continuous variable for inequality
measures, and meals (percent of students eligible for
subsidized meals) works as an income-like proxy.
library(metasurvey)
library(survey)
library(convey)
library(data.table)
data(api, package = "survey")
dt <- data.table(apistrat)
svy <- Survey$new(
  data = dt,
  edition = "2000",
  type = "api",
  psu = NULL,
  engine = "data.table",
  weight = add_weight(annual = "pw")
)
Preparing the design for convey
Before using any convey function, the underlying design
must be prepared with convey_prep(). Build the design with
ensure_design() and then replace the estimation-type
entry:
svy$ensure_design()
svy$design[["annual"]] <- convey_prep(svy$design[["annual"]])
Inequality Measures
Gini coefficient
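The Gini index measures overall inequality on a 0–1 scale, where 0 means every unit has the same value and values near 1 mean the total is concentrated in very few units.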
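To make the scale concrete, here is a minimal base-R sketch of a weighted Gini computed from the Lorenz curve. The helper `gini_by_hand()` and the toy vectors are hypothetical, introduced only for illustration. This is not convey's estimator, and it produces no standard errors.

```r
# Hypothetical helper for illustration; not part of convey or metasurvey.
gini_by_hand <- function(y, w = rep(1, length(y))) {
  ord <- order(y)
  y <- y[ord]
  w <- w[ord]
  p <- w / sum(w)                   # population share of each unit
  L <- cumsum(w * y) / sum(w * y)   # Lorenz curve ordinates
  L_prev <- c(0, L[-length(L)])
  1 - sum(p * (L_prev + L))         # one minus twice the trapezoid area under the curve
}

gini_equal <- gini_by_hand(c(5, 5, 5, 5))          # identical values
gini_concentrated <- gini_by_hand(c(0, 0, 0, 20))  # one unit holds everything
```

With identical values the index is 0. When one of four equally weighted units holds everything it is 0.75, the largest value four units can produce. The survey-weighted estimate, with design-based standard errors, comes from svygini():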
gini <- workflow(
  list(svy),
  convey::svygini(~api00, na.rm = TRUE),
  estimation_type = "annual"
)
gini
#> stat variable value se cv
#> <char> <char> <num> <num> <num>
#> 1: convey::svygini: gini gini 0.1123906 0.004824568 0.04292681
#> confint_lower confint_upper evaluate
#> <num> <num> <char>
#> 1: 0.1029346 0.1218465 Excellent
Atkinson index
The Atkinson index takes an inequality aversion parameter,
epsilon. Higher values of epsilon give more weight to the lower
tail of the distribution. The variable must be strictly positive.
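The role of epsilon can be seen in a small base-R sketch. It assumes the standard textbook definition: one minus the ratio of the equally distributed equivalent value to the mean. The helper `atkinson_by_hand()` and the toy data are hypothetical and are not part of convey, whose estimator also supplies the variance.

```r
# Hypothetical helper for illustration; assumes strictly positive y and epsilon > 0.
atkinson_by_hand <- function(y, w = rep(1, length(y)), epsilon = 1) {
  stopifnot(all(y > 0), epsilon > 0)
  mu <- weighted.mean(y, w)
  if (epsilon == 1) {
    # Limiting case: the equivalent value is the weighted geometric mean
    ede <- exp(weighted.mean(log(y), w))
  } else {
    # General case: a weighted power mean of order 1 - epsilon
    ede <- weighted.mean(y^(1 - epsilon), w)^(1 / (1 - epsilon))
  }
  1 - ede / mu
}

y_toy <- c(1, 2, 4, 8)
a_05 <- atkinson_by_hand(y_toy, epsilon = 0.5)
a_1 <- atkinson_by_hand(y_toy, epsilon = 1)
```

On this toy vector the index rises as epsilon goes from 0.5 to 1, because a larger epsilon pulls the equivalent value further toward the smallest observations. The survey estimates for the same two values of epsilon follow.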
atk_05 <- workflow(
  list(svy),
  convey::svyatk(~api00, epsilon = 0.5),
  estimation_type = "annual"
)
atk_1 <- workflow(
  list(svy),
  convey::svyatk(~api00, epsilon = 1),
  estimation_type = "annual"
)
rbind(atk_05, atk_1)
#> stat variable value se cv
#> <char> <char> <num> <num> <num>
#> 1: convey::svyatk: atkinson atkinson 0.008841101 0.0007781485 0.08801488
#> 2: convey::svyatk: atkinson atkinson 0.017852866 0.0015768947 0.08832726
#> confint_lower confint_upper evaluate
#> <num> <num> <char>
#> 1: 0.007315958 0.01036624 Very good
#> 2: 0.014762210 0.02094352 Very good
Quintile share ratio (QSR)
The QSR is the ratio of the total income received by the top 20% to the total received by the bottom 20% (here the variable is api00 rather than income):
qsr <- workflow(
  list(svy),
  convey::svyqsr(~api00, na.rm = TRUE),
  estimation_type = "annual"
)
qsr
#> stat variable value se cv confint_lower
#> <char> <char> <num> <num> <num> <num>
#> 1: convey::svyqsr: qsr qsr 1.565964 0.0355218 0.02268367 1.496342
#> confint_upper evaluate
#> <num> <char>
#> 1: 1.635585 Excellent
Generalized entropy index
The GEI family includes the Theil index (epsilon = 1) and
the mean log deviation (epsilon = 0).
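Both special cases have simple closed forms, sketched here in base R under the usual textbook definitions. The helpers `theil_by_hand()` and `mld_by_hand()` and the toy data are hypothetical. They are not convey functions and give point values only.

```r
# Hypothetical helpers for illustration; both assume strictly positive y.
theil_by_hand <- function(y, w = rep(1, length(y))) {
  r <- y / weighted.mean(y, w)   # each value relative to the weighted mean
  weighted.mean(r * log(r), w)   # Theil index
}

mld_by_hand <- function(y, w = rep(1, length(y))) {
  r <- y / weighted.mean(y, w)
  -weighted.mean(log(r), w)      # mean log deviation
}

y_toy <- c(1, 2, 4, 8)
w_toy <- c(3, 2, 2, 1)
theil <- theil_by_hand(y_toy, w_toy)
mld <- mld_by_hand(y_toy, w_toy)

# Atkinson index with epsilon = 1: one minus geometric mean over arithmetic mean
atk_1 <- 1 - exp(weighted.mean(log(y_toy), w_toy)) / weighted.mean(y_toy, w_toy)
```

The mean log deviation and the Atkinson index with epsilon = 1 are tied by the identity A = 1 - exp(-MLD), which the toy values satisfy. The estimates printed in this vignette agree with it as well: 1 - exp(-0.01801415) is approximately 0.017853, the Atkinson value reported for epsilon = 1. The survey estimates for both members of the family follow.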
theil <- workflow(
  list(svy),
  convey::svygei(~api00, epsilon = 1),
  estimation_type = "annual"
)
mld <- workflow(
  list(svy),
  convey::svygei(~api00, epsilon = 0),
  estimation_type = "annual"
)
rbind(theil, mld)
#> stat variable value se cv confint_lower
#> <char> <char> <num> <num> <num> <num>
#> 1: convey::svygei: gei gei 0.01749577 0.001533703 0.08766137 0.01448977
#> 2: convey::svygei: gei gei 0.01801415 0.001605559 0.08912763 0.01486731
#> confint_upper evaluate
#> <num> <char>
#> 1: 0.02050177 Very good
#> 2: 0.02116099 Very good
Poverty Measures
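For poverty measures we use meals (percent of students
receiving subsidized meals) as an income-like variable. The
at-risk-of-poverty measures derive their threshold from the median, while the FGT indices use an absolute poverty threshold that we set at 50%.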
At-risk-of-poverty threshold
svyarpt() computes the at-risk-of-poverty threshold (60%
of the median by default):
arpt <- workflow(
  list(svy),
  convey::svyarpt(~meals, na.rm = TRUE),
  estimation_type = "annual"
)
arpt
#> stat variable value se cv confint_lower
#> <char> <char> <num> <num> <num> <num>
#> 1: convey::svyarpt: arpt arpt 27 2.051721 0.07598967 22.9787
#> confint_upper evaluate
#> <num> <char>
#> 1: 31.0213 Very good
At-risk-of-poverty rate
svyarpr() computes the proportion of units below the
ARPT:
arpr <- workflow(
  list(svy),
  convey::svyarpr(~meals, na.rm = TRUE),
  estimation_type = "annual"
)
arpr
#> stat variable value se cv confint_lower
#> <char> <char> <num> <num> <num> <num>
#> 1: convey::svyarpr: arpr arpr 0.2974169 0.02696583 0.09066677 0.2445648
#> confint_upper evaluate
#> <num> <char>
#> 1: 0.3502689 Very good
FGT poverty indices
The Foster-Greer-Thorbecke (FGT) family provides:
- FGT(0): headcount ratio (proportion below the line)
- FGT(1): poverty gap (average depth of poverty)
- FGT(2): severity (squared poverty gap, penalizes extreme poverty)
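All three indices come from one formula: the weighted mean of the normalized shortfall (z - y) / z raised to the power g, with the non-poor contributing zero. A minimal base-R sketch on toy data, assuming that textbook definition, is shown here. The helper `fgt_by_hand()` and the numbers are hypothetical. Whether units exactly at the line count as poor is a convention; this sketch uses a strict inequality and the toy data contain no ties.

```r
# Hypothetical helper for illustration; not part of convey.
fgt_by_hand <- function(y, w, z, g) {
  poor <- y < z                         # strictly below the line (an assumption)
  gap <- ifelse(poor, (z - y) / z, 0)   # normalized shortfall, zero for the non-poor
  sum(w * poor * gap^g) / sum(w)        # gap^0 is 1, so g = 0 gives the headcount
}

y_toy <- c(10, 30, 45, 60, 80)
w_toy <- c(2, 1, 1, 1, 1)
z_toy <- 50

fgt0_toy <- fgt_by_hand(y_toy, w_toy, z_toy, g = 0)   # 4/6: weighted share below 50
fgt1_toy <- fgt_by_hand(y_toy, w_toy, z_toy, g = 1)   # 2.1/6: mean normalized gap
fgt2_toy <- fgt_by_hand(y_toy, w_toy, z_toy, g = 2)   # 1.45/6: mean squared gap
```

For non-negative values the normalized gap lies between 0 and 1, so raising it to a higher power shrinks it and FGT(0) is at least FGT(1), which is at least FGT(2). The survey estimates with an absolute threshold of 50 follow.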
threshold <- 50
fgt0 <- workflow(
  list(svy),
  convey::svyfgt(~meals, g = 0, abs_thresh = threshold, na.rm = TRUE),
  estimation_type = "annual"
)
fgt1 <- workflow(
  list(svy),
  convey::svyfgt(~meals, g = 1, abs_thresh = threshold, na.rm = TRUE),
  estimation_type = "annual"
)
fgt2 <- workflow(
  list(svy),
  convey::svyfgt(~meals, g = 2, abs_thresh = threshold, na.rm = TRUE),
  estimation_type = "annual"
)
rbind(fgt0, fgt1, fgt2)
#> stat variable value se cv confint_lower
#> <char> <char> <num> <num> <num> <num>
#> 1: convey::svyfgt: fgt0 fgt0 0.5590055 0.03854638 0.06895528 0.4834560
#> 2: convey::svyfgt: fgt1 fgt1 0.2733427 0.02456407 0.08986547 0.2251980
#> 3: convey::svyfgt: fgt2 fgt2 0.1795022 0.02043659 0.11385149 0.1394472
#> confint_upper evaluate
#> <num> <char>
#> 1: 0.6345550 Very good
#> 2: 0.3214874 Very good
#> 3: 0.2195572 Good
Full Pipeline: Steps + Convey
A complete pipeline with data transformations followed by inequality estimation:
dt_full <- data.table(apistrat)
svy_full <- Survey$new(
  data = dt_full,
  edition = "2000",
  type = "api",
  psu = NULL,
  engine = "data.table",
  weight = add_weight(annual = "pw")
)
# Transform: compute a derived variable
svy_full <- step_compute(svy_full,
  api_growth = api00 - api99,
  comment = "API score growth"
)
# Bake the steps
svy_full <- bake_steps(svy_full)
# Prepare for convey
svy_full$ensure_design()
svy_full$design[["annual"]] <- convey_prep(svy_full$design[["annual"]])
# Inequality: Gini on derived variable, Atkinson on api00 (must be positive)
results <- workflow(
  list(svy_full),
  convey::svygini(~api_growth, na.rm = TRUE),
  convey::svyatk(~api00, epsilon = 1),
  estimation_type = "annual"
)
results
#> stat variable value se cv
#> <char> <char> <num> <num> <num>
#> 1: convey::svygini: gini gini 0.48220882 0.033233109 0.06891850
#> 2: convey::svyatk: atkinson atkinson 0.01785287 0.001576895 0.08832726
#> confint_lower confint_upper evaluate
#> <num> <num> <char>
#> 1: 0.41707312 0.54734451 Very good
#> 2: 0.01476221 0.02094352 Very good
Quality assessment
for (i in seq_len(nrow(results))) {
  cv_val <- results$cv[i] * 100
  cat(
    results$stat[i], ":",
    round(cv_val, 1), "% CV -",
    evaluate_cv(cv_val), "\n"
  )
}
#> convey::svygini: gini : 6.9 % CV - Very good
#> convey::svyatk: atkinson : 8.8 % CV - Very good
Publication table
workflow_table(
  results,
  title = "Inequality of API Score Growth",
  subtitle = "California Schools, 2000"
)
Inequality of API Score Growth
California Schools, 2000
| Statistic | variable | Estimate | SE | CI Lower | CI Upper | CV (%) | Quality |
|---|---|---|---|---|---|---|---|
| :svygini: gini | gini | 0.48 | 0.033 | 0.42 | 0.55 | 6.9 | Very good |
| :svyatk: atkinson | atkinson | 0.02 | 0.002 | 0.01 | 0.02 | 8.8 | Very good |
metasurvey 0.0.23 | CI: 95% | 2026-04-23
Provenance
Provenance is tracked automatically. The full lineage — steps applied, convey estimates computed, and package versions — is available:
prov <- provenance(results)
prov
#> ── Data Provenance ─────────────────────────────────────────────────────────────
#> Loaded: 2026-04-23T01:10:10
#> Initial rows: 200
#>
#> Pipeline:
#> 1. step_1 Compute: api_growth N=200 [1.0ms]
#>
#> Estimation:
#> Type: annual
#> Timestamp: 2026-04-23T01:10:10
#>
#> Environment:
#> metasurvey: 0.0.23
#> R: 4.5.3
#> survey: 4.5
cat("metasurvey version:", prov$environment$metasurvey_version, "\n")
#> metasurvey version: 0.0.23
cat("Steps applied:", length(prov$steps), "\n")
#> Steps applied: 1
References
- Jacob, G., Damico, A., & Pessoa, D. (2024). Poverty and Inequality with Complex Survey Data. https://www.convey-r.org/
- Lumley, T. (2010). Complex Surveys: A Guide to Analysis Using R. Wiley.