Overview
metasurvey provides a REST API built with plumber backed by MongoDB for
sharing recipes, workflows, and variable metadata with the community.
The API can be self-hosted (see vignette("self-hosting"))
and is used by both the R client functions (api_*) and the
Shiny exploration application.
After deploying, the Swagger UI interface at
<your-api-url>/__docs__/ provides an interactive
endpoint explorer automatically generated by plumber. For detailed
request/response schemas and MongoDB collection documentation, see the
sections below.
Configuration
library(metasurvey)
# Point to your self-hosted API
configure_api("https://your-api-host.example.com")
# Or use an environment variable
Sys.setenv(METASURVEY_API_URL = "https://your-api-host.example.com")
The R client reads the URL first from configure_api(), then falls back to the METASURVEY_API_URL environment variable.
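The lookup order described above can be sketched in base R. This is only an illustration: resolve_api_url() is a hypothetical helper, not a metasurvey function, and returning NULL when nothing is configured is an assumption of the sketch.

```r
# Hypothetical helper: mirrors the documented lookup order only.
resolve_api_url <- function(configured = NULL) {
  # 1. A URL set explicitly (as configure_api() would do) wins.
  if (!is.null(configured) && nzchar(configured)) {
    return(configured)
  }
  # 2. Otherwise fall back to the environment variable.
  from_env <- Sys.getenv("METASURVEY_API_URL", unset = "")
  if (nzchar(from_env)) {
    return(from_env)
  }
  # 3. Nothing configured: signal it with NULL (an assumption of this sketch).
  NULL
}

Sys.setenv(METASURVEY_API_URL = "https://env.example.com")
resolve_api_url("https://explicit.example.com")  # explicit value wins
resolve_api_url()                                # falls back to the env var
```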
Authentication
The API uses JWT (JSON Web Token) authentication with HMAC-SHA256 signing. Tokens expire after 24 hours; long-lived tokens (90 days) can be generated for automated scripts.
Registration
# Individual account (auto-approved)
api_register("Ana Garcia", "ana@example.com", "password123")
# Institutional member (requires admin review)
api_register(
"Carlos Rodriguez",
"carlos@ine.gub.uy",
"password123",
user_type = "institutional_member",
institution = "INE Uruguay"
)
Account types:
| Type | Description | Approval |
|---|---|---|
| individual | Independent researcher | Automatic |
| institutional_member | Member of a recognized institution | Requires admin review |
| institution | Institutional account | Requires admin review |
Login
api_login("ana@example.com", "password123")
The token is stored in the session and used automatically in subsequent API calls. The client automatically renews a token once it is within 5 minutes of expiring.
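The renewal rule can be expressed as a small time comparison. The sketch below is an assumption about how such a check might look, not the client's actual code; needs_renewal() is a hypothetical helper, and treating an already expired token as needing renewal is a choice made here.

```r
# Hypothetical helper: TRUE when a token is inside the renewal window.
needs_renewal <- function(expires_at, now = Sys.time(), window_secs = 5 * 60) {
  remaining <- as.numeric(difftime(expires_at, now, units = "secs"))
  remaining <= window_secs
}

issued_at  <- as.POSIXct("2025-01-01 00:00:00", tz = "UTC")
expires_at <- issued_at + 24 * 60 * 60   # 24-hour token lifetime

needs_renewal(expires_at, now = issued_at + 60 * 60)  # FALSE: 23 hours left
needs_renewal(expires_at, now = expires_at - 4 * 60)  # TRUE: 4 minutes left
```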
Session Management
# View current user profile
api_me()
# Refresh token
api_refresh_token()
# Logout
api_logout()
Long-lived Tokens
For automated scripts and CI/CD, generate a 90-day token from the Shiny application (Profile tab) and supply it through the METASURVEY_TOKEN environment variable:
Sys.setenv(METASURVEY_TOKEN = "your-long-lived-token")
# API calls work without interactive login
recipes <- api_list_recipes(survey_type = "ech")
API Endpoints
Recipes
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | /recipes | No | List and search recipes |
| GET | /recipes/:id | No | Get an individual recipe |
| POST | /recipes | Yes | Publish a new recipe |
| POST | /recipes/:id/download | No | Increment download counter |
List Recipes
# All recipes
all <- api_list_recipes()
# Filter by survey type
ech <- api_list_recipes(survey_type = "ech")
# Search by text
labor <- api_list_recipes(search = "empleo")
# Filter by topic
income <- api_list_recipes(topic = "income")
# Filter by certification level
official <- api_list_recipes(certification = "official")
# Pagination
page2 <- api_list_recipes(limit = 10, offset = 10)
Query parameters:
| Parameter | Type | Description |
|---|---|---|
| search | string | Regex search on recipe name |
| survey_type | string | ech, eaii, eph, eai |
| topic | string | labor_market, income, education, health, demographics, housing |
| certification | string | community, reviewed, official |
| user | string | Filter by author email |
| limit | integer | Maximum results (default 50) |
| offset | integer | Skip N results (default 0) |
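To show how these parameters map onto a GET /recipes request, here is a minimal base-R sketch that assembles the query string. build_recipes_url() is a hypothetical helper introduced for illustration; the R client may build its requests differently.

```r
# Hypothetical helper: builds a /recipes URL from the documented parameters.
build_recipes_url <- function(base_url, ...) {
  params <- list(...)
  # Drop parameters that were not supplied.
  params <- params[!vapply(params, is.null, logical(1))]
  if (length(params) == 0) {
    return(paste0(base_url, "/recipes"))
  }
  # Percent-encode each value; reserved = TRUE also encodes "&" and "=".
  encoded <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  query <- paste(names(encoded), encoded, sep = "=", collapse = "&")
  paste0(base_url, "/recipes?", query)
}

build_recipes_url(
  "https://your-api-host.example.com",
  survey_type = "ech", topic = "income", limit = 10, offset = 10
)
```

Encoding the values matters mostly for search, which is treated as a regex and may contain reserved characters.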
Get Recipe
recipe <- api_get_recipe("ech_employment_001")
Publish Recipe
api_login("ana@example.com", "password123")
api_publish_recipe(my_recipe)
The server automatically sets the user field from the JWT, initializes downloads = 0, generates an id if not provided, and assigns the community certification by default.
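The server-side defaults can be pictured as a transformation of the submitted recipe. The sketch below is illustrative only: apply_publish_defaults() is a hypothetical function, the generated id format is invented, and representing the certification as list(level = "community") is an assumption.

```r
# Hypothetical sketch of the documented defaults; not the server's code.
apply_publish_defaults <- function(recipe, jwt_email) {
  # The author comes from the JWT, overriding anything sent by the client.
  recipe$user <- jwt_email
  recipe$downloads <- 0
  # Generate an id only when the client did not send one.
  # The id format used here is invented for illustration.
  if (is.null(recipe$id) || !nzchar(recipe$id)) {
    recipe$id <- paste0("recipe_", format(Sys.time(), "%Y%m%d%H%M%S"))
  }
  # Default certification level when none is present.
  if (is.null(recipe$certification)) {
    recipe$certification <- list(level = "community")
  }
  recipe
}

submitted <- list(name = "Employment rate", survey_type = "ech",
                  user = "spoofed@example.com")
stored <- apply_publish_defaults(submitted, jwt_email = "ana@example.com")
stored$user       # "ana@example.com"
stored$downloads  # 0
```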
Workflows
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | /workflows | No | List and search workflows |
| GET | /workflows/:id | No | Get an individual workflow |
| POST | /workflows | Yes | Publish a new workflow |
| POST | /workflows/:id/download | No | Increment download counter |
# List workflows for ECH
wf <- api_list_workflows(survey_type = "ech")
# Find workflows that use a specific recipe
wf <- api_list_workflows(recipe_id = "ech_employment_001")
# Get specific workflow
w <- api_get_workflow("wf_labor_market_001")
# Publish
api_publish_workflow(my_workflow)
ANDA Variable Metadata
Note: The ANDA integration is an unofficial implementation that parses DDI XML metadata from INE Uruguay’s public ANDA catalog. It is not endorsed by INE and may contain errors or become outdated if INE changes the catalog structure. Always verify critical variable definitions against the official codebook.
The /anda/variables endpoint provides variable metadata
obtained from INE Uruguay’s ANDA catalog (DDI XML format). This includes
variable labels, value categories, and type information.
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | /anda/variables | No | Get variable metadata |
# Get all ECH variables
vars <- api_get_anda_variables(survey_type = "ech")
# Get specific variables
vars <- api_get_anda_variables(
survey_type = "ech",
var_names = c("pobpcoac", "e27", "ht11")
)
Query parameters:
| Parameter | Type | Description |
|---|---|---|
| survey_type | string | Survey type (default "ech") |
| names | string | Comma-separated variable names (all if empty) |
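The names parameter is a single comma-separated string, whereas R code naturally holds variable names in a character vector. A minimal sketch of the conversion follows; to_names_param() is a hypothetical helper, and lowercasing the names is an assumption based on the stored names being lowercase.

```r
# Hypothetical helper: turns an R vector into the `names` query value.
to_names_param <- function(var_names = NULL) {
  if (is.null(var_names) || length(var_names) == 0) {
    return("")  # an empty value requests all variables
  }
  paste(tolower(trimws(var_names)), collapse = ",")
}

to_names_param(c("pobpcoac", "e27", "ht11"))  # "pobpcoac,e27,ht11"
to_names_param()                              # ""
```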
Each variable document contains:
| Field | Description |
|---|---|
| name | Variable name (lowercase) |
| label | Human-readable label |
| type | discrete, continuous, or unknown |
| value_labels | List of code-label mappings |
| description | Extended description |
| source_edition | Survey edition (e.g., "2024") |
| source_catalog_id | ANDA catalog ID (e.g., 767) |
Administration
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | /admin/pending-users | Admin | List institutional accounts pending review |
| POST | /admin/approve/:email | Admin | Approve an institutional account |
| POST | /admin/reject/:email | Admin | Reject an institutional account |
Admin access is controlled via the
METASURVEY_ADMIN_EMAIL environment variable on the
server.
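An environment-variable admin check could look like the following sketch. It is an assumption about the mechanism, not the server's implementation: is_admin() is a hypothetical helper, and the case-insensitive comparison is a choice made here.

```r
# Hypothetical helper: compares an authenticated email with the admin email.
is_admin <- function(email) {
  admin_email <- Sys.getenv("METASURVEY_ADMIN_EMAIL", unset = "")
  # With no admin configured, nobody is treated as an admin.
  nzchar(admin_email) && identical(tolower(email), tolower(admin_email))
}

Sys.setenv(METASURVEY_ADMIN_EMAIL = "admin@example.com")
is_admin("admin@example.com")  # TRUE
is_admin("ana@example.com")    # FALSE
```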
MongoDB Schema
The database has four collections, each with JSON Schema validation and optimized indexes.
Entity-Relationship Diagram
The following diagram shows the MongoDB collections and their relationships:
┌──────────────────┐       ┌──────────────────────┐
│      users       │       │       recipes        │
├──────────────────┤       ├──────────────────────┤
│ email (PK)       │──┐    │ id (PK)              │
│ name             │  │    │ name                 │
│ password_hash    │  ├───>│ user (FK)            │
│ user_type        │  │    │ survey_type          │
│ institution      │  │    │ edition              │
└──────────────────┘  │    │ steps[]              │
                      │    │ certification{}      │
                      │    │ categories[]         │
                      │    └──────────┬───────────┘
                      │               │
                      │    ┌──────────┴───────────┐
                      │    │      workflows       │
                      │    ├──────────────────────┤
                      │    │ id (PK)              │
                      └───>│ user (FK)            │
                           │ survey_type          │
                           │ recipe_ids[] (FK)    │
                           │ calls[]              │
                           └──────────────────────┘
┌──────────────────────┐
│    anda_variables    │
├──────────────────────┤
│ survey_type (PK)     │
│ name (PK)            │
│ label                │
│ type                 │
│ value_labels{}       │
└──────────────────────┘
Relationships:
users ──1:N──> recipes (publishes)
users ──1:N──> workflows (publishes)
recipes ──1:N──> workflows (referenced by)
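MongoDB does not enforce foreign keys, so the references in the diagram are conventions that application code has to check. The sketch below shows one way to find workflow recipe_ids with no matching recipe, using plain R lists as stand-ins for documents. The helper dangling_recipe_ids() and the ids "ech_income_001" and "ech_missing_999" are invented for illustration.

```r
# Stand-in documents; only the fields needed for the check are included.
recipes <- list(
  list(id = "ech_employment_001", user = "ana@example.com"),
  list(id = "ech_income_001", user = "ana@example.com")
)
workflow <- list(
  id = "wf_labor_market_001",
  user = "ana@example.com",
  recipe_ids = c("ech_employment_001", "ech_missing_999")
)

# Return the recipe_ids that have no matching recipe document.
dangling_recipe_ids <- function(workflow, recipes) {
  known <- vapply(recipes, function(r) r$id, character(1))
  setdiff(workflow$recipe_ids, known)
}

dangling_recipe_ids(workflow, recipes)  # "ech_missing_999"
```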
Collections
users
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Display name |
| email | string | Yes | Email (unique, validated) |
| password_hash | string | Yes | SHA-256 hash (64 characters) |
| user_type | enum | Yes | individual, institutional_member, institution |
| institution | string | No | Institution name |
| verified | boolean | No | Whether identity is verified |
| review_status | enum | No | approved, pending, rejected |
| reviewed_by | string | No | Reviewing admin’s email |
| reviewed_at | string | No | ISO timestamp |
| created_at | string | Yes | ISO timestamp |
Indexes: unique on email.
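A SHA-256 digest written in hexadecimal is 64 characters long, which is what the password_hash length refers to. A quick format check could look like this sketch; is_sha256_hex() is a hypothetical helper, and the lowercase hex encoding is an assumption, since the schema only states the length.

```r
# Hypothetical helper: checks the shape of a hex-encoded SHA-256 digest.
is_sha256_hex <- function(x) {
  grepl("^[0-9a-f]{64}$", x)
}

is_sha256_hex(strrep("a", 64))  # TRUE: 64 lowercase hex characters
is_sha256_hex("password123")    # FALSE: not a digest
```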
recipes
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | No | Unique identifier (auto-generated) |
| name | string | Yes | Recipe name |
| user | string | Yes | Author email |
| survey_type | enum | Yes | ech, eaii, eph, eai |
| edition | string/array | No | Survey edition(s) |
| description | string | No | Description |
| topic | enum | No | labor_market, income, education, health, demographics, housing |
| version | string | No | Semantic version (default "1.0.0") |
| downloads | number | No | Download counter (default 0) |
| steps | array | No | Step expressions as strings |
| depends_on | array | No | Required input variable names |
| depends_on_recipes | array | No | IDs of dependent recipes |
| categories | array | No | Category objects |
| certification | object | No | {level, certified_at, certified_by, notes} |
| user_info | object | No | {name, user_type, email, url, verified} |
| doc | object | No | {input_variables, output_variables, pipeline} |
| data_source | object | No | {s3_bucket, s3_prefix, file_pattern, provider} |
Indexes: unique on id; on
user, survey_type, topic,
downloads (desc), certification.level;
compound on (survey_type, edition); text search on
(name, description, topic).
workflows
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | No | Unique identifier (auto-generated) |
| name | string | Yes | Workflow name |
| user | string | Yes | Author email |
| survey_type | enum | Yes | ech, eaii, eph, eai |
| edition | string/array | No | Survey edition(s) |
| description | string | No | Description |
| version | string | No | Semantic version |
| downloads | number | No | Download counter |
| estimation_type | string/array | No | annual, quarterly, monthly |
| recipe_ids | array | No | Referenced recipe IDs |
| calls | array | No | Estimation calls as strings |
| call_metadata | array | No | Call descriptions |
| categories | array | No | Category objects |
| certification | object | No | Same as recipes |
| user_info | object | No | Same as recipes |
Indexes: unique on id; on
user, survey_type, recipe_ids,
downloads (desc); compound on
(survey_type, edition); text search on
(name, description).
anda_variables
| Field | Type | Required | Description |
|---|---|---|---|
| survey_type | string | Yes | Survey type |
| name | string | Yes | Variable name (lowercase) |
| label | string | Yes | Human-readable label |
| type | enum | No | discrete, continuous, unknown |
| value_labels | object | No | Code-label mappings |
| description | string | No | Extended description |
| source_edition | string | No | Edition (e.g., "2024") |
| source_catalog_id | number | No | ANDA catalog ID |
Indexes: compound unique on
(survey_type, name); on survey_type.
Database Setup
To set up the database on a new deployment:
# 1. Create collections with JSON Schema validation and indexes
mongosh "$METASURVEY_MONGO_URI" inst/scripts/setup_mongodb.js
# 2. Seed recipes, workflows, and users
METASURVEY_MONGO_URI="..." Rscript inst/scripts/seed_ech_recipes.R
# 3. Seed ANDA variable metadata from INE catalog
METASURVEY_MONGO_URI="..." Rscript inst/scripts/seed_anda_metadata.R
The setup script creates the four collections and builds the indexes. It is idempotent: existing collections are skipped.
Server Deployment
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| METASURVEY_MONGO_URI | Yes | (none) | MongoDB connection string |
| METASURVEY_DB | No | metasurvey | Database name |
| METASURVEY_JWT_SECRET | No | metasurvey-dev-secret-... | JWT signing secret (override in production) |
| METASURVEY_ADMIN_EMAIL | No | (none) | Admin email for institutional review |
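A startup check along the following lines can catch a missing required variable before the API begins serving requests. This is a sketch under assumptions: load_server_config() is a hypothetical function rather than part of the shipped plumber code, and warning about an unset JWT secret is a choice made here.

```r
# Hypothetical helper: reads the documented variables and applies defaults.
load_server_config <- function() {
  mongo_uri <- Sys.getenv("METASURVEY_MONGO_URI", unset = "")
  if (!nzchar(mongo_uri)) {
    stop("METASURVEY_MONGO_URI is required but not set", call. = FALSE)
  }
  jwt_secret <- Sys.getenv("METASURVEY_JWT_SECRET", unset = "")
  if (!nzchar(jwt_secret)) {
    # The development default should never be relied on in production.
    warning("METASURVEY_JWT_SECRET is not set; the development default applies",
            call. = FALSE)
  }
  list(
    mongo_uri      = mongo_uri,
    db             = Sys.getenv("METASURVEY_DB", unset = "metasurvey"),
    jwt_secret_set = nzchar(jwt_secret),
    admin_email    = Sys.getenv("METASURVEY_ADMIN_EMAIL", unset = "")
  )
}
```

Failing fast on the connection string keeps a misconfigured deployment from starting and then erroring on the first request.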
Running Locally
METASURVEY_MONGO_URI="mongodb+srv://user:pass@cluster.mongodb.net" \
Rscript -e 'plumber::plumb("inst/api/plumber.R")$run(port = 8787)'
The Swagger UI interface will be available at http://localhost:8787/__docs__/.
CORS
The API allows cross-origin requests from any origin:
- Allowed methods: GET, POST, OPTIONS
- Allowed headers: Content-Type, Authorization
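In HTTP terms, the policy above corresponds to three standard CORS response headers. The sketch below lists them as a named character vector. cors_headers() is a hypothetical helper, and using "*" for the origin is an assumption about how "any origin" is expressed; a server could instead echo the request's Origin header.

```r
# Hypothetical helper: the response headers implied by the CORS policy.
cors_headers <- function() {
  c(
    "Access-Control-Allow-Origin"  = "*",
    "Access-Control-Allow-Methods" = "GET, POST, OPTIONS",
    "Access-Control-Allow-Headers" = "Content-Type, Authorization"
  )
}

cors_headers()[["Access-Control-Allow-Methods"]]  # "GET, POST, OPTIONS"
```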
Next Steps
- Interactive recipe explorer – Browse recipes and workflows through the Shiny web application
- Creating and publishing recipes – Build recipes programmatically and publish them to the API
- Estimation workflows – Compute weighted survey estimates with workflow()