Performance Tuning¶
This guide covers tuning parameters for both DDI Codebook and DDI-L FragmentInstance ingestion.
Quick Reference¶
| Parameter | Default | DDI Codebook | DDI-L FragmentInstance |
|---|---|---|---|
| chunk_size | 200 | Records per batch | Fragments per batch |
| queue_maxsize | 2 | Batches buffered | N/A (sync parsing) |
| writer_concurrency | 4 | Parallel writers | N/A (sequential batches) |
| write_retry_attempts | 3 | Retry count | Retry count |
Batch Size (DDIGRAPH_CHUNK_SIZE)¶
Controls how many records/fragments are grouped before writing to Neo4j.
DDI Codebook¶
Larger batches reduce transaction overhead but increase per-transaction latency. The threshold counts all parsed record types together (variables, questions, categories, etc.):
# Larger batches for high-throughput ingestion
export DDIGRAPH_CHUNK_SIZE=1000
# Smaller batches for memory-constrained environments
export DDIGRAPH_CHUNK_SIZE=100
For codebooks with mixed metadata and variables, a chunk size of 500-1000 combined records keeps writes efficient.
DDI-L FragmentInstance¶
Fragments are batched by element type before UNWIND writes:
# For large FragmentInstance files (>5000 fragments)
ddigraph load questionnaire.xml --chunk-size 500
# For smaller files or debugging
ddigraph load questionnaire.xml --chunk-size 50
Recommendation: Start with 200-300 for FragmentInstance files and adjust based on memory and write latency.
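To make the batching idea concrete, here is a minimal Python sketch of grouping fragments by element type and flushing each group once it reaches chunk_size. It is not the loader's actual code: the `batch_by_type` helper, the fragment dictionary shape, and the assumption that chunk_size applies per element type are all illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical fragment shape for illustration: {"type": ..., "urn": ...}
Fragment = Dict[str, str]


def batch_by_type(
    fragments: Iterable[Fragment], chunk_size: int
) -> Iterator[Tuple[str, List[Fragment]]]:
    """Yield (element_type, batch) pairs holding at most chunk_size fragments."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    pending: Dict[str, List[Fragment]] = defaultdict(list)
    for fragment in fragments:
        element_type = fragment["type"]
        pending[element_type].append(fragment)
        if len(pending[element_type]) >= chunk_size:
            # A full batch would become one UNWIND write for this type.
            yield element_type, pending[element_type]
            pending[element_type] = []
    # Flush whatever is left in each partially filled bucket.
    for element_type, bucket in pending.items():
        if bucket:
            yield element_type, bucket


if __name__ == "__main__":
    sample = [{"type": "Variable", "urn": f"v{i}"} for i in range(5)]
    sample.append({"type": "Question", "urn": "q0"})
    for element_type, batch in batch_by_type(sample, chunk_size=2):
        print(element_type, len(batch))
```

Under these assumptions, memory held by the batcher is bounded by chunk_size per element type, which is why lowering the value helps in constrained environments.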
Queue Size (DDIGRAPH_QUEUE_MAXSIZE)¶
Applies to DDI Codebook only.
Controls back-pressure between parsing and writing by capping queued DDIBatch objects:
# Keep writer busy on fast systems
export DDIGRAPH_QUEUE_MAXSIZE=4
# Reduce memory usage
export DDIGRAPH_QUEUE_MAXSIZE=1
The FragmentInstance loader uses synchronous batch parsing, so queue size doesn't apply.
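The back-pressure described in this section can be pictured with a bounded asyncio.Queue. The sketch below is an assumption about the general pattern, not the loader's implementation: `run_pipeline`, the integer batches, and the `asyncio.sleep(0)` stand-in for a Neo4j write are hypothetical.

```python
import asyncio
from typing import List, Tuple


async def run_pipeline(
    batches: List[List[int]], queue_maxsize: int
) -> Tuple[int, int]:
    """Push batches through a bounded queue; return (records_written, peak_depth)."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=queue_maxsize)
    peak_depth = 0

    async def producer() -> None:
        nonlocal peak_depth
        for batch in batches:
            # put() suspends the parser side while the queue is full.
            await queue.put(batch)
            peak_depth = max(peak_depth, queue.qsize())
        await queue.put(None)  # sentinel: no more batches

    async def consumer() -> int:
        written = 0
        while True:
            batch = await queue.get()
            if batch is None:
                return written
            await asyncio.sleep(0)  # stand-in for a Neo4j write
            written += len(batch)

    _, written = await asyncio.gather(producer(), consumer())
    return written, peak_depth


if __name__ == "__main__":
    print(asyncio.run(run_pipeline([[1, 2], [3], [4, 5, 6]] * 5, queue_maxsize=2)))
```

With a positive maxsize the number of parsed but unwritten batches never exceeds that bound, so a smaller value trades writer idle time for lower memory use.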
Writer Concurrency (DDIGRAPH_WRITER_CONCURRENCY)¶
Applies to DDI Codebook only.
Sets how many batches flush to Neo4j in parallel:
# Concurrent writers for fast Neo4j clusters (4 is the default)
export DDIGRAPH_WRITER_CONCURRENCY=4
# Single writer for debugging or constrained connections
export DDIGRAPH_WRITER_CONCURRENCY=1
Ensure max_connection_pool_size >= writer_concurrency to avoid writer starvation.
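A pre-flight check for this rule might look like the following Python sketch. The `writers_fit_pool` helper is hypothetical and not part of ddigraph; it reads the two environment variables used in this guide and falls back to the writer_concurrency default of 4 from the Quick Reference table. The default pool size is not stated here, so the check returns None when that variable is unset.

```python
import os
from typing import Mapping, Optional


def writers_fit_pool(env: Mapping[str, str] = os.environ) -> Optional[bool]:
    """Return True/False for the pool check, or None when the pool size is unset."""
    # 4 is the writer_concurrency default listed in the Quick Reference table.
    writers = int(env.get("DDIGRAPH_WRITER_CONCURRENCY", "4"))
    pool_raw = env.get("DDIGRAPH_MAX_CONNECTION_POOL_SIZE")
    if pool_raw is None:
        return None  # the default pool size is not documented in this guide
    return int(pool_raw) >= writers


if __name__ == "__main__":
    print(writers_fit_pool())
```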
Driver Pooling¶
Connection Pool Size¶
# Match or exceed writer concurrency
export DDIGRAPH_MAX_CONNECTION_POOL_SIZE=10
export DDIGRAPH_WRITER_CONCURRENCY=4
Timeouts¶
| Setting | Purpose | Recommendation |
|---|---|---|
| connection_timeout | Time to establish connection | 5-30s |
| max_connection_lifetime | Recycle idle connections | 300-3600s |
| session_timeout | Session lifetime | Match transaction needs |
| transaction_timeout | Server-side transaction limit | 30-120s |
# Fast failure on connection issues
export DDIGRAPH_CONNECTION_TIMEOUT=5
# Long-running transactions for large batches
export DDIGRAPH_TRANSACTION_TIMEOUT=60
Retry Configuration¶
Both loaders support exponential backoff with jitter for transient failures:
# Aggressive retries for unstable networks
export DDIGRAPH_WRITE_RETRY_ATTEMPTS=5
export DDIGRAPH_WRITE_RETRY_BASE_DELAY=1.0
export DDIGRAPH_WRITE_RETRY_JITTER=0.5
# Fast failure for stable environments
export DDIGRAPH_WRITE_RETRY_ATTEMPTS=2
export DDIGRAPH_WRITE_RETRY_BASE_DELAY=0.1
export DDIGRAPH_WRITE_RETRY_JITTER=0
CLI equivalent:
ddigraph load file.xml --dataset-id demo \
--write-retry-attempts 5 \
--write-retry-base-delay 1.0 \
--write-retry-jitter 0.5
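To see how the three retry settings interact, the Python sketch below computes a delay schedule. The exact formula the loaders use is not given in this guide, so this assumes a common form: the delay doubles on each retry starting from the base delay, plus a uniform random jitter between 0 and the jitter value. It also assumes the attempts setting counts retries. The `backoff_delays` helper is hypothetical.

```python
import random
from typing import List, Optional


def backoff_delays(
    attempts: int,
    base_delay: float,
    jitter: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return the sleep (in seconds) before each retry.

    Assumed formula: base_delay * 2**i + uniform(0, jitter) for retry i.
    """
    rng = rng or random.Random()
    return [
        base_delay * (2 ** i) + rng.uniform(0.0, jitter)
        for i in range(attempts)
    ]


if __name__ == "__main__":
    # The "aggressive" settings shown above: 5 attempts, 1.0s base, 0.5s jitter.
    for i, delay in enumerate(backoff_delays(5, 1.0, 0.5), start=1):
        print(f"retry {i}: sleep {delay:.2f}s")
```

Under this assumed formula, 5 attempts with a 1.0s base wait at least 1 + 2 + 4 + 8 + 16 = 31 seconds in total before giving up, which is worth weighing against transaction_timeout.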
Dry Run and Replace¶
Validation Mode¶
Parse and plan without writing:
# Validate before production ingestion
ddigraph load file.xml --dataset-id demo --dry-run
# Or via environment
DDIGRAPH_DRY_RUN=true ddigraph load file.xml --dataset-id demo
Replace Mode¶
Purge existing data before loading:
# Re-ingest a dataset from scratch
ddigraph load file.xml --dataset-id demo --replace
# Replace is skipped during dry-run
ddigraph load file.xml --dataset-id demo --dry-run --replace # No purge occurs
Format-Specific Tuning¶
DDI Codebook¶
Optimize for the producer/consumer pipeline:
# High-throughput configuration
export DDIGRAPH_CHUNK_SIZE=1000
export DDIGRAPH_QUEUE_MAXSIZE=4
export DDIGRAPH_WRITER_CONCURRENCY=4
export DDIGRAPH_MAX_CONNECTION_POOL_SIZE=10
DDI-L FragmentInstance¶
Optimize for batched UNWIND writes:
# Large FragmentInstance files
export DDIGRAPH_CHUNK_SIZE=500
export DDIGRAPH_TRANSACTION_TIMEOUT=60
# Memory-constrained environments
export DDIGRAPH_CHUNK_SIZE=100
Monitoring¶
Enable batch metrics for visibility:
ddigraph load file.xml --dataset-id demo --batch-metrics
This emits timing and count metrics that can be captured by observability systems.
Useful Metrics¶
| Metric | Description |
|---|---|
| batch_duration_seconds | Time per batch write |
| batch_size | Records/fragments per batch |
| batches | Total batches processed |
| batch_write_retries | Retry count |
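Once captured, the per-batch metrics can be reduced to a throughput figure for comparing tuning runs. The Python sketch below assumes each batch has already been collected as a dictionary keyed by the metric names in the table; the emission format of --batch-metrics is not described here, and the `summarize` helper is hypothetical.

```python
from typing import Dict, Iterable


def summarize(batches: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Aggregate per-batch metrics into totals and records per second."""
    count = 0
    records = 0.0
    seconds = 0.0
    for batch in batches:
        count += 1
        records += batch["batch_size"]
        seconds += batch["batch_duration_seconds"]
    return {
        "batches": count,
        "records": records,
        "seconds": seconds,
        # Guard against an empty or zero-duration run.
        "records_per_second": records / seconds if seconds else 0.0,
    }


if __name__ == "__main__":
    sample = [
        {"batch_size": 200, "batch_duration_seconds": 0.5},
        {"batch_size": 100, "batch_duration_seconds": 0.25},
    ]
    print(summarize(sample))
```

Note that summed batch durations overstate wall-clock time when several writers run in parallel, so the figure is best compared between runs that use the same writer_concurrency.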
Worked Examples¶
Large DDI Codebook (10K+ variables)¶
export DDIGRAPH_NEO4J_URI=bolt://cluster:7687
export DDIGRAPH_MAX_CONNECTION_POOL_SIZE=20
export DDIGRAPH_CHUNK_SIZE=1000
export DDIGRAPH_QUEUE_MAXSIZE=4
export DDIGRAPH_WRITER_CONCURRENCY=4
export DDIGRAPH_TRANSACTION_TIMEOUT=60
ddigraph bootstrap
ddigraph load large_codebook.xml --dataset-id survey2024 --batch-metrics
Large DDI-L FragmentInstance (5K+ fragments)¶
export DDIGRAPH_NEO4J_URI=bolt://cluster:7687
export DDIGRAPH_CHUNK_SIZE=500
export DDIGRAPH_TRANSACTION_TIMEOUT=60
export DDIGRAPH_WRITE_RETRY_ATTEMPTS=5
ddigraph bootstrap
ddigraph load large_questionnaire.xml --batch-metrics --json
AuraDB Cloud Instance¶
export DDIGRAPH_NEO4J_URI=neo4j+s://xxxx.databases.neo4j.io
export DDIGRAPH_ENCRYPTED=true
export DDIGRAPH_CONNECTION_TIMEOUT=10
export DDIGRAPH_CHUNK_SIZE=200 # Smaller batches for cloud latency
export DDIGRAPH_WRITE_RETRY_ATTEMPTS=5
export DDIGRAPH_WRITE_RETRY_BASE_DELAY=2.0
ddigraph bootstrap
ddigraph load file.xml --dataset-id demo
Memory-Constrained Environment¶
export DDIGRAPH_CHUNK_SIZE=50
export DDIGRAPH_QUEUE_MAXSIZE=1
export DDIGRAPH_WRITER_CONCURRENCY=1
ddigraph load file.xml --dataset-id demo
Performance Comparison¶
DDI-L FragmentInstance Performance¶
For Ireland_LabourSurvey.xml (148K lines, 2,762 fragments):
| Metric | Value |
|---|---|
| Neo4j queries | ~30 |
| Memory pattern | O(chunk_size) |
| Async operations | All writes |
| Queries per fragment | ~0.01 |
The low query count comes from UNWIND batching by fragment type.
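The figures in the table can be checked with a little arithmetic. The Python sketch below reproduces the queries-per-fragment ratio and shows how a per-type query count could be estimated. The `estimate_queries` helper and the assumption of one write per type per chunk are illustrative; the real count may also include relationship or bookkeeping queries.

```python
import math
from typing import Mapping


def estimate_queries(fragments_per_type: Mapping[str, int], chunk_size: int) -> int:
    """Estimate UNWIND writes, assuming one query per element type per chunk."""
    return sum(math.ceil(n / chunk_size) for n in fragments_per_type.values())


# Figures reported above for Ireland_LabourSurvey.xml.
QUERIES = 30
FRAGMENTS = 2762

# Roughly 0.011 queries per fragment, matching the ~0.01 in the table.
ratio = QUERIES / FRAGMENTS

# Fewest possible batches if every fragment shared one type at the default chunk_size.
lower_bound = math.ceil(FRAGMENTS / 200)

if __name__ == "__main__":
    print(f"queries per fragment: {ratio:.4f}")
    print(f"single-type lower bound: {lower_bound} batches")
```

The gap between the single-type lower bound and the observed ~30 queries is consistent with fragments being spread across several element types, each flushed in its own batches.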
Troubleshooting¶
Slow Ingestion¶
- Increase chunk_size to reduce transaction overhead
- Increase writer_concurrency (Codebook) if pool has capacity
- Check Neo4j server resources and indexes
Memory Issues¶
- Decrease chunk_size to reduce in-flight data
- Decrease queue_maxsize (Codebook) to limit buffering
- Ensure XML parsing uses streaming (iterparse)
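For reference, streaming with iterparse generally looks like the Python sketch below, which uses the standard library's xml.etree.ElementTree. The `iter_elements` helper and the tag names are hypothetical; real DDI files use namespaced tags, which ElementTree reports in the form {namespace}name.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Dict, Iterator


def iter_elements(source: BinaryIO, tag: str) -> Iterator[Dict[str, str]]:
    """Stream attributes of matching elements without building the whole tree."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            # Copy what is needed before the element is emptied.
            yield dict(elem.attrib)
            # Drop the element's children, text and attributes once consumed.
            elem.clear()


if __name__ == "__main__":
    xml = b"<codeBook><var name='a'/><var name='b'/><other/></codeBook>"
    for attrs in iter_elements(io.BytesIO(xml), "var"):
        print(attrs)
```

Clearing each element after use keeps memory roughly proportional to the element being processed instead of the whole document.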
Connection Errors¶
- Increase connection_timeout for slow networks
- Increase write_retry_attempts for transient failures
- Verify TLS settings for cloud instances
Transaction Timeouts¶
- Decrease chunk_size to reduce per-transaction work
- Increase transaction_timeout for large batches
- Check Neo4j server configuration
See Architecture for design context and CLI Reference for all options.