Synthetic CDISC SDTM data engine

A whole conformant study, generated from the standard.

Verdatic turns CDISC's published standard — or your own study artifacts — into realistic, coherent synthetic SDTM data with ready-to-run derivation code. Point it at a schedule and a whole Phase 2/3 study comes out: every structural domain, conformant and coherent. Sponsor data optional.

No real patient data required.
100% CT-valid on generation
100% CORE-conformant on SDTM long
10,051 SDTM records · 7 domains · one call
5 SDTM structural classes proven
The integrated loop

Not one tool. The whole chain — and it compounds.

Most tools do a single slice: a data generator, a mapping helper, or a CDISC library. Verdatic chains them into one loop — and learns from every coverage gap, so it maps more next time.

01

Ingest

A Rave ALS build or USDM protocol — or nothing but the CDISC standard.

02

Auto-map

Fields link to Biomedical Concepts & specializations, confidence-tiered.

03

Simulate

Coherent data that obeys the study's own rules across fields, visits & subjects.

04

Generate code

Runnable SDTM derivation in SAS, SQL or Python.

05

Learn

Unmatched fields become authoring targets — coverage grows with use.

↻  Every gap filled is a permanent, compounding upgrade to auto-mapping coverage.
Capabilities

Realism and conformance, by construction.

The constraints come from real edit checks and the published standard, not hardcoded clinical assumptions. That's what makes the data defensible.

Multi-axial coherence

Generated values stay consistent across fields, visits and subjects — because constraints are mined from the study's actual quality rules. Episodic events keep their identity across visits; con-meds couple to the subject's own history.

Patent #1
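
As a rough illustration of cross-domain coupling (not Verdatic's implementation), the Python sketch below draws concomitant medications from a subject's own medical history. The `TREATMENTS` lookup, the `draw_conmeds` helper and the `p_coupled` probability are hypothetical names and values chosen for the example.

```python
import random

# Hypothetical condition -> treatment lookup, for illustration only.
TREATMENTS = {
    "Hypertension": "LISINOPRIL",
    "Type 2 diabetes": "METFORMIN",
    "Asthma": "SALBUTAMOL",
}

def draw_conmeds(medical_history, rng, p_coupled=0.7):
    """Return CM records whose indication appears in the subject's own MH."""
    meds = []
    for condition in medical_history:
        # Only conditions with a known treatment can produce a coupled con-med.
        if condition in TREATMENTS and rng.random() < p_coupled:
            meds.append({"CMTRT": TREATMENTS[condition], "CMINDC": condition})
    return meds

subject_mh = ["Hypertension", "Asthma", "Migraine"]
print(draw_conmeds(subject_mh, random.Random(0)))
```

Because every generated record carries an indication taken from the same subject's history, the CM and MH domains cannot contradict each other for that subject.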

Closed-loop optimizer

A violation-feedback loop drives a synthetic dataset until it passes the study's conformance rules, taking the violation rate from 185% to zero with no manual parameter tuning. Convergence is reported as a headline trust metric.

Patent #2
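
To make the idea of a violation-feedback loop concrete, here is a minimal Python sketch under simplifying assumptions. It is not Verdatic's optimizer: the single rule (systolic above diastolic), the `gap` parameter, and the update step are all invented for the example.

```python
import random

def generate(n, gap, rng):
    """Draw (systolic, diastolic) pairs; `gap` is the mean separation."""
    rows = []
    for _ in range(n):
        dia = rng.gauss(80, 8)
        sys_ = dia + rng.gauss(gap, 10)
        rows.append((sys_, dia))
    return rows

def violation_rate(rows):
    """Fraction of rows breaking the rule systolic > diastolic."""
    return sum(1 for s, d in rows if s <= d) / len(rows)

def optimize(n=500, gap=0.0, step=5.0, max_iter=50, seed=7):
    """Regenerate, measure violations, and adjust `gap` until none remain."""
    history = []
    for _ in range(max_iter):
        # Same seed each round, so only the parameter change affects the result.
        rows = generate(n, gap, random.Random(seed))
        rate = violation_rate(rows)
        history.append(rate)
        if rate == 0.0:
            break
        gap += step * (1 + rate)  # a higher violation rate gives a larger correction
    return gap, history

final_gap, history = optimize()
print(f"converged after {len(history)} rounds, gap={final_gap:.1f}")
```

The recorded `history` is what turns convergence into something reportable: it shows the violation rate falling round by round until it reaches zero.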

Standards-driven generation

No sponsor study, no real data. Point Verdatic at the CDISC catalog and a schedule; one call yields DM (special purpose), VS/LB/EG (findings), MH/AE (events), CM (interventions) and TV/TA (trial design): every SDTM structural class, coherent and conformant.

Sponsor removed

Auto-mapping to CDISC

A confidence-tiered matcher links source fields to the right Biomedical Concept and specialization, taking value-level metadata and specimen into account. High-confidence matches auto-confirm; the rest take one click.

Confidence-scored
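
A confidence-tiered matcher could look roughly like the Python sketch below. It uses plain string similarity from the standard library as a stand-in for whatever scoring Verdatic actually applies; the concept list, the thresholds (0.85 and 0.5) and the tier names are assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical target concepts, for illustration only.
CONCEPTS = [
    "Systolic Blood Pressure",
    "Diastolic Blood Pressure",
    "Heart Rate",
    "Body Temperature",
]

def best_match(field_label):
    """Return (score, concept) for the closest concept by string similarity."""
    scored = [
        (SequenceMatcher(None, field_label.lower(), c.lower()).ratio(), c)
        for c in CONCEPTS
    ]
    return max(scored)

def tier(score, auto=0.85, review=0.5):
    """Bucket a similarity score into a confidence tier."""
    if score >= auto:
        return "auto-confirm"
    if score >= review:
        return "one-click review"
    return "unmatched"

for label in ["Systolic blood pressure", "Sys BP", "qqq"]:
    score, concept = best_match(label)
    print(label, "->", concept, tier(score))
```

Fields that land in the lowest tier are the ones that would feed the authoring backlog.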

Runnable transform code

Compiles the mapping into real SDTM derivation code — SAS, SQL or Python — with method-position orchestration and library-function preludes. The authored logic emits to multiple targets.

SAS · SQL · Python
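
The sketch below shows one way a single mapping spec might be emitted to two targets, SQL and Python. It is a toy illustration, not Verdatic's compiler: the `MAPPING` format, the source table and column names, and both `emit_*` helpers are hypothetical.

```python
# Hypothetical mapping spec: target variable -> ("const", value) or ("col", source column).
MAPPING = {
    "STUDYID": ("const", "STUDY01"),
    "USUBJID": ("col", "subject_id"),
    "VSTESTCD": ("const", "SYSBP"),
    "VSORRES": ("col", "sbp"),
}

def emit_sql(mapping, source_table):
    """Render the mapping as a single SELECT statement."""
    parts = []
    for target, (kind, value) in mapping.items():
        expr = f"'{value}'" if kind == "const" else value
        parts.append(f"{expr} AS {target}")
    return f"SELECT {', '.join(parts)} FROM {source_table}"

def emit_python(mapping):
    """Render the same mapping as Python source for a row-level function."""
    lines = ["def derive(row):", "    return {"]
    for target, (kind, value) in mapping.items():
        expr = repr(value) if kind == "const" else f"row[{value!r}]"
        lines.append(f"        {target!r}: {expr},")
    lines.append("    }")
    return "\n".join(lines)

print(emit_sql(MAPPING, "raw_vitals"))
print(emit_python(MAPPING))
```

Keeping the logic in one spec and rendering it per target is what lets the same authored mapping produce equivalent code in each language.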

The coverage flywheel

Real usage surfaces which fields can't auto-map; those become the next authoring targets. Coverage is the moat — and it grows automatically with every project you run.

Compounds with use
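
One simple way to picture the flywheel: count the fields that failed to map across projects and rank them as authoring targets. The Python sketch below assumes that reading; `authoring_targets` and the sample field names are hypothetical.

```python
from collections import Counter

def authoring_targets(projects, known, top=3):
    """Rank fields that are not yet mapped by how many times they appear."""
    gaps = Counter()
    for fields in projects:
        gaps.update(f for f in fields if f not in known)
    return [field for field, _ in gaps.most_common(top)]

projects = [
    ["SBP", "QTcF", "Grip strength"],
    ["QTcF", "SBP"],
    ["QTcF", "Grip strength", "Waist"],
]
print(authoring_targets(projects, known={"SBP"}, top=2))
```

Once a target is authored it moves into `known`, so the same field never shows up as a gap again.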
See it run

One call. A whole study.

Hand Verdatic a USDM schedule and let the standard do the rest. ScaffoldStudyAsync produces every SDTM structural class — generated, longitudinally coherent, cross-domain coherent, and transposed to SDTM long.

One subject roster. Findings in range. Events MedDRA/WHODrug-coded. Concomitant meds coupled to each subject's own medical history. Conformance measured, not assumed.
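
Transposing to SDTM long means turning one wide row per subject visit into one record per test. A minimal Python sketch, assuming a wide vital-signs row with four measurements (the row layout and the `to_long` helper are illustrative, not Verdatic's API):

```python
def to_long(wide_rows, tests):
    """Expand each wide row into one long record per test code."""
    long_rows = []
    for row in wide_rows:
        for testcd in tests:
            long_rows.append({
                "USUBJID": row["USUBJID"],
                "VISITNUM": row["VISITNUM"],
                "VSTESTCD": testcd,
                "VSORRES": row[testcd],
            })
    return long_rows

wide = [
    {"USUBJID": "S001", "VISITNUM": 1, "SYSBP": 122, "DIABP": 78, "PULSE": 66, "TEMP": 36.7},
    {"USUBJID": "S001", "VISITNUM": 2, "SYSBP": 118, "DIABP": 76, "PULSE": 70, "TEMP": 36.6},
]
long_records = to_long(wide, ["SYSBP", "DIABP", "PULSE", "TEMP"])
print(len(long_records))
```

With four tests per row, the record count quadruples: two wide rows become eight long records.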

verdatic · scaffold
══ BROAD PHASE 2/3 STUDY (standards-driven, sponsor removed) ══
  100 subjects (2 sites) · 5 visits (Screening D-14 … EOT D168)
  DM    100   (one per subject)
  VS    500  →  2000 SDTM long records
  LB    500  →  2500 SDTM long records
  EG    500  →  2500 SDTM long records
  MH    929    AE  486    CM  1536   (events)
   10,051 SDTM records across the study
  CM coupled to subject's own MH: 1050/1536 (68%)
══ one roster · findings in-range · events coded · CT-valid ══
──────────────────────────────────────────────
  CT validity ........................ 100%
  CORE conformance (SDTM long) ....... 100%  (46 rules)
  Optimizer ........... violations 185% → 0%
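
The totals in the run above can be re-derived from the per-domain counts. A quick Python check, using only the numbers printed in that output:

```python
# SDTM record counts per domain, copied from the run above
# (long-record counts for VS, LB and EG).
counts = {"DM": 100, "VS": 2000, "LB": 2500, "EG": 2500, "MH": 929, "AE": 486, "CM": 1536}

total = sum(counts.values())
coupled = 1050 / counts["CM"]  # share of CM records coupled to the subject's MH

print(total, f"{coupled:.0%}")  # prints: 10051 68%
```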
Who it's for

Build and test the pipeline before the data exists.

Conformant synthetic data for development, validation, training and demonstration — so teams aren't blocked waiting on real, locked, or restricted study data.

SDTM programmers

Exercise mapping & derivation code against realistic, conformant data, including deliberately dirty data that triggers the edit checks you want to test.

Clinical data managers

Stand up a representative study and its edit-check behavior without exposing a single real subject.

Biostatisticians

Coherent longitudinal trajectories and cross-domain coupling, with plausible distributions for method and pipeline testing.

Standards & tooling teams

A conformant test-bed to validate CORE rules, explore study designs, and pressure-test CDISC tooling at scale.

Point it at the standard. Get a study back.

Verdatic is in pre-launch. Request early access and we'll show you a conformant study generated from nothing but a schedule.