Synthetic EHR Datasets

This page shows the structure of the synthetic datasets in this category: the variable schema, a data preview, and how to reproduce the dataset with the Syntherx SDK.

Example Dataset

EHR

Variable schema and example preview from the blueprint definition.

Variable Schema

| Column Name | Type | Description |
| --- | --- | --- |
| patient_id | string | Unique patient identifier |
| visit_date | string | Date of visit |
| age | number | Patient age |
| sex | string | Patient sex |
| diagnosis | string | Primary diagnosis |
| medication | string | Prescribed medication |
| lab_result_type | string | Type of lab test |
| lab_result_value | number | Lab result value |
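
The schema can also be written down as a small type check. The following is a minimal sketch using only the Python standard library; `SCHEMA` and `validate_row` are hypothetical names introduced here for illustration and are not part of the Syntherx SDK. It assumes "string" maps to Python `str` and "number" to any real numeric value.

```python
from numbers import Real

# Column names and types copied from the Variable Schema table.
# Assumption: "string" -> str, "number" -> any real number (int or float).
SCHEMA = {
    "patient_id": str,
    "visit_date": str,
    "age": Real,
    "sex": str,
    "diagnosis": str,
    "medication": str,
    "lab_result_type": str,
    "lab_result_value": Real,
}


def validate_row(row):
    """Return a list of problems; an empty list means the row matches the schema."""
    problems = []
    for column, expected in SCHEMA.items():
        if column not in row:
            problems.append(f"missing column: {column}")
            continue
        value = row[column]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{column}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


# First row of the data preview, written as a Python dict.
example = {
    "patient_id": "P000001",
    "visit_date": "2024-01-10",
    "age": 65,
    "sex": "Female",
    "diagnosis": "Type 2 Diabetes",
    "medication": "Metformin",
    "lab_result_type": "HbA1c",
    "lab_result_value": 8.2,
}
print(validate_row(example))
```

A check like this only covers column presence and basic types; it says nothing about value ranges or date formats.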

Data Preview

First 10 rows (preview only)

| patient_id | visit_date | age | sex | diagnosis | medication | lab_result_type | lab_result_value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| P000001 | 2024-01-10 | 65 | Female | Type 2 Diabetes | Metformin | HbA1c | 8.2 |
| P000001 | 2024-03-15 | 65 | Female | Type 2 Diabetes | Metformin | HbA1c | 7.5 |
| P000002 | 2024-02-20 | 58 | Male | Hypertension | Lisinopril | Blood Pressure | 140 |
| P000003 | 2024-01-05 | 72 | Female | COPD | Albuterol | Oxygen Saturation | 92 |
| P000004 | 2024-02-12 | 60 | Male | Hyperlipidemia | Atorvastatin | LDL | 155 |
| P000005 | 2024-03-08 | 50 | Female | Type 2 Diabetes | Insulin | Glucose | 180 |
| P000006 | 2024-01-25 | 72 | Male | Heart Failure | Furosemide | BNP | 450 |
| P000007 | 2024-04-02 | 45 | Female | Asthma | Albuterol | Peak Flow | 350 |
| P000008 | 2024-02-28 | 67 | Male | Chronic Kidney Disease | Losartan | Creatinine | 2.1 |
| P000009 | 2024-04-10 | 63 | Female | Type 2 Diabetes | Metformin | HbA1c | 7.2 |

Includes CSV, JSON, and Parquet — ready for ML pipelines

Reproduce This Dataset

Recreate this dataset in Python (Jupyter, Kaggle, or Google Colab) using the Syntherx SDK

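# Install the Syntherx SDK first (shell command; in a notebook cell, prefix it with "!"):
#   pip install syntherx

from syntherx import generate_dataset

# Generate 5,000 rows from the "ehr_longitudinal" blueprint.
df = generate_dataset(
    blueprint="ehr_longitudinal",
    rows=5000,
)

# Write the result to a CSV file in the working directory.
df.to_csv("ehr_longitudinal.csv")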
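
Once the CSV exists, it can be read back with typed columns. The sketch below uses only the standard library, and `load_ehr_csv` is a hypothetical helper, not an SDK function. It assumes the CSV header matches the schema column names, that dates use the YYYY-MM-DD form seen in the preview, and that ages are written as whole numbers. Any extra column, such as an index column, is ignored.

```python
import csv
import io
from datetime import date


def load_ehr_csv(handle):
    """Read rows from an open CSV file and convert the typed columns."""
    rows = []
    for raw in csv.DictReader(handle):
        rows.append({
            "patient_id": raw["patient_id"],
            # Assumption: ISO dates, as in the data preview.
            "visit_date": date.fromisoformat(raw["visit_date"]),
            "age": int(raw["age"]),
            "sex": raw["sex"],
            "diagnosis": raw["diagnosis"],
            "medication": raw["medication"],
            "lab_result_type": raw["lab_result_type"],
            "lab_result_value": float(raw["lab_result_value"]),
        })
    return rows


# Stand-in for the exported file, using two rows from the preview table.
sample = io.StringIO(
    "patient_id,visit_date,age,sex,diagnosis,medication,lab_result_type,lab_result_value\n"
    "P000001,2024-01-10,65,Female,Type 2 Diabetes,Metformin,HbA1c,8.2\n"
    "P000002,2024-02-20,58,Male,Hypertension,Lisinopril,Blood Pressure,140\n"
)
rows = load_ehr_csv(sample)
print(rows[0]["visit_date"], rows[1]["lab_result_value"])
```

To use it on the real export, pass a file opened with `open("ehr_longitudinal.csv", newline="")` instead of the in-memory sample.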

Use Cases

  • Longitudinal patient trajectory modeling
  • Disease progression analysis
  • Clinical decision support simulations
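
As a rough illustration of the first use case, the sketch below groups visits by patient and lab type and computes the change between consecutive visits. The rows are taken from the data preview; `lab_changes` is a hypothetical helper written for this example, not part of the Syntherx SDK, and a real trajectory model would need far more than simple differences.

```python
from collections import defaultdict

# (patient_id, visit_date, lab_result_type, lab_result_value) from the data preview,
# deliberately out of date order to show that sorting is handled.
visits = [
    ("P000001", "2024-03-15", "HbA1c", 7.5),
    ("P000001", "2024-01-10", "HbA1c", 8.2),
    ("P000002", "2024-02-20", "Blood Pressure", 140),
    ("P000009", "2024-04-10", "HbA1c", 7.2),
]


def lab_changes(visits):
    """Return {(patient_id, lab_type): [change between consecutive visits, ...]}."""
    series = defaultdict(list)
    for patient_id, visit_date, lab_type, value in visits:
        series[(patient_id, lab_type)].append((visit_date, value))

    changes = {}
    for key, points in series.items():
        # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
        points.sort()
        deltas = [round(b[1] - a[1], 2) for a, b in zip(points, points[1:])]
        if deltas:  # patients with a single visit have no change to report
            changes[key] = deltas
    return changes


print(lab_changes(visits))
```

In the preview, only P000001 has more than one visit, so it is the only patient that produces a change value.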

Privacy-Safe Synthetic Dataset

  • Contains no real patient data
  • Generated using statistical simulation
  • Designed for machine learning research

Purchase Dataset

Research Dataset — $99

Secure checkout via Stripe.


Unlock the Syntherx Platform

Generate custom datasets tailored to your research and AI needs.
