Synthetic EHR Datasets

This page shows the structure of the synthetic datasets in this category: the variable schema, a data preview, and how to reproduce the dataset with the Syntherx SDK.

Example Dataset

Example EHR Schema

Schema and example preview for datasets in this category, fetched from the blueprint API.

Variable Schema

Column Name        Type     Description
patient_id         string   Unique patient identifier
visit_date         string   Date of visit
age                number   Patient age
sex                string   Patient sex
diagnosis          string   Primary diagnosis
medication         string   Prescribed medication
lab_result_type    string   Type of lab test
lab_result_value   number   Lab result value
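The schema can also be written down as a small check that each record carries the listed columns with the listed types. The sketch below is illustrative only: `SCHEMA` and `validate_row` are hypothetical names, not part of the Syntherx SDK, and mapping "string" to `str` and "number" to any numeric type is an assumption.

```python
from numbers import Number

# Column names and types copied from the Variable Schema table.
# Assumption: "string" maps to str and "number" to any int or float.
SCHEMA = {
    "patient_id": str,
    "visit_date": str,
    "age": Number,
    "sex": str,
    "diagnosis": str,
    "medication": str,
    "lab_result_type": str,
    "lab_result_value": Number,
}


def validate_row(row):
    """Return a list of problems found in one record (empty if it conforms)."""
    problems = []
    for column, expected in SCHEMA.items():
        if column not in row:
            problems.append(f"missing column: {column}")
            continue
        value = row[column]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{column}: expected {expected.__name__}, got {type(value).__name__}"
            )
    for column in row:
        if column not in SCHEMA:
            problems.append(f"unexpected column: {column}")
    return problems


# First row of the data preview.
example = {
    "patient_id": "P000001",
    "visit_date": "2024-01-10",
    "age": 65,
    "sex": "Female",
    "diagnosis": "Type 2 Diabetes",
    "medication": "Metformin",
    "lab_result_type": "HbA1c",
    "lab_result_value": 8.2,
}
print(validate_row(example))  # []
```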

Data Preview

First 10 rows (preview only)

patient_id  visit_date  age  sex     diagnosis               medication    lab_result_type    lab_result_value
P000001     2024-01-10  65   Female  Type 2 Diabetes         Metformin     HbA1c              8.2
P000001     2024-03-15  65   Female  Type 2 Diabetes         Metformin     HbA1c              7.5
P000002     2024-02-20  58   Male    Hypertension            Lisinopril    Blood Pressure     140
P000003     2024-01-05  72   Female  COPD                    Albuterol     Oxygen Saturation  92
P000004     2024-02-12  60   Male    Hyperlipidemia          Atorvastatin  LDL                155
P000005     2024-03-08  50   Female  Type 2 Diabetes         Insulin       Glucose            180
P000006     2024-01-25  72   Male    Heart Failure           Furosemide    BNP                450
P000007     2024-04-02  45   Female  Asthma                  Albuterol     Peak Flow          350
P000008     2024-02-28  67   Male    Chronic Kidney Disease  Losartan      Creatinine         2.1
P000009     2024-04-10  63   Female  Type 2 Diabetes         Metformin     HbA1c              7.2
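The preview is longitudinal: patient P000001 appears twice, with HbA1c recorded at each visit. A minimal sketch of how such rows might be grouped per patient and compared over time follows. It uses only the first three preview rows, and `visits_by_patient` and `lab_change` are hypothetical helper names introduced for illustration.

```python
from collections import defaultdict

# First three preview rows, reduced to the fields used here:
# (patient_id, visit_date, lab_result_type, lab_result_value)
PREVIEW = [
    ("P000001", "2024-01-10", "HbA1c", 8.2),
    ("P000001", "2024-03-15", "HbA1c", 7.5),
    ("P000002", "2024-02-20", "Blood Pressure", 140),
]


def visits_by_patient(rows):
    """Group rows by patient_id, with each patient's visits ordered by date."""
    grouped = defaultdict(list)
    for patient_id, visit_date, lab_type, value in rows:
        grouped[patient_id].append((visit_date, lab_type, value))
    for visits in grouped.values():
        # ISO 8601 dates (YYYY-MM-DD) sort chronologically as plain strings.
        visits.sort(key=lambda visit: visit[0])
    return dict(grouped)


def lab_change(visits, lab_type):
    """Last minus first value for one lab type, or None with fewer than two readings."""
    values = [value for _, kind, value in visits if kind == lab_type]
    if len(values) < 2:
        return None
    return round(values[-1] - values[0], 2)


grouped = visits_by_patient(PREVIEW)
print(lab_change(grouped["P000001"], "HbA1c"))  # -0.7
```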

Reproduce This Dataset

Recreate this longitudinal EHR dataset in Python (Jupyter, Kaggle, or Google Colab) using the Syntherx SDK.

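# Install the Syntherx SDK (shell command; in a notebook cell, use %pip install syntherx)
pip install syntherx

# Generate 5,000 rows from the longitudinal EHR blueprint
from syntherx import generate_dataset

df = generate_dataset(
    blueprint="ehr_longitudinal",
    rows=5000,
)

# Save the generated dataset as a CSV file
df.to_csv("ehr_longitudinal.csv")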
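Once exported, the CSV can be read back without the SDK. The sketch below uses only the Python standard library and assumes the file has a header row with the column names from the schema; `read_visits` is a hypothetical helper, and the inline sample stands in for the exported file so the example runs on its own.

```python
import csv
import io

# Columns the schema lists as "number"; all others stay as strings.
NUMERIC_COLUMNS = ("age", "lab_result_value")


def read_visits(handle):
    """Read CSV text from an open file object, converting numeric columns to float."""
    records = []
    for row in csv.DictReader(handle):
        for column in NUMERIC_COLUMNS:
            row[column] = float(row[column])
        records.append(row)
    return records


# Inline sample in the assumed export layout (first preview row).
sample = io.StringIO(
    "patient_id,visit_date,age,sex,diagnosis,medication,"
    "lab_result_type,lab_result_value\n"
    "P000001,2024-01-10,65,Female,Type 2 Diabetes,Metformin,HbA1c,8.2\n"
)
records = read_visits(sample)
print(records[0]["lab_result_value"])  # 8.2
```

To read the real export, pass an open file instead of the sample, for example `open("ehr_longitudinal.csv", newline="")`. Any extra columns in the file, such as a row index, are kept as strings.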

Use Cases

  • Longitudinal patient trajectory modeling
  • Disease progression analysis
  • Clinical decision support simulations

Privacy-Safe Synthetic Dataset

  • Contains no real patient data
  • Generated using statistical simulation
  • Designed for machine learning research
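To make "statistical simulation" concrete, here is a toy sketch that draws records matching the schema from simple random distributions, so no real patient record is involved at any point. It is not Syntherx's actual generator: the `simulate_rows` function, the diagnosis table, and every distribution parameter are assumptions chosen for illustration.

```python
import random

# Hypothetical parameters: diagnosis -> (medication, lab type, mean, std dev).
# These values are illustrative and carry no clinical meaning.
DIAGNOSES = {
    "Type 2 Diabetes": ("Metformin", "HbA1c", 7.5, 0.6),
    "Hypertension": ("Lisinopril", "Blood Pressure", 140, 10),
}


def simulate_rows(n, seed=0):
    """Draw n synthetic records; the same seed always yields the same rows."""
    rng = random.Random(seed)
    rows = []
    for i in range(1, n + 1):
        diagnosis = rng.choice(list(DIAGNOSES))
        medication, lab_type, mean, sd = DIAGNOSES[diagnosis]
        rows.append({
            "patient_id": f"P{i:06d}",
            "visit_date": f"2024-{rng.randint(1, 12):02d}-{rng.randint(1, 28):02d}",
            "age": rng.randint(40, 80),
            "sex": rng.choice(["Female", "Male"]),
            "diagnosis": diagnosis,
            "medication": medication,
            "lab_result_type": lab_type,
            "lab_result_value": round(rng.gauss(mean, sd), 1),
        })
    return rows


rows = simulate_rows(3, seed=42)
print(rows[0]["patient_id"])  # P000001
```

Seeding the generator is what makes a simulated dataset reproducible: two calls with the same `seed` return identical rows.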

Unlock the Syntherx Platform

Generate custom datasets tailored to your research and AI needs.
