Synthetic Healthcare Datasets

This page shows how the synthetic datasets in this category are structured: the variable schema, a preview of the data, and how to reproduce a dataset with the Syntherx SDK.

Indexed datasets in this category

Example Dataset

Example Synthetic Diabetes Cohort Dataset Schema

A synthetic diabetes cohort dataset designed for machine learning, research, and healthcare data experimentation.

Variable Schema

Columns shown in the data preview:

  • patient_id
  • age
  • sex
  • treatment_group
  • baseline_measure
  • outcome_measure

Data Preview

First 9 rows (preview only)

patient_id  age  sex     treatment_group  baseline_measure  outcome_measure
P000001     71   Female  treatment        8.3               6.8
P000002     43   Male    control          9                 7.1
P000003     59   Female  treatment        8.8               7.5
P000004     67   Male    control          9.1               7.5
P000005     44   Male    control          8.9               7
P000006     65   Male    treatment        7.9               6.9
P000007     51   Female  treatment        9.6               8.3
P000008     53   Male    control          8.7               8
P000009     51   Female  treatment        8.6               7.4
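As a quick way to explore the preview, the sketch below parses the nine rows with Python's standard library and computes the mean change (outcome minus baseline) per treatment group. It assumes each preview row splits into a patient ID of "P" plus six digits, then age, sex, group, and the two measures; the CSV text and the helper name mean_change_by_group are illustrative and not part of the Syntherx SDK.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Preview rows transcribed as CSV (assumed column split, see note above).
PREVIEW_CSV = """\
patient_id,age,sex,treatment_group,baseline_measure,outcome_measure
P000001,71,Female,treatment,8.3,6.8
P000002,43,Male,control,9,7.1
P000003,59,Female,treatment,8.8,7.5
P000004,67,Male,control,9.1,7.5
P000005,44,Male,control,8.9,7
P000006,65,Male,treatment,7.9,6.9
P000007,51,Female,treatment,9.6,8.3
P000008,53,Male,control,8.7,8
P000009,51,Female,treatment,8.6,7.4
"""


def mean_change_by_group(csv_text):
    """Return {treatment_group: mean(outcome - baseline)} rounded to 3 decimals."""
    changes = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        delta = float(row["outcome_measure"]) - float(row["baseline_measure"])
        changes[row["treatment_group"]].append(delta)
    return {group: round(mean(deltas), 3) for group, deltas in changes.items()}


summary = mean_change_by_group(PREVIEW_CSV)
for group, delta in sorted(summary.items()):
    print(f"{group}: mean change {delta:+.3f}")
```

Nine rows are far too few to draw conclusions from; the same function can be pointed at the full generated CSV text instead.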

Reproduce This Dataset

Recreate this dataset in Python (Jupyter, Kaggle, or Google Colab) using the Syntherx SDK.

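# Install the Syntherx SDK first. This is a shell command;
# in a notebook cell, prefix it with "!":
#   pip install syntherx

from syntherx import generate_dataset

# Generate 5,000 synthetic rows from the diabetes HbA1c trial blueprint
df = generate_dataset(
    blueprint="diabetes_hba1c_trial",
    rows=5000,
)

# Save to CSV. index=False assumes df is a pandas DataFrame and
# keeps the row index out of the file.
df.to_csv("diabetes_hba1c_trial.csv", index=False)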
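After exporting, it can help to sanity-check the file before using it. The following is a minimal sketch using only the standard library; it assumes the CSV carries the six columns shown in the data preview and that treatment_group takes the values "treatment" and "control". The function name validate_csv and the specific checks are illustrative, not part of the Syntherx SDK. Extra columns, such as an unnamed index column, are tolerated.

```python
import csv
import io

# Column names taken from the data preview on this page.
EXPECTED_COLUMNS = {
    "patient_id", "age", "sex",
    "treatment_group", "baseline_measure", "outcome_measure",
}
# Assumed set of group labels, based on the preview rows.
ALLOWED_GROUPS = {"treatment", "control"}


def validate_csv(handle):
    """Return a list of human-readable problems found in an open CSV file."""
    reader = csv.DictReader(handle)
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]

    problems = []
    seen_ids = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        pid = row["patient_id"]
        if pid in seen_ids:
            problems.append(f"line {line_no}: duplicate patient_id {pid}")
        seen_ids.add(pid)
        if row["treatment_group"] not in ALLOWED_GROUPS:
            problems.append(f"line {line_no}: unexpected treatment_group")
        for col in ("baseline_measure", "outcome_measure"):
            try:
                float(row[col])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: {col} is not numeric")
    return problems


# Small in-memory sample, including an unnamed leading index column.
sample = io.StringIO(
    ",patient_id,age,sex,treatment_group,baseline_measure,outcome_measure\n"
    "0,P000001,71,Female,treatment,8.3,6.8\n"
    "1,P000002,43,Male,control,9,7.1\n"
)
print(validate_csv(sample))
```

To check the exported file, pass an open handle instead of the in-memory sample, for example open("diabetes_hba1c_trial.csv", newline="").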

Use Cases

  • Machine learning training
  • Healthcare research
  • Synthetic data experimentation

Privacy-Safe Synthetic Dataset

  • Contains no real patient data
  • Generated using statistical simulation
  • Designed for machine learning research
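To illustrate what "statistical simulation" can mean in practice, here is a minimal sketch that draws rows with the same columns as the preview from simple random distributions. This is not the Syntherx generation method, which this page does not describe; the function name simulate_cohort and every distribution parameter (age range, baseline mean, effect sizes) are made-up assumptions chosen only to produce plausible-looking values.

```python
import random


def simulate_cohort(n_rows, seed=0):
    """Draw n_rows synthetic records from simple, assumed distributions."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    rows = []
    for i in range(1, n_rows + 1):
        group = rng.choice(["treatment", "control"])
        # Assumed baseline distribution: normal with mean 8.7, sd 0.5.
        baseline = round(rng.gauss(8.7, 0.5), 1)
        # Assumed average drop per group (hypothetical effect sizes).
        mean_drop = 1.3 if group == "treatment" else 1.0
        outcome = round(baseline - rng.gauss(mean_drop, 0.4), 1)
        rows.append({
            "patient_id": f"P{i:06d}",
            "age": rng.randint(40, 75),  # assumed age range
            "sex": rng.choice(["Female", "Male"]),
            "treatment_group": group,
            "baseline_measure": baseline,
            "outcome_measure": outcome,
        })
    return rows


for record in simulate_cohort(3, seed=42):
    print(record)
```

Because every value comes from a random number generator rather than from patient records, no row corresponds to a real person. Fixing the seed makes a run repeatable.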

Unlock the Syntherx Platform

Generate custom datasets tailored to your research and AI needs.
