Synthetic Diabetes Datasets

This page shows the structure of the synthetic datasets in this category: the variable schema, a data preview, and how to reproduce a dataset with the Syntherx SDK.

Example Dataset

Diabetes HbA1c Trial

Variable schema and example preview from the blueprint definition.

Variable Schema

Column Name	Type	Description
patient_id	string	Unique synthetic patient identifier
age	number	Synthetic patient age in years
sex	string	Synthetic patient sex
treatment_group	string	Study arm (treatment or control)
baseline_measure	number	Baseline HbA1c measure
outcome_measure	number	Outcome HbA1c measure
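
The schema above can be turned into a small record check. The following is a minimal sketch using only the Python standard library; `SCHEMA` and `validate_row` are hypothetical names introduced for illustration and are not part of the Syntherx SDK. It assumes "string" maps to `str` and "number" to `int` or `float`.

```python
# Expected columns and Python types, taken from the schema table above.
# Assumption: "string" -> str, "number" -> int or float.
SCHEMA = {
    "patient_id": str,
    "age": (int, float),
    "sex": str,
    "treatment_group": str,
    "baseline_measure": (int, float),
    "outcome_measure": (int, float),
}


def validate_row(row):
    """Return a list of schema problems for one record (empty if none)."""
    problems = []
    for column, expected in SCHEMA.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(row[column], bool) or not isinstance(row[column], expected):
            problems.append(f"wrong type for {column}: {type(row[column]).__name__}")
    # Report any columns that the schema does not define.
    extra = sorted(set(row) - set(SCHEMA))
    problems.extend(f"unexpected column: {name}" for name in extra)
    return problems


# First row of the data preview, used as a quick check.
example = {
    "patient_id": "P000001",
    "age": 71,
    "sex": "Female",
    "treatment_group": "treatment",
    "baseline_measure": 8.3,
    "outcome_measure": 6.8,
}
print(validate_row(example))  # prints []
```

A check like this is mainly useful after loading the CSV or JSON export, where numeric columns can silently arrive as strings.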

Data Preview

First 9 rows (preview only)

patient_id	age	sex	treatment_group	baseline_measure	outcome_measure
P000001	71	Female	treatment	8.3	6.8
P000002	43	Male	control	9.0	7.1
P000003	59	Female	treatment	8.8	7.5
P000004	67	Male	control	9.1	7.5
P000005	44	Male	control	8.9	7.0
P000006	65	Male	treatment	7.9	6.9
P000007	51	Female	treatment	9.6	8.3
P000008	53	Male	control	8.7	8.0
P000009	51	Female	treatment	8.6	7.4

Includes CSV, JSON, and Parquet — ready for ML pipelines

Reproduce This Dataset

Recreate this dataset in Python (Jupyter, Kaggle, or Google Colab) using the Syntherx SDK:

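# Install the Syntherx SDK (shell command; in a notebook cell, use: %pip install syntherx)
pip install syntherx

# Python: generate the dataset from its blueprint
from syntherx import generate_dataset

df = generate_dataset(
    blueprint="diabetes_hba1c_trial",
    rows=5000,
)

# Save to CSV. Assumes df is a pandas-style DataFrame, as to_csv suggests;
# index=False then omits the row index column from the file.
df.to_csv("diabetes_hba1c_trial.csv", index=False)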

Use Cases

Diabetes and glycemic outcomes modeling
Glucose and treatment response analysis
Risk stratification for diabetes populations
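
As an illustration of the treatment response use case, the sketch below computes the mean change in HbA1c (outcome minus baseline) per treatment group. It uses only the Python standard library and the nine preview rows shown on this page; `PREVIEW_ROWS` and `mean_change_by_group` are hypothetical names introduced here, not part of the Syntherx SDK.

```python
from collections import defaultdict
from statistics import mean

# (patient_id, treatment_group, baseline_measure, outcome_measure)
# copied from the Data Preview table on this page.
PREVIEW_ROWS = [
    ("P000001", "treatment", 8.3, 6.8),
    ("P000002", "control", 9.0, 7.1),
    ("P000003", "treatment", 8.8, 7.5),
    ("P000004", "control", 9.1, 7.5),
    ("P000005", "control", 8.9, 7.0),
    ("P000006", "treatment", 7.9, 6.9),
    ("P000007", "treatment", 9.6, 8.3),
    ("P000008", "control", 8.7, 8.0),
    ("P000009", "treatment", 8.6, 7.4),
]


def mean_change_by_group(rows):
    """Return {group: mean(outcome - baseline)} for the given rows."""
    changes = defaultdict(list)
    for _patient_id, group, baseline, outcome in rows:
        changes[group].append(outcome - baseline)
    return {group: mean(values) for group, values in changes.items()}


for group, change in sorted(mean_change_by_group(PREVIEW_ROWS).items()):
    print(f"{group}: mean HbA1c change = {change:.2f}")
```

On the nine preview rows this gives a mean change of about -1.26 for the treatment group and about -1.53 for the control group. Nine rows are far too few to interpret; the same function can be applied to the full generated dataset.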

Privacy-Safe Synthetic Dataset

Contains no real patient data
Generated using statistical simulation
Designed for machine learning research

Purchase Dataset

Research Dataset — $99

Secure checkout via Stripe.

Unlock the Syntherx Platform

Generate custom datasets tailored to your research and AI needs.
