Diabetes HbA1c Trial
Synthetic patient-level rows with fields: patient_id, age, sex, treatment_group, baseline_measure, outcome_measure.
This resource is a fully synthetic cohort patterned after a diabetes HbA1c trial. It contains no real patients or protected health information, only statistically plausible records for method development and reproducible benchmarks.
Each row has six variables: patient_id, age, sex, treatment_group, baseline_measure, and outcome_measure. The full schema and a representative preview appear below. You can download the dataset or generate a fresh cohort with the Syntherx SDK.
Teams use datasets like this for AI and statistical modeling, digital twin and pathway simulation, curriculum and sandbox environments, and cross-institutional collaborations where sharing real data is impractical.
Research Dataset — $99
Includes CSV, JSON, and Parquet — ready for ML pipelines
Variable Schema
| Column Name | Type | Description |
|---|---|---|
| patient_id | string | Unique synthetic patient identifier |
| age | number | Synthetic patient age |
| sex | string | Synthetic patient sex |
| treatment_group | string | Treatment or control group |
| baseline_measure | number | Baseline HbA1c measure |
| outcome_measure | number | Outcome HbA1c measure |
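A downloaded file can be checked against the schema above with a small validator. The sketch below is not part of the Syntherx SDK: `SCHEMA` and `validate_row` are hypothetical names, and it assumes rows arrive as dictionaries of strings, as Python's `csv.DictReader` would yield them.

```python
# Hypothetical helper, not part of the Syntherx SDK.
# Maps each column in the schema table to the Python type it is parsed into.
SCHEMA = {
    "patient_id": str,
    "age": float,
    "sex": str,
    "treatment_group": str,
    "baseline_measure": float,
    "outcome_measure": float,
}


def validate_row(row):
    """Return a typed copy of `row`, or raise ValueError on a schema mismatch."""
    missing = set(SCHEMA) - set(row)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    typed = {}
    for column, caster in SCHEMA.items():
        try:
            typed[column] = caster(row[column])
        except (TypeError, ValueError) as exc:
            raise ValueError(
                f"{column}={row[column]!r} is not a valid {caster.__name__}"
            ) from exc
    return typed


# First preview row, with every value still a string as read from CSV.
example = {
    "patient_id": "P000001", "age": "71", "sex": "Female",
    "treatment_group": "treatment",
    "baseline_measure": "8.3", "outcome_measure": "6.8",
}
typed_example = validate_row(example)
print(typed_example["age"], typed_example["baseline_measure"])
```

Parsing the "number" columns as `float` is a deliberate simplification; a stricter check could require `age` to be a whole number.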
Data Preview
First 9 rows (preview only)
| patient_id | age | sex | treatment_group | baseline_measure | outcome_measure |
|---|---|---|---|---|---|
| P000001 | 71 | Female | treatment | 8.3 | 6.8 |
| P000002 | 43 | Male | control | 9 | 7.1 |
| P000003 | 59 | Female | treatment | 8.8 | 7.5 |
| P000004 | 67 | Male | control | 9.1 | 7.5 |
| P000005 | 44 | Male | control | 8.9 | 7 |
| P000006 | 65 | Male | treatment | 7.9 | 6.9 |
| P000007 | 51 | Female | treatment | 9.6 | 8.3 |
| P000008 | 53 | Male | control | 8.7 | 8 |
| P000009 | 51 | Female | treatment | 8.6 | 7.4 |
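As a quick illustration of how these columns fit together, the sketch below computes the mean change in HbA1c (outcome minus baseline) per treatment group from the nine preview rows. It uses only the Python standard library; `PREVIEW_CSV` and `mean_change_by_group` are names introduced here for illustration, and the rows are inlined so the example runs on its own.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# The nine preview rows from the table above, inlined as CSV text.
PREVIEW_CSV = """patient_id,age,sex,treatment_group,baseline_measure,outcome_measure
P000001,71,Female,treatment,8.3,6.8
P000002,43,Male,control,9,7.1
P000003,59,Female,treatment,8.8,7.5
P000004,67,Male,control,9.1,7.5
P000005,44,Male,control,8.9,7
P000006,65,Male,treatment,7.9,6.9
P000007,51,Female,treatment,9.6,8.3
P000008,53,Male,control,8.7,8
P000009,51,Female,treatment,8.6,7.4
"""


def mean_change_by_group(csv_text):
    """Return {treatment_group: mean(outcome_measure - baseline_measure)}."""
    changes = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        change = float(row["outcome_measure"]) - float(row["baseline_measure"])
        changes[row["treatment_group"]].append(change)
    return {group: mean(values) for group, values in changes.items()}


group_means = mean_change_by_group(PREVIEW_CSV)
for group, value in sorted(group_means.items()):
    print(f"{group}: mean HbA1c change = {value:.2f}")
```

Nine rows are far too few to support any conclusion about the groups. To run the same summary on the full file, pass its contents in place of `PREVIEW_CSV`.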
Reproduce This Dataset
Recreate this dataset in Python (Jupyter, Kaggle, or Google Colab) using the Syntherx SDK.
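```shell
# Install Syntherx SDK
pip install syntherx
```

```python
from syntherx import generate_dataset

# Generate 5,000 synthetic rows from the diabetes_hba1c_trial blueprint
df = generate_dataset(
    blueprint="diabetes_hba1c_trial",
    rows=5000,
)

# Save the generated cohort as CSV
df.to_csv("diabetes_hba1c_trial.csv")
```
Use Cases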
- Build and validate AI/ML pipelines for healthcare scenarios without using real patient data.
- Train and evaluate models on structured fields such as age, sex, treatment_group, and baseline_measure.
- Run simulations, power analyses, and exploratory analytics in a privacy-safe sandbox.
- Prototype dashboards, ETL flows, and feature stores before touching production systems.
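For the power-analysis use case, a rough per-arm sample size can be sketched with the standard normal-approximation formula for comparing two means, n = 2 (z_{1-α/2} + z_{1-β})² σ² / δ². The function name `sample_size_per_group` and the planning values below are assumptions for illustration; they are not estimated from this dataset.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n per arm for a two-sided, two-sample z-test on means.

    delta: smallest difference in mean outcome worth detecting
    sigma: assumed common standard deviation of the outcome
    """
    if delta == 0:
        raise ValueError("delta must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical planning values: detect a 0.5-point HbA1c difference
# when the outcome standard deviation is assumed to be 1.0.
n_per_group = sample_size_per_group(delta=0.5, sigma=1.0)
print(n_per_group)  # 63
```

The z-approximation slightly underestimates the size a t-test would need. A simulation-based check on generated cohorts is a reasonable complement.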
Dataset Characteristics
- Fully synthetic — no PHI; suitable for sharing, teaching, and external collaboration.
- Schema includes 6 variables: patient_id, age, sex, treatment_group, baseline_measure, outcome_measure.
- Delivered in researcher-friendly formats (CSV, JSON, Parquet) for downstream tooling.
- Generated with the Syntherx simulation engine for reproducible cohort-scale draws.
Privacy-Safe Synthetic Dataset
- Contains no real patient data
- Generated using statistical simulation
- Designed for machine learning research
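The Syntherx engine itself is not described on this page, so the following is only a generic illustration of seeded statistical simulation that produces rows with the same six columns. The function `simulate_cohort` and every distribution and parameter in it are assumptions chosen for illustration, not the engine's actual method.

```python
import random


def simulate_cohort(n_rows, seed=0):
    """Draw n_rows synthetic records; the same seed gives the same cohort."""
    rng = random.Random(seed)  # private generator, so draws are reproducible
    rows = []
    for i in range(1, n_rows + 1):
        group = rng.choice(["treatment", "control"])
        # Assumed baseline HbA1c distribution (illustrative values only).
        baseline = round(rng.gauss(8.8, 0.6), 1)
        # Assumed mean reduction per arm (illustrative values only).
        mean_drop = 1.3 if group == "treatment" else 1.0
        outcome = round(baseline - rng.gauss(mean_drop, 0.4), 1)
        rows.append({
            "patient_id": f"P{i:06d}",
            "age": rng.randint(40, 75),
            "sex": rng.choice(["Female", "Male"]),
            "treatment_group": group,
            "baseline_measure": baseline,
            "outcome_measure": outcome,
        })
    return rows


cohort = simulate_cohort(5)
for record in cohort:
    print(record)
```

Because no real records enter the process, the output cannot contain patient data. Fixing the seed is what makes a draw repeatable.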
Related Datasets
Explore adjacent synthetic cohorts in the same domain or browse nearby clinical themes.
- Cardiology Outcomes: Synthetic patient-level cardiovascular risk factors and biomarkers for ML and outcomes research.
- Claims: Synthetic patient-level rows with fields: patient_id, age, sex, diagnosis_code, procedure_code, claim_amount, ….
- Clinical Trial Outcomes: Synthetic patient-level rows with fields: patient_id, age, sex, trial_arm, baseline_value, endpoint_value, ….
- EHR: Synthetic patient-level rows with fields: patient_id, visit_date, age, sex, diagnosis, medication, ….
Unlock the Syntherx Platform
Generate custom datasets tailored to your research and AI needs.