TY - JOUR
T1 - Clusters of people with type 2 diabetes in the general population
T2 - Unsupervised machine learning approach using national surveys in Latin America and the Caribbean
AU - Carrillo-Larco, Rodrigo M.
AU - Castillo-Cara, Manuel
AU - Anza-Ramirez, Cecilia
AU - Bernabé-Ortiz, Antonio
N1 - Publisher Copyright: © 2021 BMJ Publishing Group. All rights reserved.
PY - 2021/1/29
Y1 - 2021/1/29
N2 - Introduction: We aimed to identify clusters of people with type 2 diabetes mellitus (T2DM) and to assess whether the frequency of these clusters was consistent across selected countries in Latin America and the Caribbean (LAC). Research design and methods: We analyzed 13 population-based national surveys in nine countries (n=8361). We used k-means to develop a clustering model; predictors were age, sex, body mass index (BMI), waist circumference (WC), systolic/diastolic blood pressure (SBP/DBP), and T2DM family history. The training data set included all surveys, and the clusters were then predicted in each country-year data set. We used Euclidean distance, elbow and silhouette plots to select the optimal number of clusters and described each cluster according to the underlying predictors (mean and proportions). Results: The optimal number of clusters was 4. Cluster 0 grouped more men and those with the highest mean SBP/DBP. Cluster 1 had the highest mean BMI and WC, as well as the largest proportion of T2DM family history. We observed the smallest values of all predictors in cluster 2. Cluster 3 had the highest mean age. When we reflected the four clusters in each country-year data set, a different distribution was observed. For example, cluster 3 was the most frequent in the training data set, and so it was in 7 out of 13 other country-year data sets. Conclusions: Using unsupervised machine learning algorithms, it was possible to cluster people with T2DM from the general population in LAC; clusters showed unique profiles that could be used to identify the underlying characteristics of the T2DM population in LAC.
KW - adult
KW - developing countries
KW - diabetes mellitus, type 2
UR - http://www.scopus.com/inward/record.url?scp=85100603629&partnerID=8YFLogxK
U2 - 10.1136/bmjdrc-2020-001889
DO - 10.1136/bmjdrc-2020-001889
M3 - Review article
C2 - 33514531
AN - SCOPUS:85100603629
SN - 2052-4897
VL - 9
JO - BMJ Open Diabetes Research and Care
JF - BMJ Open Diabetes Research and Care
IS - 1
M1 - e001889
ER -