A Comparative Analysis of Machine Learning Models for Obesity Prediction
Keywords:
Obesity Prediction, Machine Learning Models, Ensemble Techniques, LightGBM, Healthcare Analytics
Abstract
Obesity is a global health challenge with significant implications for public health systems and individual well-being. Predictive modeling with machine learning (ML) offers a powerful way to identify individuals at risk of obesity and to inform early intervention strategies. This study evaluates the performance of ten ML models (Logistic Regression, Support Vector Machines, Decision Trees, K-Nearest Neighbors, Naive Bayes, Random Forest, Gradient Boosting, AdaBoost, XGBoost, and LightGBM) in predicting obesity on a publicly available dataset. A preprocessing pipeline comprising missing value handling, categorical encoding, normalization, and outlier detection was applied to ensure data quality and compatibility with the ML algorithms. Accuracy, precision, recall, and F1-score were evaluated using 10-fold stratified cross-validation. Among the models, LightGBM achieved the highest test accuracy (99.19%) and F1-score (99.20%), outperforming Gradient Boosting and Random Forest, which also showed competitive results. The study highlights the superior predictive capabilities of ensemble methods while underscoring the trade-offs between model complexity and interpretability. Logistic Regression provided a strong baseline, demonstrating the importance of preprocessing, but was outperformed by the more advanced ensemble techniques. This research contributes to the growing field of ML-driven healthcare solutions by documenting the strengths and limitations of the predictive models compared. The findings support the integration of advanced ML techniques in public health systems and motivate future research on hybrid and explainable models for obesity prediction and management.
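The evaluation protocol named in the abstract (10-fold stratified cross-validation scored by accuracy, precision, recall, and F1-score) can be illustrated with a minimal, standard-library-only sketch. It is not the study's code: the function names, the toy class labels, the majority-class baseline, and the choice of macro-averaging are all assumptions made for illustration, since the abstract does not state how the metrics were averaged.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, k=10):
    """Assign sample indices to k folds while keeping class ratios similar.

    Indices of each class are dealt round-robin across the folds, so every
    fold receives roughly len(class) / k samples of that class.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for members in by_class.values():
        for i, idx in enumerate(members):
            folds[(offset + i) % k].append(idx)
        offset += len(members)  # keep dealing where the previous class stopped
    return folds


def macro_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    classes = set(y_true) | set(y_pred)
    totals = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        totals["precision"] += precision
        totals["recall"] += recall
        totals["f1"] += 2 * precision * recall / denom if denom else 0.0
    scores = {name: value / len(classes) for name, value in totals.items()}
    scores["accuracy"] = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return scores


def cross_validate(labels, fit_predict, k=10):
    """Average the per-fold scores of fit_predict(train_idx, test_idx)."""
    results = []
    for test_idx in stratified_folds(labels, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        predictions = fit_predict(train_idx, test_idx)
        truth = [labels[i] for i in test_idx]
        results.append(macro_scores(truth, predictions))
    return {name: sum(r[name] for r in results) / k for name in results[0]}


# Toy labels, invented for illustration only (not the study's dataset).
labels = ["normal"] * 60 + ["overweight"] * 30 + ["obese"] * 10


def majority_baseline(train_idx, test_idx):
    """Hypothetical stand-in model: always predict the most common training class."""
    top = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return [top] * len(test_idx)


scores = cross_validate(labels, majority_baseline, k=10)
```

Any of the ten models could take the place of `majority_baseline`, provided it is fitted only on `train_idx` within each fold. Stratification matters because each held-out fold then mirrors the overall class balance, which keeps per-fold metrics comparable across models.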
Published
2025-03-31
Section
Articles
How to Cite
Airlangga, G. (2025). A Comparative Analysis of Machine Learning Models for Obesity Prediction. Jurnal Informatika Ekonomi Bisnis, 7(1), 1-5. https://doi.org/10.37034/infeb.v7i1.1089
This work is licensed under a Creative Commons Attribution 4.0 International License.


















