Data-Driven Classification of Poverty Status in Indonesia using Machine Learning Techniques

Authors

  • Syaila Fathia Azzahra, Universitas Pendidikan Indonesia
  • Yudi Ahmad Hambali
  • Ismail Marzuki Randos

DOI:

https://doi.org/10.61179/infact.v10i01.753

Keywords:

Poverty Classification, K-Nearest Neighbor, Socio-Economic Indicators, Machine Learning Indonesia

Abstract

This study applies the K-Nearest Neighbor (KNN) algorithm to classify poverty status in Indonesia from publicly available socio-economic indicators. Traditional poverty classification methods are often inefficient and lack nuance. The study follows the Knowledge Discovery in Databases (KDD) process, including data preprocessing, normalization, and dimensionality reduction via principal component analysis (PCA), to build the classification model. The dataset covers 514 districts/cities and includes indicators such as education, health, and expenditure levels. The optimal KNN model, selected through cross-validation, achieved a test accuracy of 95.15%, with strong precision, recall, and ROC AUC scores. A Random Forest fitted to the PCA-transformed data was used for feature importance analysis, which highlights the predictive influence of certain combinations of principal components. The results demonstrate the potential of machine learning to support more accurate, data-driven policy targeting in poverty alleviation. Future work may integrate time-series or satellite data to increase relevance and precision.
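The workflow summarized in the abstract (normalization, PCA, a cross-validated KNN, then a Random Forest on the PCA-transformed features) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic, and the number of features, the number of retained components, the candidate values of k, the 80/20 split, and the use of scikit-learn's StandardScaler for the normalization step are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the district-level indicators (NOT the study's data).
# 514 rows mirror the number of districts/cities; 8 features is an assumption.
X, y = make_classification(
    n_samples=514, n_features=8, n_informative=5, random_state=0
)

# Hold out a test set; the 80/20 split is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling and PCA live inside the pipeline so that each cross-validation
# fold fits them on its own training portion only (avoids leakage).
pipeline = Pipeline([
    ("scale", StandardScaler()),       # assumed form of "normalization"
    ("pca", PCA(n_components=4)),      # assumed number of components
    ("knn", KNeighborsClassifier()),
])

# Choose k by 5-fold cross-validation over a hypothetical candidate grid.
search = GridSearchCV(
    pipeline,
    param_grid={"knn__n_neighbors": [3, 5, 7, 9, 11]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_k = search.best_params_["knn__n_neighbors"]
test_accuracy = accuracy_score(y_test, search.predict(X_test))
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])

# Random Forest fitted on the PCA-transformed training data, used only to
# rank the principal components by importance.
scale_and_pca = search.best_estimator_[:-1]   # fitted pipeline without KNN
Z_train = scale_and_pca.transform(X_train)
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(Z_train, y_train)
component_importance = forest.feature_importances_

print(f"best k: {best_k}")
print(f"test accuracy: {test_accuracy:.4f}, ROC AUC: {test_auc:.4f}")
print("importance per principal component:", component_importance.round(3))
```

Note that the importances here refer to principal components, each a linear combination of the original indicators, so they do not map one-to-one onto individual variables such as education or expenditure. The scores printed for the synthetic data have no bearing on the 95.15% accuracy reported in the abstract.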

 


Published

2026-05-07

How to Cite

[1] Syaila Fathia Azzahra, Yudi Ahmad Hambali, and Ismail Marzuki Randos, “Data-Driven Classification of Poverty Status in Indonesia using Machine Learning Techniques,” IIJC, vol. 10, no. 01, pp. 1–7, May 2026.