PCA-counseled k-means and k-medoids with dimension reduction for improved in determining optimal aid clustering
DOI:
https://doi.org/10.21107/kursor.v13i1.460
Keywords:
Aid data, Clustering, K-Means, K-Medoids, Principal Component Analysis
Abstract
Effective aid allocation requires targeted distribution, which makes aid clustering a crucial component. Accurate clustering is essential for data-driven segmentation that uses important characteristics to determine effective aid targeting. This study explores the combination of Principal Component Analysis (PCA), k-means, and k-medoids to enhance aid clusters, with the goal of increasing the accuracy and efficiency of aid distribution. The data set consists of 1600 records with 13 attributes. Preprocessing, which consists of two processes, is applied first to standardize the data. PCA is then applied to make variance easier to measure; choosing five components preserves 80% of the variance. The number of clusters is determined using PCA together with the k-medoids and k-means approaches. The comparative analysis highlights higher silhouette coefficients for PCA-k-means, which indicate better clustering ability. This analysis shows that PCA-k-means is an effective technique for creating accurate and distinct clusters within a data set's structure. The PCA-k-means algorithm produced the best clustering results, with a silhouette score of 0.49 and a Davies-Bouldin index (DBI) score of 0.84.
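The pipeline outlined in the abstract (standardize, keep enough principal components to preserve about 80% of the variance, cluster, then score with the silhouette coefficient and DBI) can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the data are synthetic placeholders rather than the study's 1600 records, z-score scaling is assumed for the standardization step, k = 3 is an arbitrary choice, and k-medoids is omitted for brevity. All function names are hypothetical.

```python
import numpy as np


def standardize(X):
    """Z-score each column (assumed scaling; the abstract does not name one)."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (X - X.mean(axis=0)) / std


def pca_by_variance(X, target=0.80):
    """Project X onto the fewest components whose cumulative
    explained-variance ratio reaches `target`."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    n = min(int(np.searchsorted(np.cumsum(ratio), target)) + 1, len(s))
    return Xc @ Vt[:n].T, ratio[:n]


def assign(X, centers):
    """Index of the nearest center (Euclidean) for every row of X."""
    return np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers drawn from X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers)


def silhouette(X, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue
        a = d[i, same].sum() / (same.sum() - 1)  # mean intra-cluster distance
        b = min(d[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())


def davies_bouldin(X, labels):
    """Davies-Bouldin index (lower means better-separated clusters)."""
    clusters = np.unique(labels)
    cents = np.array([X[labels == c].mean(axis=0) for c in clusters])
    scatter = np.array([
        np.linalg.norm(X[labels == c] - cents[i], axis=1).mean()
        for i, c in enumerate(clusters)
    ])
    sep = np.linalg.norm(cents[:, None, :] - cents[None, :, :], axis=2)
    np.fill_diagonal(sep, np.inf)  # ignore each cluster's distance to itself
    return float(((scatter[:, None] + scatter[None, :]) / sep).max(axis=1).mean())


def demo(seed=0, k=3):
    """Run the sketch on synthetic data: 300 rows, 13 attributes (placeholder)."""
    rng = np.random.default_rng(seed)
    blob_centers = rng.normal(scale=5.0, size=(3, 13))
    X = np.vstack([c + rng.normal(size=(100, 13)) for c in blob_centers])
    Z, ratio = pca_by_variance(standardize(X), target=0.80)
    labels = kmeans(Z, k, seed=seed)
    return {
        "n_components": Z.shape[1],
        "explained": float(ratio.sum()),
        "silhouette": silhouette(Z, labels),
        "dbi": davies_bouldin(Z, labels),
    }


if __name__ == "__main__":
    print(demo())
```

Selecting components by cumulative explained variance, rather than fixing their number in advance, mirrors the abstract's statement that five components were chosen because they preserve 80% of the variance. On other data the same threshold may yield a different number of components.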
License
Copyright (c) 2025 Achmad Jauhari, Ika Oktavia Suzanti, Devie Rosa Anamisa, Fadhila Tangguh Admojo

This work is licensed under a Creative Commons Attribution 4.0 International License.