Resampling Technique for Imbalanced Class Handling on Educational Dataset
Abstract
Keywords
DOI: 10.30595/juita.v11i1.15498
This work is licensed under a Creative Commons Attribution 4.0 International License.
ISSN: 2579-8901