Application of Data Mining for Rainfall Prediction Classification in Australia with Decision Tree Algorithm and C5.0 Algorithm

Irwansyah Saputra, Dinar Ajeng Kristiyanti

Abstract


Purpose: This study aims to predict rainfall in Australia using a machine learning classification approach. Accurate rainfall prediction is essential for water resource planning and management, flood warnings, construction activities, flight operations, and other purposes.
Design/method/approach: Rainfall prediction classification for Australia is carried out in several stages: Data Collection, Data Pre-processing (including the handling of Missing Values), Classification Modeling by applying and comparing the Decision Tree and C5.0 algorithms, Result Validation using Dataset Partitioning and k-Fold Cross-Validation, and Model Evaluation using a Confusion Matrix.
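The model evaluation stage can be illustrated with a minimal sketch of a binary confusion matrix and the accuracy derived from it. The "Yes"/"No" labels, the function names, and the six sample records are assumptions made for illustration; they are not data or code from the study.

```python
from collections import Counter


def confusion_matrix(actual, predicted, positive="Yes"):
    """Count TP, FP, FN, TN for a binary rain / no-rain task."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            counts["TP"] += 1  # rain predicted, rain observed
        elif a != positive and p == positive:
            counts["FP"] += 1  # rain predicted, no rain observed
        elif a == positive and p != positive:
            counts["FN"] += 1  # no rain predicted, rain observed
        else:
            counts["TN"] += 1  # no rain predicted, no rain observed
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}


def accuracy(cm):
    """Share of all records that were classified correctly."""
    total = sum(cm.values())
    return (cm["TP"] + cm["TN"]) / total if total else 0.0


# Hypothetical labels, for illustration only.
actual = ["Yes", "No", "No", "Yes", "No", "No"]
predicted = ["Yes", "No", "Yes", "No", "No", "No"]

cm = confusion_matrix(actual, predicted)
print(cm, round(accuracy(cm), 4))
```

Accuracy is only one of the measures that can be read off the matrix; precision and recall use the same four counts.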
Results: Evaluation using 10-Fold Cross-Validation outperformed the 80:20 split method for rainfall prediction in Australia, yielding the highest accuracy of 87.35% for the Decision Tree algorithm and 86.85% for the C5.0 Rule-Based Model.
Originality/state of the art: In addition to the classification model used, the dataset validation method, whether dataset partitioning or k-Fold Cross-Validation, can also affect the accuracy of the prediction results.
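The difference between the two validation schemes can be sketched as follows. This is a minimal illustration, not the study's implementation: the synthetic labels, the majority-class stand-in for a classifier, and all function names are assumptions. It shows that an 80:20 split scores the model on a single 20% sample, while 10-fold cross-validation scores every record exactly once and averages the ten results.

```python
import random
from collections import Counter


def holdout_split(n, test_ratio=0.2, seed=0):
    """Shuffle record indices once and reserve a fixed share for testing."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_ratio))
    return idx[n_test:], idx[:n_test]


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; each record is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def majority_accuracy(labels, train, test):
    """Toy stand-in for a classifier: always predict the most common train label."""
    majority = Counter(labels[i] for i in train).most_common(1)[0][0]
    return sum(labels[i] == majority for i in test) / len(test)


# Synthetic labels (assumption): 80 dry days and 20 rainy days.
labels = ["No"] * 80 + ["Yes"] * 20

holdout_train, holdout_test = holdout_split(len(labels))
holdout_score = majority_accuracy(labels, holdout_train, holdout_test)

cv_scores = [majority_accuracy(labels, tr, te)
             for tr, te in k_fold_indices(len(labels), k=10)]
cv_mean = sum(cv_scores) / len(cv_scores)

print("80:20 split accuracy:", holdout_score)
print("10-fold mean accuracy:", cv_mean)
```

The hold-out score depends on which 20 records happen to land in the test set, whereas the cross-validated mean uses all records for testing, which is one reason the two schemes can report different accuracies for the same model.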


Keywords


Classification; Decision Tree; C5.0; RStudio; Rain Australia
