Performance Analysis of XGBoost Algorithm to Determine the Most Optimal Parameters and Features in Predicting Stock Price Movement

Affan Ardana

Abstract


Purpose: This research aims to identify the best parameters and features for predicting stock price movement with the XGBoost algorithm. Parameter combinations are compared by their root mean square error (RMSE), and features are ranked by their importance values.
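For reference, the standard definition of RMSE over n test observations can be written as follows. The symbols y_i (actual value) and \hat{y}_i (predicted value) are notation introduced here for illustration; the abstract does not state the notation used in the paper.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
```

A lower RMSE means the predictions sit closer to the actual values on average, which is why the lowest value is used to pick the parameter combination.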

Design/methodology/approach: The research data is the stock data of Amazon.com (AMZN). The dataset contains the Date, Low, Open, Volume, High, Close, and Adjusted Close features. Missing values are handled so that the dataset has no missing data. The input features are selected using the Pearson Correlation feature selection method. To keep the gap between the highest and lowest stock prices from being too wide, the data is scaled using the Min Max Scaling method. To avoid bias in the prediction results, cross-validation is used, dividing the dataset into training data and testing data, where the testing data falls within the 30 days after the training data. The parameters tested are n_estimator = 500, early stopping round = 3, learning rate = 0.01, 0.05, 0.1, and max_depth (tree depth) = 3, 4, 5.
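The preprocessing steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the expanding-window fold layout, the `initial_train` argument, and the toy numbers are all hypothetical, and each row is assumed to represent one trading day.

```python
from math import sqrt


def min_max_scale(values):
    """Rescale a list of numbers to the [0, 1] range (Min Max Scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0 or std_y == 0:
        return 0.0
    return cov / (std_x * std_y)


def walk_forward_splits(n_rows, initial_train, test_days=30):
    """Yield (train_rows, test_rows) index ranges.

    Assumption: an expanding training window, with each test window
    covering the test_days rows immediately after the training rows.
    """
    train_end = initial_train
    while train_end + test_days <= n_rows:
        yield range(0, train_end), range(train_end, train_end + test_days)
        train_end += test_days


# Toy values for illustration only; these are not AMZN prices.
toy_low = [10.0, 11.0, 12.0, 13.0, 14.0]
toy_close = [10.5, 11.5, 12.5, 13.5, 14.5]
scaled_low = min_max_scale(toy_low)
low_close_corr = pearson(toy_low, toy_close)
folds = list(walk_forward_splits(n_rows=100, initial_train=40))
```

The abstract does not say how many folds were used or whether the scaler was fitted on the training rows only, so both choices are left open here.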

Findings/result: The results show that a learning rate of 0.05 and a tree depth of 5 produced the lowest RMSE of all the models tested, at 0.009437. The Low feature obtained the highest importance value across all the models built.

Originality/value/state of the art: This study used testing data within a range of 30 days after the training data and tested a combination of parameters: n_estimator = 500, early stopping round = 3, learning rate = 0.01, 0.05, 0.1, and max_depth (tree depth) = 3, 4, 5.
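The parameter combinations listed above form a small grid of nine models. The sketch below enumerates that grid using only the Python standard library. The dictionary keys `n_estimators` and `early_stopping_rounds` follow the spelling used by the XGBoost Python package rather than the abstract's wording, and the helper names are hypothetical.

```python
from itertools import product

# Values taken from the abstract; key names are an assumption (see lead-in).
FIXED_PARAMS = {"n_estimators": 500, "early_stopping_rounds": 3}
LEARNING_RATES = [0.01, 0.05, 0.1]
MAX_DEPTHS = [3, 4, 5]


def parameter_grid():
    """Return one parameter dict per (learning_rate, max_depth) pair."""
    return [
        dict(FIXED_PARAMS, learning_rate=lr, max_depth=depth)
        for lr, depth in product(LEARNING_RATES, MAX_DEPTHS)
    ]


def best_by_rmse(scored):
    """Pick the (params, rmse) pair with the lowest RMSE.

    `scored` is a list of (params, rmse) pairs produced by whatever
    training and evaluation loop is used; none is included here.
    """
    return min(scored, key=lambda pair: pair[1])


grid = parameter_grid()
```

Each dictionary in `grid` would be passed to a model-fitting routine, and the resulting test RMSE values handed to `best_by_rmse` to select the winning combination.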

 


Keywords


machine learning; regression; prediction; stock; xgboost





DOI: https://doi.org/10.31315/telematika.v20i1.9329

DOI (PDF): https://doi.org/10.31315/telematika.v20i1.9329.g5411



Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Copyright of:
TELEMATIKA: Jurnal Informatika dan Teknologi Informasi
ISSN 1829-667X (print); ISSN 2460-9021 (online)


Published by
Jurusan Teknik Informatika, UPN Veteran Yogyakarta
Jl. Babarsari 2 Yogyakarta 55281 (Kampus Unit II)
Phone: +62 274 485786
email: jurnaltelematika@upnyk.ac.id

 
