Rain Prediction Clustering in Australia Using the K-Means Algorithm in the WEKA and RStudio Application

Dinar Ajeng Kristiyanti, Irwansyah Saputra, Rina Rina


Purpose: This study aims to identify the ideal number of clusters for predicting rainfall in Australia, judged by the percentage of the sum of squared errors (SSE), using the K-Means algorithm in the WEKA and RStudio applications.
Design/methodology/approach: Rain prediction proceeds through five stages: data collection; data pre-processing, including missing-value handling; data mining modeling with the K-Means clustering algorithm in WEKA and RStudio; validation of the results using SSE; and data visualization using plots.
Findings/result: A solution with 2 clusters, with an SSE of 28.0%, is the ideal clustering for predicting rain in Australia. In WEKA, the rain cluster is represented by blue nodes and the non-rain cluster by red nodes. In RStudio, the rain cluster is represented by black nodes and the non-rain cluster by red nodes.
Originality/value/state of the art: The study identifies the ideal clustering for predicting rainfall in Australia by comparing the results obtained with the WEKA and RStudio applications.
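The stages summarized above (missing-value handling, K-Means clustering, SSE validation) can be sketched in a few lines. This is a minimal illustration in plain Python, not the WEKA or RStudio workflow used in the study. The toy rows, the column meanings (humidity, rainfall), the choice of mean imputation, and the definition of the SSE percentage as within-cluster SSE divided by total sum of squares are all assumptions made for the example; the study may define them differently.

```python
import random

def mean_impute(rows):
    """Replace None entries with the mean of the observed values in that column."""
    means = []
    for col in zip(*rows):
        seen = [v for v in col if v is not None]
        means.append(sum(seen) / len(seen))
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(c) / len(points) for c in zip(*points)]

def kmeans(rows, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update until labels stop changing."""
    rng = random.Random(seed)
    centers = rng.sample(rows, k)  # initial centers: k distinct rows chosen at random
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in rows]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = centroid(members)
    # Within-cluster sum of squared errors for the final assignment.
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(rows, labels))
    return labels, centers, sse

# Hypothetical toy data: [humidity, rainfall]; None marks a missing value.
raw = [[80.0, 12.0], [85.0, None], [78.0, 9.0],
       [30.0, 0.0], [35.0, 0.5], [None, 0.0]]

data = mean_impute(raw)
labels, centers, sse = kmeans(data, k=2)

# Assumed definition of the SSE percentage: within-cluster SSE over total sum of squares.
overall = centroid(data)
total_ss = sum(sq_dist(p, overall) for p in data)
within_pct = 100.0 * sse / total_ss

print("labels:", labels)
print("within-cluster SSE as % of total SS:", round(within_pct, 1))
```

Repeating the last step for several values of k and comparing the resulting percentages is one way to choose a cluster count. A lower within-cluster share means tighter clusters, although the share always falls as k grows, so it is usually read alongside interpretability.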


Clustering; K-Means; WEKA; RStudio; Rain Australia

