Advancing Used Car Price Prediction in South Africa: An Empirical Examination of Machine Learning Techniques

Authors

  • Zenzele Abel Msiza
  • Pius Adewale Owolawi

Keywords:

used car price, Decision Tree, Random Forest, Gradient Boosted Trees Regressor, Artificial Neural Network

Abstract

This study compiled historical data from Demo automobiles in South Africa to build a model that predicts the prices of second-hand vehicles. The model is intended to assist both sellers and buyers in the used car market. The dataset was obtained from the Demo automobiles website. Six machine learning methods were evaluated: Linear Regression, Decision Tree, Random Forest, Gradient Boosted Trees Regressor, Artificial Neural Network, and K-Nearest Neighbors. The best algorithm was chosen based on two performance metrics, R-Squared and RMSE (Root Mean Squared Error). Prior to modelling, the data were cleaned by identifying null values, filling in gaps, and eliminating outliers, and were then normalized during preprocessing. The data were subsequently divided into a training subset (80% of the data) and a testing subset (the remaining 20%). The Random Forest method performed best, with an R-Squared value of 0.988 and an RMSE value of 0.019, although the other algorithms also yielded excellent results.
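
The evaluation procedure described in the abstract (normalization, an 80/20 split, and scoring with RMSE and R-Squared) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the feature (mileage), the function names, and the random seed are assumptions made for illustration, and a one-feature least-squares line stands in for the six models compared in the study.

```python
import math
import random


def min_max_normalize(values):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_80_20(rows, seed=0):
    """Shuffle rows and return (training 80%, testing 20%)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * 0.8)
    return rows[:cut], rows[cut:]


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for a single feature (stand-in model)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def demo(seed=0):
    """Run the pipeline on synthetic (mileage, price) data."""
    rng = random.Random(seed)
    # Synthetic data: price falls with mileage, plus noise (hypothetical).
    mileage = [rng.uniform(10_000, 200_000) for _ in range(200)]
    price = [300_000 - 1.5 * m + rng.gauss(0, 5_000) for m in mileage]

    # Normalize, then split, following the order given in the abstract.
    rows = list(zip(min_max_normalize(mileage), min_max_normalize(price)))
    train, test = split_80_20(rows, seed=seed)

    slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
    y_true = [y for _, y in test]
    y_pred = [slope * x + intercept for x, _ in test]
    return r_squared(y_true, y_pred), rmse(y_true, y_pred)


if __name__ == "__main__":
    r2, err = demo()
    print(f"R-Squared: {r2:.3f}  RMSE: {err:.3f}")
```

Because the target is normalized before scoring, the RMSE in this sketch is expressed on the [0, 1] scale rather than in currency units. The scores it prints describe the synthetic data only and are not comparable to the figures reported above.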

https://doi.org/10.59200/ICARTI.2023.027

Published

2023-12-10