Used Car Price Prediction and Feature Importance Analysis Using XGBoost and SHAP
DOI:
https://doi.org/10.54097/kv2fnv60
Keywords:
Machine learning regression used to evaluate used car pricing.
Abstract
The used car market has expanded rapidly, which has increased the importance of accurate and transparent used car pricing. Traditional approaches to pricing usually rely on linear regression, which assumes a clear-cut mathematical relationship between vehicle attributes and price. This paper presents an optimized machine learning approach to predicting used car prices with an Extreme Gradient Boosting (XGBoost) model, which outperforms the baseline regression models in the experiments reported here. Using a dataset of 7253 records from India, the paper compares Linear Regression, Ridge, Lasso, Random Forest, and XGBoost models. The experimental findings show that XGBoost offers the highest predictive performance, with a score of 0.87 compared to 0.80 for Linear Regression, and also the lowest root mean squared error. To make the model more interpretable, the paper uses Shapley Additive Explanations (SHAP) to analyze how power, year, km driven, transmission, and engine influence price. A better understanding of these price drivers, together with improved online pricing transparency, can help consumers, dealerships, and online platforms make more informed decisions.
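As an illustration of the attribution idea behind SHAP, the following minimal sketch computes exact Shapley values for a small hand-written pricing function. It is not the paper's model or pipeline: the function `toy_price`, its coefficients, and the instance and baseline values are hypothetical, and a single baseline vehicle stands in for the background dataset that SHAP tooling would normally average over. Only three of the features named above are used so that every feature coalition can be enumerated directly.

```python
from itertools import combinations
from math import factorial

FEATURES = ("power", "year", "km_driven")


def toy_price(power, year, km_driven):
    """Hypothetical pricing function (assumption, not the paper's XGBoost model)."""
    age_bonus = year - 2010
    return (0.05 * power            # more power raises the price
            + 0.8 * age_bonus       # newer cars are worth more
            - 0.00002 * km_driven   # mileage lowers the price
            + 0.001 * power * age_bonus)  # interaction between power and year


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to a single `baseline`."""
    n = len(FEATURES)

    def value(coalition):
        # Features in the coalition take the instance's values,
        # all other features stay at the baseline's values.
        args = {f: (instance[f] if f in coalition else baseline[f])
                for f in FEATURES}
        return model(**args)

    phi = {}
    for feature in FEATURES:
        others = [f for f in FEATURES if f != feature]
        total = 0.0
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                # Marginal contribution of `feature` to this coalition.
                total += weight * (value(coalition | {feature}) - value(coalition))
        phi[feature] = total
    return phi


# Hypothetical vehicles, chosen only for illustration.
instance = {"power": 120.0, "year": 2018, "km_driven": 40000.0}
baseline = {"power": 90.0, "year": 2014, "km_driven": 60000.0}

phi = shapley_values(toy_price, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the predicted price of the instance and that of the baseline, which is the additivity property that lets SHAP split a single prediction across features. For tree ensembles such as XGBoost, dedicated tree-based algorithms are normally used in place of this brute-force enumeration, whose cost grows exponentially with the number of features.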
License
Copyright (c) 2025 Highlights in Business, Economics and Management

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







