Comparative Analysis of ARIMA and Random Forest Models for Forecasting COVID-19 Cases in China
DOI:
https://doi.org/10.54097/bga9mp76
Keywords:
COVID-19 forecasting, ARIMA model, random forest model, time series, machine learning.
Abstract
The COVID-19 pandemic has had a profound impact on public health systems and the global economy, making accurate forecasting of infection cases crucial for formulating effective intervention strategies. This study systematically compares the performance of the ARIMA time series model and the Random Forest machine learning model in predicting daily COVID-19 cases in China from 2020 to 2022. The data were partitioned into training and testing sets for model development and evaluation. Results indicate that the Random Forest model significantly outperforms the ARIMA model on all evaluated metrics, including residual mean, residual standard deviation, and key error indicators, and that it better captures the timing and amplitude of infection peaks and troughs. The study thus provides empirical evidence for model selection in epidemic prediction: for complex epidemic data, machine learning models may be more reliable than traditional time series methods.
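The evaluation step described in the abstract (a train/test partition followed by residual statistics and error indicators) might look roughly like the sketch below. It is only an illustration under stated assumptions. The abstract does not name its "key error indicators", so MAE and RMSE are assumed here. The split fraction, the function names, and the series values are hypothetical toy choices and are not the study's data. A last-value baseline stands in for a fitted ARIMA or Random Forest model.

```python
import math
from statistics import mean, pstdev


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series without shuffling, so the test set is strictly later."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def residual_summary(actual, predicted):
    """Summarize forecast residuals (actual minus predicted).

    MAE and RMSE are assumed examples of error indicators; the abstract
    does not say which ones the study used.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    residuals = [a - p for a, p in zip(actual, predicted)]
    return {
        "residual_mean": mean(residuals),      # bias: far from 0 means systematic over/under-forecasting
        "residual_std": pstdev(residuals),     # spread of the errors around their mean
        "mae": mean(abs(r) for r in residuals),
        "rmse": math.sqrt(mean(r * r for r in residuals)),
    }


# Hypothetical toy series of daily case counts (NOT data from the study).
daily_cases = [10, 12, 15, 20, 30, 45, 40, 35, 25, 18]
train, test = chronological_split(daily_cases, train_fraction=0.8)

# Last-value baseline: a placeholder for predictions from a fitted model.
naive_forecast = [train[-1]] * len(test)

summary = residual_summary(test, naive_forecast)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Computing the same summary for each model's test-set predictions gives a like-for-like comparison. A residual mean near zero suggests little systematic bias, while lower residual standard deviation, MAE, and RMSE suggest tighter forecasts.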
License
Copyright (c) 2025 Highlights in Business, Economics and Management

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







