Forecasting Ozone and PM2.5 Pollution Potentials Using Machine Learning Algorithms: A Case Study in Chengdu

doi:10.13209/j.0479-8023.2021.070

Acta Scientiarum Naturalium Universitatis Pekinensis, 2021, Vol. 57, Issue 5: 938-950. DOI: 10.13209/j.0479-8023.2021.070

WANG Xinlu¹, HUANG Ran¹,†, ZHANG Wenxian¹, LÜ Baolei², DU Yunsong³, ZHANG Wei³, LI Bolan³, HU Yongtao⁴

1. Hangzhou AiMa Technologies, Hangzhou 311121
2. Huayun Sounding Meteorological Technology Company, Ltd., Beijing 102299
3. Sichuan Bio-Environmental Monitoring Center, Chengdu 610091
4. School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332

Received: 2020-07-31 Revised: 2020-09-15 Online: 2021-09-20 Published: 2021-09-20
Contact: HUANG Ran, E-mail: ranhuang2019(at)163.com

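Funding: Supported by the National Key Research and Development Program of China (2018YFC0214004), the Sichuan Environmental Protection Science and Technology Program (2019HB03), and the Sichuan Major Science and Technology Project (2018SZDZX0023).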

Abstract

Pollution potential forecast models were developed for summertime (April–August) ozone and wintertime (November–February) PM_2.5 in Chengdu using multiple linear regression (MLR), back-propagation (BP) neural network (NN), and random forest (RF) algorithms. The key predictors for each model were selected from a range of candidate variables that may affect the spatiotemporal distribution of pollutants. The models were trained on 2016–2018 data, evaluated with a data-withholding method, and then tested further on an independent 2019 dataset. The results show that the MLR, NN, and RF models all predict O₃ and PM_2.5 pollution potentials well at short lead times (1–3 days) in Chengdu, and that their performance remains stable for medium- and long-term forecasts (7–15 days lead time). Among the three models, MLR performs best for O₃, while RF performs best for PM_2.5.

Key words: multiple linear regression, BP neural network, random forest, medium- and long-term air pollution potential forecast
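The evaluation design described in the abstract (fit on 2016–2018, then test on a withheld 2019 dataset) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the two predictors (temperature and wind speed) and their coefficients are invented, and only the MLR branch is shown, solved with plain normal equations. It is not the authors' implementation or their predictor set.

```python
import math
import random


def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(x) for x in X]  # prepend an intercept column
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coef, X):
    """Apply intercept plus slopes to each predictor vector."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], x)) for x in X]


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def pearson_r(obs, pred):
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


# Synthetic stand-in data: (year, [temperature, wind speed], ozone).
# The predictors and the generating coefficients are hypothetical.
rng = random.Random(0)
records = []
for year in (2016, 2017, 2018, 2019):
    for _ in range(150):
        temp = rng.uniform(15, 35)
        wind = rng.uniform(0.5, 5)
        o3 = 20 + 4.0 * temp - 6.0 * wind + rng.gauss(0, 10)
        records.append((year, [temp, wind], o3))

train = [r for r in records if r[0] <= 2018]  # 2016-2018: fit the model
test = [r for r in records if r[0] == 2019]   # 2019: independent check

coef = fit_mlr([r[1] for r in train], [r[2] for r in train])
pred = predict(coef, [r[1] for r in test])
obs = [r[2] for r in test]
print(f"coefficients: {[round(c, 2) for c in coef]}")
print(f"2019 RMSE: {rmse(obs, pred):.1f}, r: {pearson_r(obs, pred):.2f}")
```

Splitting by year rather than at random keeps the 2019 test set fully independent of the training period, which is the point of the independent-year evaluation. The same split could be reused to compare other regressors, such as a neural network or a random forest, on equal terms.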

WANG Xinlu, HUANG Ran, ZHANG Wenxian, LÜ Baolei, DU Yunsong, ZHANG Wei, LI Bolan, HU Yongtao. Forecasting Ozone and PM_2.5 Pollution Potentials Using Machine Learning Algorithms: A Case Study in Chengdu[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2021, 57(5): 938-950.

URL: https://xbna.pku.edu.cn/EN/10.13209/j.0479-8023.2021.070

https://xbna.pku.edu.cn/EN/Y2021/V57/I5/938
