Acta Scientiarum Naturalium Universitatis Pekinensis, 2018, Vol. 54, Issue (6): 1227-1234. DOI: 10.13209/j.0479-8023.2018.070


Efficient Traffic Flow Data Processing Method and Its Application Based on Spark Framework

LI Xin1,2   

  1. Collaborative Innovation Center of Three-Aspect Coordination of Central Plain Economic Region, Henan University of Economics and Law, Zhengzhou 450046
  2. College of Resource and Environment, Henan University of Economics and Law, Zhengzhou 450046
  • Received: 2017-10-11 Revised: 2017-12-01 Online: 2018-11-20 Published: 2018-11-20
  • Contact: LI Xin, E-mail: lixin992319(at)163.com

Spark框架下交通流数据高效处理方法及其应用

李欣1,2   

  1. 河南财经政法大学中原经济区“三化”协调发展河南省协同创新中心, 郑州 450046
  2. 河南财经政法大学资源与环境学院, 郑州 450046
  • 通讯作者: 李欣, E-mail: lixin992319(at)163.com
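  • Funding:
    Supported by the National Natural Science Foundation of China (41501178, 41771445) and the Doctoral Research Fund of Henan University of Economics and Law (800257)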

Abstract:

A Spark-based framework for processing and forecasting traffic flow data is designed and implemented; it supports efficient cleaning, statistical aggregation, storage, and querying of traffic flow data. A STARIMA model with a multi-order spatial weight matrix is then used to predict traffic flow, which serves to verify both the efficiency of the data processing and its support for the prediction application. Comparative experiments show that (1) the data processing framework runs efficiently, is suitable for complex data cleaning and mining algorithms, and provides data support for the prediction model; and (2) the prediction model optimizes the spatial weight matrix across multiple orders, balancing efficiency and accuracy. The prediction results can serve as a reference for traffic guidance.

Key words: Spark, data cleaning, semantic query, spatial weight matrix, traffic flow prediction
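
The abstract does not spell out how the multi-order spatial weight matrix is built. As a minimal sketch, assuming the common STARIMA convention that the k-th order neighbors of a site are the sites exactly k road links away and that those neighbors are weighted equally, the matrices could be assembled as follows. The road network `ROADS`, the function names, and the equal weighting are illustrative assumptions and do not reproduce the paper's optimized matrix.

```python
from collections import deque


def neighbors_by_order(adjacency, source, max_order):
    """Return {order: set of nodes at shortest-path distance `order` from source}."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == max_order:
            continue  # do not expand beyond the highest requested order
        for nxt in adjacency[node]:
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    by_order = {k: set() for k in range(1, max_order + 1)}
    for node, d in distance.items():
        if d >= 1:
            by_order[d].add(node)
    return by_order


def spatial_weight_matrices(adjacency, max_order):
    """Build row-normalized weight matrices W[1..max_order] as nested lists."""
    nodes = sorted(adjacency)
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    matrices = {k: [[0.0] * n for _ in range(n)] for k in range(1, max_order + 1)}
    for node in nodes:
        for order, group in neighbors_by_order(adjacency, node, max_order).items():
            for other in group:
                # Equal weights (assumption): each k-th order neighbor gets 1 / group size.
                matrices[order][index[node]][index[other]] = 1.0 / len(group)
    return nodes, matrices


# Hypothetical chain of four detector stations: A - B - C - D.
ROADS = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
NODES, W = spatial_weight_matrices(ROADS, max_order=2)
```

Here `W[1]` holds the first-order (directly adjacent) weights and `W[2]` the second-order weights; each non-empty row sums to 1, so multiplying a matrix by a vector of flows gives the average flow over that order's neighbors.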

摘要:

设计并实现基于Spark的交通流数据处理与预测分析应用框架, 可以完成交通流数据的高效清洗、统计、存储和查询。利用基于多阶空间权重矩阵的STARIMA模型进行交通流预测分析, 可以验证数据处理效率及对预测应用的支撑作用。对比实验结果表明: 1) 交通流数据处理框架运行效率高, 适用于复杂的数据清洗和挖掘算法, 为预测模型建立数据支撑; 2) 交通流预测模型对空间权重矩阵进行了多阶优化, 兼顾高效性和准确性, 预测分析结果可以为交通诱导提供参考。

关键词: Spark, 数据清洗, 语义查询, 空间权重矩阵, 交通流预测

CLC Number: