Acta Scientiarum Naturalium Universitatis Pekinensis, 2026, Vol. 62, Issue 2: 253-265. DOI: 10.13209/j.0479-8023.2025.094



DrivingGym: Building Cross-Simulation Reinforcement Learning Agent for Autonomous Driving

NIE Zili1,2, LI Junze1,2, CHEN Jingyu2, DONG Qian2,†, XUE Yunzhi2   

  1. University of Chinese Academy of Sciences, Beijing 101408  2. Institute of Software, Chinese Academy of Sciences, Beijing 100190
  • Received: 2025-02-21  Revised: 2025-03-21  Online: 2026-03-20  Published: 2026-03-20
  • Supported by: Youth Innovation Promotion Association of the Chinese Academy of Sciences


Abstract:

Reinforcement learning (RL) for autonomous driving suffers from low sample efficiency and convergence difficulties when policies are trained directly in complex scenarios. To address these issues, we propose a cross-simulation agent construction method based on a unified data representation and implement the DrivingGym training environment on top of it. The method abstracts the state input into three layers: sensor data, vehicle states, and road network information, and unifies the control interfaces of different simulation environments through action adapters. Experiments on common simulation platforms such as CARLA and MetaDrive demonstrate that the method supports training with mainstream reinforcement learning frameworks such as RLlib and Stable-Baselines3, and enables cross-simulation application of autonomous driving policies from simple to complex environments.

Key words: autonomous driving, reinforcement learning (RL), cross-simulation environment, agent building
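
The abstract does not disclose DrivingGym's actual API, so the following is only a minimal illustrative sketch of the pattern it describes, assuming a Gymnasium-style environment trained with Stable-Baselines3 (version 2.x). All names here (CrossSimDrivingEnv, ActionAdapter, DummyBackend, the observation shapes, and the backend methods) are hypothetical and stand in for the published code; the sketch only shows how a Dict observation with sensor / vehicle-state / road-network layers and a normalized action adapter can be exposed to a standard RL framework.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class ActionAdapter:
    """Maps a normalized [steer, accel] action onto simulator-specific controls.

    A real adapter would translate these values into, e.g., CARLA's
    VehicleControl or MetaDrive's action vector; the backend interface
    used here is hypothetical.
    """

    def __init__(self, backend):
        self.backend = backend

    def apply(self, action):
        steer = float(np.clip(action[0], -1.0, 1.0))
        accel = float(np.clip(action[1], -1.0, 1.0))
        throttle, brake = max(accel, 0.0), max(-accel, 0.0)
        self.backend.apply_control(steer, throttle, brake)


class DummyBackend:
    """Placeholder simulator hook so the wrapper runs without CARLA/MetaDrive."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0

    def apply_control(self, steer, throttle, brake):
        pass  # a real backend would forward this to the simulator

    def tick(self):
        self.t += 1

    def camera(self):
        return np.zeros((84, 84, 3), dtype=np.uint8)

    def vehicle_state(self):
        return np.zeros(6, dtype=np.float32)  # e.g. pose, speed, heading

    def waypoints(self):
        return np.zeros((10, 2), dtype=np.float32)  # e.g. upcoming lane points

    def reward(self):
        return 0.0

    def done(self):
        return self.t >= 100


class CrossSimDrivingEnv(gym.Env):
    """Gymnasium-style environment exposing the three observation layers
    (sensor / vehicle state / road network) described in the abstract."""

    def __init__(self, backend):
        super().__init__()
        self.backend = backend
        self.adapter = ActionAdapter(backend)
        self.observation_space = spaces.Dict({
            "sensor": spaces.Box(0, 255, shape=(84, 84, 3), dtype=np.uint8),
            "state": spaces.Box(-np.inf, np.inf, shape=(6,), dtype=np.float32),
            "road_network": spaces.Box(-np.inf, np.inf, shape=(10, 2), dtype=np.float32),
        })
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def _observe(self):
        return {
            "sensor": self.backend.camera(),
            "state": self.backend.vehicle_state(),
            "road_network": self.backend.waypoints(),
        }

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.backend.reset()
        return self._observe(), {}

    def step(self, action):
        self.adapter.apply(action)
        self.backend.tick()
        return self._observe(), self.backend.reward(), self.backend.done(), False, {}


if __name__ == "__main__":
    # The same Dict-observation environment can be fed to Stable-Baselines3
    # (MultiInputPolicy) or registered with RLlib through its Gym interface.
    from stable_baselines3 import PPO

    env = CrossSimDrivingEnv(DummyBackend())
    model = PPO("MultiInputPolicy", env, n_steps=64, batch_size=64, verbose=0)
    model.learn(total_timesteps=128)
```

In a cross-simulation setup of the kind the abstract describes, the DummyBackend would be replaced by per-simulator backends (for example, one wrapping CARLA's client API and one wrapping MetaDrive) while the observation and action spaces stay fixed, so a policy trained in a simple environment can be reused in a more complex one.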