Acta Scientiarum Naturalium Universitatis Pekinensis, 2026, Vol. 62, Issue 2: 253-265. DOI: 10.13209/j.0479-8023.2025.094



DrivingGym: Building Cross-Simulation Reinforcement Learning Agent for Autonomous Driving

NIE Zili1,2, LI Junze1,2, CHEN Jingyu2, DONG Qian2,†, XUE Yunzhi2   

  1. University of Chinese Academy of Sciences, Beijing 101408  2. Institute of Software, Chinese Academy of Sciences, Beijing 100190
  • Received: 2025-02-21  Revised: 2025-03-21  Online: 2026-03-20  Published: 2026-03-20
  • Supported by: Youth Innovation Promotion Association of the Chinese Academy of Sciences


Abstract:

Reinforcement learning (RL) for autonomous driving suffers from low sample efficiency and convergence difficulties when policies are trained directly in complex scenarios. To address these issues, we propose a cross-simulation agent construction method based on a unified data representation and implement the DrivingGym training environment on top of it. The method abstracts the state input into three layers: sensor data, vehicle states, and road network information, and unifies the control interfaces of different simulation environments through action adapters. Experiments on common simulation platforms such as CARLA and MetaDrive demonstrate that the method supports training with mainstream reinforcement learning frameworks such as RLlib and Stable-Baselines3, and enables cross-simulation application of autonomous driving policies from simple to complex environments.

Key words: autonomous driving, reinforcement learning (RL), cross-simulation environment, agent building
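
The abstract does not disclose DrivingGym's actual API, so the following is only a minimal illustrative sketch of the pattern it describes, assuming a Gymnasium-style environment trained with Stable-Baselines3 (version 2.x). All names here (CrossSimDrivingEnv, ActionAdapter, DummyBackend, the observation shapes, and the backend methods) are hypothetical and stand in for the published code; the sketch only shows how a Dict observation with sensor / vehicle-state / road-network layers and a normalized action adapter can be exposed to a standard RL framework.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class ActionAdapter:
    """Maps a normalized [steer, accel] action onto simulator-specific controls.

    A real adapter would translate these values into, e.g., CARLA's
    VehicleControl or MetaDrive's action vector; the backend interface
    used here is hypothetical.
    """

    def __init__(self, backend):
        self.backend = backend

    def apply(self, action):
        steer = float(np.clip(action[0], -1.0, 1.0))
        accel = float(np.clip(action[1], -1.0, 1.0))
        throttle, brake = max(accel, 0.0), max(-accel, 0.0)
        self.backend.apply_control(steer, throttle, brake)


class DummyBackend:
    """Placeholder simulator hook so the wrapper runs without CARLA/MetaDrive."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0

    def apply_control(self, steer, throttle, brake):
        pass  # a real backend would forward this to the simulator

    def tick(self):
        self.t += 1

    def camera(self):
        return np.zeros((84, 84, 3), dtype=np.uint8)

    def vehicle_state(self):
        return np.zeros(6, dtype=np.float32)  # e.g. pose, speed, heading

    def waypoints(self):
        return np.zeros((10, 2), dtype=np.float32)  # e.g. upcoming lane points

    def reward(self):
        return 0.0

    def done(self):
        return self.t >= 100


class CrossSimDrivingEnv(gym.Env):
    """Gymnasium-style environment exposing the three observation layers
    (sensor / vehicle state / road network) described in the abstract."""

    def __init__(self, backend):
        super().__init__()
        self.backend = backend
        self.adapter = ActionAdapter(backend)
        self.observation_space = spaces.Dict({
            "sensor": spaces.Box(0, 255, shape=(84, 84, 3), dtype=np.uint8),
            "state": spaces.Box(-np.inf, np.inf, shape=(6,), dtype=np.float32),
            "road_network": spaces.Box(-np.inf, np.inf, shape=(10, 2), dtype=np.float32),
        })
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def _observe(self):
        return {
            "sensor": self.backend.camera(),
            "state": self.backend.vehicle_state(),
            "road_network": self.backend.waypoints(),
        }

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.backend.reset()
        return self._observe(), {}

    def step(self, action):
        self.adapter.apply(action)
        self.backend.tick()
        return self._observe(), self.backend.reward(), self.backend.done(), False, {}


if __name__ == "__main__":
    # The same Dict-observation environment can be fed to Stable-Baselines3
    # (MultiInputPolicy) or registered with RLlib through its Gym interface.
    from stable_baselines3 import PPO

    env = CrossSimDrivingEnv(DummyBackend())
    model = PPO("MultiInputPolicy", env, n_steps=64, batch_size=64, verbose=0)
    model.learn(total_timesteps=128)
```

In a cross-simulation setup of the kind the abstract describes, the DummyBackend would be replaced by per-simulator backends (for example, one wrapping CARLA's client API and one wrapping MetaDrive) while the observation and action spaces stay fixed, so a policy trained in a simple environment can be reused in a more complex one.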