Goal
End-to-end autonomous driving has become a major focus in both academia and industry, creating a growing need for benchmarks that evaluate robustness, safety, and generalization in safety-critical corner cases. Yet existing benchmarks still have clear limitations. Semi-closed-loop benchmarks such as NAVSIM rely mainly on data replay and cannot fully capture closed-loop interaction dynamics and compounding errors. Closed-loop benchmarks such as Bench2Drive support interactive evaluation but remain limited in scenario diversity, long-tail coverage, and metric design for complex real-world interactions. To address these gaps, the 2026 CVCI Benchmark Challenge, Bench2InterActDrive, is organized during the conference, focusing on more diverse safety-critical scenarios, richer long-tail interactions, and a more targeted evaluation protocol.
This benchmark is built upon the Bench2Drive framework and adopts a closed-loop evaluation protocol. It contains 12 categories of challenging driving scenarios, with 144 scenario instances in total, designed to assess the performance of end-to-end driving models under safety-critical and extreme interaction conditions. The scenario design is informed by representative intelligent driving test cases from real-world public evaluations, encompassing challenging urban and highway edge cases such as static obstacles, vulnerable road users, and extreme interactions.
Compared with existing benchmarks, this challenge emphasizes:
1. Closed-loop evaluation, enabling assessment of full driving behavior rather than offline prediction only.
2. Scenario diversity, covering multiple types of safety-critical and complex driving situations.
3. Fair and transparent evaluation, through a unified testing procedure and a hybrid evaluation protocol tailored to the characteristics of each scenario.
Participants are encouraged to develop and submit their autonomous driving models, and may also submit a position paper or regular paper describing their methods, insights, and experimental results.
Procedure
1. Please download the benchmark challenge description file [pdf], which introduces the challenge tasks, scenario settings, evaluation protocol, submission requirements, and other technical details. The benchmark codebase and related updates are available on GitHub: https://github.com/55sleeper/CVCI_BenchMark, where participants can access the benchmark and conduct evaluation.
2. Participants should complete the registration form [xlsx] and email it to the organizing committee to register and request benchmark access.
3. Participants are required to submit their models and results files before the final submission deadline. The final results will be evaluated by the organizers using the official evaluation system. The official evaluation adopts a hybrid scoring protocol: the original Bench2Drive scenarios are scored using the official Bench2Drive metric, while the CVCI-designed scenarios are scored using a customized scenario-aware evaluation metric. The final ranking score is computed as a weighted combination of these two parts.
4. For selected scenarios, additional data resources or supplementary materials may be released later to support further benchmark development and baseline construction.
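As a rough illustration of the hybrid scoring in step 3, the final ranking score might be computed as sketched below. The function name, the equal weights, and the 0–100 scale are hypothetical assumptions for illustration only; the official weighting is defined in the challenge description file.

```python
def hybrid_score(b2d_score: float, cvci_score: float,
                 w_b2d: float = 0.5, w_cvci: float = 0.5) -> float:
    """Weighted combination of the official Bench2Drive metric and the
    CVCI scenario-aware metric.

    NOTE: the weights here are placeholders; the official values are
    specified in the challenge description file.
    """
    assert abs(w_b2d + w_cvci - 1.0) < 1e-9, "weights should sum to 1"
    return w_b2d * b2d_score + w_cvci * cvci_score

# Hypothetical example: equal weighting of a Bench2Drive driving score
# and a CVCI scenario-aware score, both assumed to be on a 0-100 scale.
final = hybrid_score(72.0, 58.0)
print(final)  # 65.0
```

The weighted design means a policy cannot rank highly by excelling on only one component, which supports the organizers' stated goal of discouraging overly conservative strategies.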
Notice
1. A special session will be organized for this benchmark challenge at CVCI 2026. Participants are encouraged to register for the conference and submit a position paper or regular paper to present their methods in detail.
2. Teams will be ranked based on the official benchmark evaluation results conducted by the organizers. The final ranking score is computed as a weighted combination of performance on the original Bench2Drive scenarios and the customized CVCI scenarios. Based on these results, teams with outstanding performance will be recognized.
3. The benchmark focuses on evaluating the safety, robustness, and generalization of end-to-end autonomous driving systems in challenging scenarios. The official protocol jointly considers general closed-loop driving performance on the original Bench2Drive scenarios and scenario-specific behavioral quality on the CVCI benchmark scenarios. This design helps prevent overly conservative strategies, such as braking-only behavior, from obtaining unfairly high scores. Participants should ensure that their submitted methods are reproducible and can be evaluated under the official protocol.
4. The organizers reserve the right to verify submitted results and request additional materials when necessary to ensure fairness and transparency.
Schedule
Benchmark and data release: April 24th, 2026
Paper and challenge results submission: July 1st, 2026
Final results submission: September 1st, 2026
Download
Benchmark problem description file [pdf]
Contact Person
Dr. Hongqing Chu
Tongji University, Shanghai, China.
CVCI_Challenge@163.com (Alternate: CVCI_Challenge@outlook.com)
This benchmark provides a unified platform for evaluating end-to-end autonomous driving systems under closed-loop interaction. Its main features include:
1. Benchmark foundation: built upon Bench2Drive
2. Evaluation mode: closed-loop evaluation
3. Scenario scale: 12 categories, 144 scenario instances
4. Scenario source inspiration: real-world intelligent driving test cases
5. Focus: safety-critical and long-tail corner cases
6. Metrics: a hybrid evaluation protocol combining the official Bench2Drive score for original benchmark scenarios and a customized scenario-aware score for CVCI scenarios.
The challenge aims to encourage the community to move beyond conventional average-case evaluation and towards more rigorous, diverse, and safety-oriented assessment of autonomous driving models.
Copyright © 2026 The 10th CAA International Conference on Vehicular Control and Intelligence (CVCI2026)