Datasets
The VeReMi dataset family provides simulated V2X message logs for evaluating Misbehavior Detection Systems (MDSs) in Vehicular Ad hoc Networks (VANETs). The datasets support reproducible and comparable evaluations of mechanisms that detect incorrect or malicious information in otherwise authentic V2X messages.
This page provides an overview of the three main VeReMi dataset releases:
Dataset Versions
| Dataset | Year | Short Description |
|---|---|---|
| VeReMi NextGen | 2026 | Next-generation dataset with modern traffic scenarios, predefined train/validation/test sets, broader attack coverage, and an extensible dataset generator |
| VeReMi Extension | 2020 | Extended version with sensor error models, larger datasets, and additional attacks |
| VeReMi | 2018 | Initial reference dataset for comparable evaluation of MDSs in VANETs |
Evolution of the VeReMi Datasets
VeReMi NextGen (2026)
VeReMi NextGen is the latest and most comprehensive version of the VeReMi dataset family. It provides simulated V2X message logs with ground-truth labels and predefined training, validation, and test sets for reproducible evaluation.
The dataset was generated using Eclipse MOSAIC, SUMO, OMNeT++, and the InTAS traffic scenario. Compared to previous VeReMi datasets, it introduces more realistic and diverse traffic conditions, including urban and highway environments, low- and high-density scenarios, and multiple driver profiles.
VeReMi NextGen also extends the attack coverage: it includes 15 attack types, six of them newly introduced, that affect position, speed, acceleration, heading, and time-related attributes. The dataset is accompanied by a publicly available and extensible dataset generator, which allows future extensions with additional attacks, attributes, and entities such as Vulnerable Road Users (VRUs).
Why VeReMi NextGen?
VeReMi NextGen was designed to address key limitations of the previous datasets: outdated traffic scenarios, limited attack diversity, missing predefined data splits, and limited support for machine-learning-based MDS evaluation.
The dataset introduces several major improvements:
- modern urban and highway scenarios based on the InTAS traffic scenario
- three heterogeneous driver profiles: normal, cautious, and aggressive
- predefined training, validation, and test sets for reproducible ML-based evaluation
- 15 attack types, including six newly introduced attacks
- attacks affecting a broader range of attributes, including position, speed, acceleration, heading, and time
- realistic sensor error models
- ground-truth labels directly included in each message
- an extensible dataset generator for adding future attacks, attributes, and entities such as Vulnerable Road Users (VRUs)
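Because each message carries its own ground-truth label, scoring a detector on a predefined test set reduces to comparing its verdicts against that label. The following is a minimal sketch under stated assumptions: the field names (`label`, `speed`), the convention that label `0` marks a genuine message, and the toy threshold detector are all hypothetical and are not taken from the dataset's actual schema.

```python
from typing import Callable, Iterable

Message = dict  # hypothetical: one decoded V2X message as a plain dict


def evaluate(messages: Iterable[Message],
             detector: Callable[[Message], bool],
             label_key: str = "label") -> dict:
    """Compare detector verdicts with per-message ground truth.

    Assumes (hypothetically) that label 0 marks a genuine message and
    any other value identifies an attack type.
    """
    tp = fp = fn = tn = 0
    for msg in messages:
        is_attack = msg[label_key] != 0
        flagged = detector(msg)
        if flagged and is_attack:
            tp += 1
        elif flagged and not is_attack:
            fp += 1
        elif not flagged and is_attack:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1}


def implausible_speed(msg: Message, max_speed: float = 70.0) -> bool:
    """Toy plausibility check: flag speeds above an assumed limit (m/s)."""
    return abs(msg["speed"]) > max_speed


# Made-up messages for illustration only; not taken from the dataset.
test_set = [
    {"speed": 13.9, "label": 0},
    {"speed": 95.0, "label": 3},
    {"speed": 0.0, "label": 7},
    {"speed": 80.0, "label": 0},
]
result = evaluate(test_set, implausible_speed)
```

A real evaluation would fit or tune the detector on the training and validation sets and report these metrics only on the test set, so that results from different MDSs remain comparable.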
VeReMi NextGen Highlights
| Advantage | Description |
|---|---|
| More realistic traffic scenario | Uses InTAS instead of LuST, providing more complex road layouts and more realistic traffic dynamics |
| Urban and highway coverage | Includes both urban and highway scenarios with low and high vehicle densities |
| Heterogeneous driver behavior | Introduces normal, cautious, and aggressive driver profiles |
| Broader attack diversity | Provides 15 attack types affecting a wider range of message attributes |
| New attack types | Includes six newly introduced attacks, such as position mirroring, zero speed report, reversed heading, feigned braking, and acceleration multiplication |
| ML-ready structure | Provides predefined training, validation, and test sets |
| Easier evaluation | Includes ground-truth labels directly in each message |
| Extensible design | Provides a public dataset generator for adding new attacks, attributes, or entities |
| Future VRU support | Based on InTAS, which supports Vulnerable Road User simulation and enables future dataset extensions |
| More challenging benchmark | Evaluation results show that attacks in VeReMi NextGen are harder to detect than in VeReMi Extension |
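To make a few of the newly introduced attack names concrete, the sketch below shows one plausible reading of three of them as transformations of a genuine message. These definitions are inferred from the attack names alone and may differ from the dataset generator's implementation; the `Beacon` type, its fields, and the default factor are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Beacon:
    """Hypothetical, simplified V2X message; not the dataset's schema."""
    speed: float         # m/s
    acceleration: float  # m/s^2
    heading: float       # degrees, 0 <= heading < 360


def zero_speed_report(msg: Beacon) -> Beacon:
    """Report a standstill regardless of the true speed."""
    return replace(msg, speed=0.0)


def reversed_heading(msg: Beacon) -> Beacon:
    """Report the opposite driving direction (heading rotated by 180 degrees)."""
    return replace(msg, heading=(msg.heading + 180.0) % 360.0)


def acceleration_multiplication(msg: Beacon, factor: float = 2.0) -> Beacon:
    """Scale the reported acceleration by an assumed constant factor."""
    return replace(msg, acceleration=msg.acceleration * factor)


# Made-up genuine message for illustration only.
genuine = Beacon(speed=13.9, acceleration=0.5, heading=270.0)
forged = reversed_heading(genuine)
```

Each function changes a single attribute and leaves the rest of the message untouched, in line with the framing of incorrect information inside otherwise authentic V2X messages.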
Citing VeReMi NextGen
@inproceedings{Hermann2026vereminextgen,
author = {Hermann, Artur and Remmers, Jan-Niklas and Eisermann, Dennis and Erb, Benjamin and Kargl, Frank},
title = {VeReMi {NextGen}: A {Dataset} for {Evaluating} {Misbehavior} {Detection} {Systems} in {VANETs}},
booktitle = {2026 {IEEE} {Vehicular} {Networking} {Conference} ({VNC})},
date = {2026-06},
location = {Montreal, Canada}
}
VeReMi Extension (2020)
VeReMi Extension builds on the original VeReMi dataset and addresses several of its limitations. It provides larger simulated V2X message logs for evaluating misbehavior detection mechanisms.
The dataset was generated using the F2MD framework and is based on the LuST scenario. It introduces realistic sensor error models, two traffic density levels, and additional attack scenarios that affect not only position data but also other attributes such as speed.
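This page does not describe the sensor error models themselves. As a generic illustration of the idea only, the sketch below perturbs a true value with zero-mean Gaussian noise. The Gaussian form, the `sigma` value, and the function name are assumptions, not the models used in VeReMi Extension.

```python
import random


def add_sensor_error(true_value: float, sigma: float,
                     rng: random.Random) -> float:
    """Return the true value plus zero-mean Gaussian noise (assumed model)."""
    return true_value + rng.gauss(0.0, sigma)


rng = random.Random(42)  # fixed seed so the illustration is repeatable
true_speed = 13.9        # m/s, made-up value
reported = [add_sensor_error(true_speed, 0.1, rng) for _ in range(1000)]
mean_reported = sum(reported) / len(reported)
```

With such a model, even genuine messages deviate slightly from the true vehicle state, so a detector cannot treat every small inconsistency as misbehavior.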
Citing VeReMi Extension
@inproceedings{02492739,
title = {VeReMi Extension: A Dataset for Comparable Evaluation of Misbehavior Detection in VANETs},
author={J. {Kamel} and M. {Wolf} and R. W. {van Der Heijden} and A. {Kaiser} and P. {Urien} and F. {Kargl}},
booktitle = {2020 IEEE International Conference on Communications (ICC)},
address = {Dublin, Ireland},
year = {2020},
month = {Jun}
}
VeReMi (2018)
VeReMi is the first version of the Vehicular Reference Misbehavior dataset family. It provides simulated V2X message logs with ground-truth labels for evaluating misbehavior detection mechanisms.
The dataset was generated using LuST and VEINS. It was designed as an initial city-scale baseline and includes three traffic density levels and five attack types that focus on position manipulation.
Citing VeReMi
@inproceedings{van2018veremi,
title={VeReMi: A Dataset for Comparable Evaluation of Misbehavior Detection in VANETs},
author={Van Der Heijden, Rens W and Lukaseder, Thomas and Kargl, Frank},
booktitle={International conference on security and privacy in communication systems},
pages={318--337},
year={2018},
organization={Springer}
}
Comparison of the Different VeReMi Datasets
Features
| Feature | VeReMi | VeReMi Extension | VeReMi NextGen |
|---|---|---|---|
| Up-to-date traffic scenario | ✗ | ✗ | ✓ |
| Multiple driver profiles | ✗ | ✗ | ✓ |
| Multi-attribute attacks | ✗ | ✓ | ✓ |
| Urban and highway scenarios | ✗ | ✗ | ✓ |
| Sensor error models | ✗ | ✓ | ✓ |
| Received Signal Strength Indicator | ✓ | ✗ | ✗ |
| Ground-truth labels | ✓ | ✗ | ✓ |
| Training/validation/test sets | ✗ | ✗ | ✓ |
| Extensible design | ✗ | ✗ | ✓ |
| Support for future VRU integration | ✗ | ✗ | ✓ |
Simulation Frameworks and Scenarios
| Dataset | Simulation Framework | Traffic Scenario |
|---|---|---|
| VeReMi | VEINS, OMNeT++, SUMO | LuST |
| VeReMi Extension | F2MD, VEINS, OMNeT++, SUMO | LuST |
| VeReMi NextGen | Eclipse MOSAIC, SUMO, OMNeT++ | InTAS |
Attack Coverage
| Dataset | Attack Coverage |
|---|---|
| VeReMi | Basic attack set focused primarily on position manipulation |
| VeReMi Extension | Extended attack set covering position, speed, timing, replay, DoS, and Sybil-based attacks |
| VeReMi NextGen | Comprehensive attack set covering position, speed, timing, replay, DoS, and Sybil-based attacks, including six newly introduced attack types |