VeReMi Datasets

The VeReMi dataset family provides simulated V2X message logs for evaluating Misbehavior Detection Systems (MDSs) in Vehicular Ad hoc Networks (VANETs). The datasets support reproducible and comparable evaluations of mechanisms that detect incorrect or malicious information in otherwise authentic V2X messages.

This page provides an overview of the three main VeReMi dataset releases:

Dataset Versions

| Dataset | Year | Short Description |
|---|---|---|
| VeReMi NextGen | 2026 | Next-generation dataset with modern traffic scenarios, predefined train/validation/test sets, broader attack coverage, and an extensible dataset generator |
| VeReMi Extension | 2020 | Extended version with sensor error models, larger datasets, and additional attacks |
| VeReMi | 2018 | Initial reference dataset for comparable evaluation of MDSs in VANETs |

Evolution of the VeReMi Datasets

VeReMi NextGen (2026)


VeReMi NextGen is the latest and most comprehensive version of the VeReMi dataset family. It provides simulated V2X message logs with ground-truth labels and predefined training, validation, and test sets for reproducible evaluation.

The dataset was generated using Eclipse MOSAIC, SUMO, OMNeT++, and the InTAS traffic scenario. Compared to previous VeReMi datasets, it introduces more realistic and diverse traffic conditions, including urban and highway environments, low- and high-density scenarios, and multiple driver profiles.

VeReMi NextGen also broadens attack coverage: it includes 15 attack types affecting position, speed, acceleration, heading, and time-related attributes, six of which are newly introduced. The dataset is accompanied by a publicly available, extensible dataset generator that allows future extensions with additional attacks, attributes, and entities such as Vulnerable Road Users (VRUs).

Why VeReMi NextGen?

VeReMi NextGen was designed to address key limitations of the earlier datasets: outdated traffic scenarios, limited attack diversity, missing predefined data splits, and limited support for evaluating machine-learning-based MDSs.

The dataset introduces several major improvements:

  • modern urban and highway scenarios based on the InTAS traffic scenario
  • three heterogeneous driver profiles: normal, cautious, and aggressive
  • predefined training, validation, and test sets for reproducible ML-based evaluation
  • 15 attack types, including six newly introduced attacks
  • attacks affecting a broader range of attributes, including position, speed, acceleration, heading, and time
  • realistic sensor error models
  • ground-truth labels directly included in each message
  • an extensible dataset generator for adding future attacks, attributes, and entities such as Vulnerable Road Users (VRUs)
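As a rough illustration of how per-message ground-truth labels simplify evaluation, the sketch below scores a detector against labeled messages. The record layout (`speed`, `label`), the label convention (0 means genuine), and the toy threshold detector are assumptions made for this example; they are not the dataset's actual schema or a recommended MDS.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Message:
    # Hypothetical fields; the real VeReMi NextGen schema may differ.
    speed: float  # reported speed in m/s
    label: int    # assumed convention: 0 = genuine, >0 = attack type id


def evaluate(messages: Iterable[Message],
             detector: Callable[[Message], bool]) -> dict:
    """Compare detector verdicts with the ground-truth label of each message."""
    tp = fp = fn = tn = 0
    for msg in messages:
        flagged = detector(msg)
        malicious = msg.label != 0
        if flagged and malicious:
            tp += 1
        elif flagged:
            fp += 1
        elif malicious:
            fn += 1
        else:
            tn += 1
    # Guard against division by zero when nothing is flagged or nothing is malicious.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}


def implausible_speed(msg: Message, max_speed: float = 70.0) -> bool:
    """Toy plausibility check: flag speeds outside [0, max_speed] m/s."""
    return not (0.0 <= msg.speed <= max_speed)


# Four hand-made messages standing in for a slice of a test set.
test_set = [Message(13.9, 0), Message(120.0, 3), Message(0.0, 2), Message(85.0, 0)]
result = evaluate(test_set, implausible_speed)
```

Because each message carries its own label in this sketch, no separate ground-truth file has to be joined before scoring. The same `evaluate` function could be run unchanged on the predefined validation and test sets once they are parsed into `Message` objects.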

VeReMi NextGen Highlights

| Advantage | Description |
|---|---|
| More realistic traffic scenario | Uses InTAS instead of LuST, providing more complex road layouts and more realistic traffic dynamics |
| Urban and highway coverage | Includes both urban and highway scenarios with low and high vehicle densities |
| Heterogeneous driver behavior | Introduces normal, cautious, and aggressive driver profiles |
| Broader attack diversity | Provides 15 attack types affecting a wider range of message attributes |
| New attack types | Includes six newly introduced attacks, such as position mirroring, zero speed report, reversed heading, feigned braking, and acceleration multiplication |
| ML-ready structure | Provides predefined training, validation, and test sets |
| Easier evaluation | Includes ground-truth labels directly in each message |
| Extensible design | Provides a public dataset generator for adding new attacks, attributes, or entities |
| Future VRU support | Based on InTAS, which supports Vulnerable Road User simulation and enables future dataset extensions |
| More challenging benchmark | Evaluation results show that attacks in VeReMi NextGen are harder to detect than in VeReMi Extension |
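To give a feel for what attribute-level attacks might look like, the following sketch applies three of the attacks named above to a single message. The message fields and the exact semantics of each attack are assumptions inferred only from the attack names; the dataset generator may define them differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Beacon:
    # Hypothetical message fields, for illustration only.
    speed: float         # m/s
    heading: float       # degrees, 0 <= heading < 360
    acceleration: float  # m/s^2


def zero_speed_report(b: Beacon) -> Beacon:
    """Assumed semantics: the attacker always reports a speed of zero."""
    return replace(b, speed=0.0)


def reversed_heading(b: Beacon) -> Beacon:
    """Assumed semantics: the reported heading is rotated by 180 degrees."""
    return replace(b, heading=(b.heading + 180.0) % 360.0)


def acceleration_multiplication(b: Beacon, factor: float = 2.0) -> Beacon:
    """Assumed semantics: the reported acceleration is scaled by a constant factor."""
    return replace(b, acceleration=b.acceleration * factor)


genuine = Beacon(speed=12.5, heading=270.0, acceleration=1.5)
falsified = [
    zero_speed_report(genuine),
    reversed_heading(genuine),
    acceleration_multiplication(genuine),
]
```

Each function returns a modified copy and leaves the genuine message untouched, which mirrors the idea that an attacker changes the content of an otherwise authentic message.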

Citing VeReMi NextGen

@inproceedings{Hermann2026vereminextgen,
  author    = {Hermann, Artur and Remmers, Jan-Niklas and Eisermann, Dennis and Erb, Benjamin and Kargl, Frank},
  title     = {VeReMi {NextGen}: A {Dataset} for {Evaluating} {Misbehavior} {Detection} {Systems} in {VANETs}},
  booktitle = {2026 {IEEE} {Vehicular} {Networking} {Conference} ({VNC})},
  date      = {2026-06},
  location  = {Montreal, Canada}
}

VeReMi Extension (2020)


VeReMi Extension builds on the original VeReMi dataset and addresses several of its limitations. It provides larger simulated V2X message logs for evaluating misbehavior detection mechanisms.

The dataset was generated using the F2MD framework and is based on the LuST scenario. It introduces realistic sensor error models, two traffic density levels, and additional attack scenarios that affect not only position data but also other attributes such as speed.
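The error models themselves are not specified on this page. As a minimal sketch of the general idea, a sensor error model can be approximated by adding zero-mean Gaussian noise to a true value; the noise distribution, the `sigma` value, and the function name below are all assumptions for illustration.

```python
import random


def apply_sensor_error(true_value: float, sigma: float, rng: random.Random) -> float:
    """Return the true value perturbed by zero-mean Gaussian noise (assumed model)."""
    return true_value + rng.gauss(0.0, sigma)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
true_speed = 13.9        # m/s, hypothetical ground-truth speed
reported = [apply_sensor_error(true_speed, sigma=0.3, rng=rng) for _ in range(5)]
```

With such noise in place, even genuine messages deviate slightly from the true vehicle state, so a detector has to tolerate small errors rather than flag every mismatch.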

Citing VeReMi Extension

@inproceedings{02492739,
  title = {VeReMi Extension: A Dataset for Comparable Evaluation of Misbehavior Detection in VANETs},
  author={J. {Kamel} and M. {Wolf} and R. W. {van Der Heijden} and A. {Kaiser} and P. {Urien} and F. {Kargl}},
  booktitle = {2020 IEEE International Conference on Communications (ICC)},
  address = {Dublin, Ireland},
  year = {2020},
  month = {Jun}
}

VeReMi (2018)


VeReMi is the first version of the Vehicular Reference Misbehavior dataset family. It provides simulated V2X message logs with ground-truth labels for evaluating misbehavior detection mechanisms.

The dataset was generated using LuST and VEINS. It was designed as an initial city-scale baseline and includes three traffic density levels and five attack types that focus on position manipulation.

Citing VeReMi

@inproceedings{van2018veremi,
  title={VeReMi: A Dataset for Comparable Evaluation of Misbehavior Detection in VANETs},
  author={Van Der Heijden, Rens W and Lukaseder, Thomas and Kargl, Frank},
  booktitle={International conference on security and privacy in communication systems},
  pages={318--337},
  year={2018},
  organization={Springer}
}

Comparison of the Different VeReMi Datasets

Features

Feature VeReMi VeReMi Extension VeReMi NextGen
Up-to-date traffic scenario
Multiple driver profiles
Multi-attribute attacks
Urban and highway scenarios
Sensor error models
Received Signal Strength Indicator
Ground-truth labels
Training/validation/test sets
Extensible design
Support for future VRU integration

Simulation Frameworks and Scenarios

| Dataset | Simulation Framework | Traffic Scenario |
|---|---|---|
| VeReMi | VEINS, OMNeT++, SUMO | LuST |
| VeReMi Extension | F2MD, VEINS, OMNeT++, SUMO | LuST |
| VeReMi NextGen | Eclipse MOSAIC, SUMO, OMNeT++ | InTAS |

Attack Coverage

| Dataset | Attack Coverage |
|---|---|
| VeReMi | Basic attack set focused primarily on position manipulation |
| VeReMi Extension | Extended attack set covering position, speed, timing, replay, DoS, and Sybil-based attacks |
| VeReMi NextGen | Comprehensive attack set covering position, speed, timing, replay, DoS, and Sybil-based attacks, including six newly introduced attack types |
