CADET: A Modular Platform for Evaluating Distributed Inference and Cooperative Autonomy in Connected Autonomous Vehicles

Motivation

Safety outcomes depend on the joint design of models, placement, and networks

The Gap

Existing AV platforms remain monolithic in deployment, executing entire pipelines on a single onboard computer. V2X simulators model networks or modularity but do not treat cooperative autonomy as a distributed systems problem, one that requires compute platform heterogeneity and orchestration.

CADET's Approach

CADET disaggregates the AV stack into composable modules deployable across vehicles, RSUs, edge nodes, and cloud servers. NetWaggle provides reproducible network emulation with device heterogeneity, enabling systematic exploration of distributed execution strategies.

What CADET Enables

CADET allows systematic exploration of deployment topologies (V2V, V2I, cloud, edge) while isolating the impact of network characteristics (latency percentiles, jitter), compute resources (GPU architecture, multi-tenant contention), and model selection (accuracy-latency tradeoffs). Multi-level instrumentation traces causality from design choices to safety outcomes, revealing failure modes that single-tier evaluations miss.

Platform Comparison

CADET integrates platform heterogeneity, orchestration, system-level metrics, and modularity

Platform	V2X	Platform Heterogeneity	Network Realism	Orchestration	System-Level Metrics	Modularity
Plexe	●	○	●	○	○	●
Eclipse MOSAIC	●	○	●	●	●	●
V2XVerse	●	○	●	○	○	●
OpenCAMS	●	○	●	●	○	●
OpenCDA	●	○	●	●	○	●
VaN3Twin	●	○	●	○	○	●
CADET [ours]	●	●	●	●	●	●

Abstract

Deep learning models are increasingly central to autonomous vehicle (AV) pipelines, yet their integration has traditionally followed a monolithic design where perception, planning, and control execute on a single onboard computer. This design overlooks the emerging paradigm of cooperative autonomy, where vehicles interact with roadside units (RSUs), edge servers, and cloud-hosted intelligence through vehicle-to-everything (V2X) connectivity. Cooperative perception and control improve safety and efficiency, but they also introduce systems-level challenges: network latency, compute heterogeneity, and multi-tenant contention all critically affect real-time decision-making. These challenges are further amplified by the increasing reliance on large foundation models, whose scale necessitates cloud deployment. We present CADET (Cooperative Autonomy through Distributed Experimentation Toolkit), a modular platform for systematic and reproducible evaluation of distributed cooperative autonomy systems under realistic deployment conditions. CADET decouples the AV stack into composable modules that can be flexibly deployed across vehicles, infrastructure, and edge/cloud tiers. The framework integrates state-of-the-art models, incorporates realistic network and workload emulation, and provides synchronized model-, system-, and task-level instrumentation. Through V2V and V2I experiments, we show that distributed deployment choices fundamentally shape safety: V2V intent packets outperform cloud-based perception, and RSU-assisted perception sustains safety until it is overloaded by concurrent requests. Although designed for AV pipelines, CADET also supports dataset-driven experimentation, enabling systems and ML researchers to benchmark distributed inference workloads independently of full vehicle simulation. CADET is open source, with code and demo available at https://anonymous.4open.science/w/cadet-site-C823/.

Platform Capabilities

Modular Pipeline Architecture

AV pipeline decomposed into Perception and Cognition stages with independently configurable sub-stages. Supports flexible deployment of state-of-the-art models (YOLO, MPC) and cooperation protocols across vehicle, RSU, edge, and cloud.

NetWaggle: Network Emulation

Mininet-based network layer injects realistic V2X delay distributions, jitter patterns, packet loss rates, and bandwidth caps matching measured 4G-LTE, 5G-NR, DSRC, or Wi-Fi traces. Low overhead (<10ms) ensures repeatability.
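As an illustration of how a trace-matched delay distribution can be replayed reproducibly, the sketch below draws one-way delays from a percentile profile by inverse-CDF sampling with a seeded generator. It is a standalone approximation and not NetWaggle's code: the profile values, the function names, and the linear interpolation between percentile points are all assumptions made for the example.

```python
import bisect
import random

# Hypothetical percentile profile: (percentile, one-way delay in ms).
# The numbers are illustrative only and do not come from a measured trace.
PROFILE = [(0, 15.0), (50, 20.0), (90, 35.0), (99, 50.0), (100, 60.0)]

def sample_delay_ms(profile, rng):
    """Draw one delay by inverse-CDF sampling, interpolating linearly
    between neighbouring percentile points."""
    u = rng.uniform(0.0, 100.0)
    pcts = [p for p, _ in profile]
    i = bisect.bisect_right(pcts, u)
    i = min(max(i, 1), len(profile) - 1)  # clamp to a valid segment
    (p0, d0), (p1, d1) = profile[i - 1], profile[i]
    return d0 + (d1 - d0) * (u - p0) / (p1 - p0)

def delay_trace(profile, n, seed):
    """Generate n delays; the same seed always yields the same trace."""
    rng = random.Random(seed)
    return [sample_delay_ms(profile, rng) for _ in range(n)]

trace_a = delay_trace(PROFILE, 1000, seed=7)
trace_b = delay_trace(PROFILE, 1000, seed=7)
```

Seeding the generator per run is one simple way to make two experiments see the same delay sequence, which is the property repeatability depends on.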

Device Heterogeneity

Client-server architecture supports deployment across embedded vehicle devices, GPU-equipped RSUs, and cloud servers. Asynchronous execution mode mimics real-world V2X timing behavior. Hardware-in-the-loop testing enabled.

Multi-Source Data Integration

CARLA Simulator with Scenario Runner for configurable V2X scenarios. Dataset Mode for deterministic replay on Waymo, Argoverse, COCO, VQA2. Live Sensor support for physical hardware integration.

Orchestration and Batch Execution

Single YAML configuration file specifies scenario parameters, model variants, deployment mappings, and network traces. SSH-based remote deployment. Systematic sweeps over vehicle density, delay profiles, or workloads.
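A systematic sweep of this kind amounts to taking the Cartesian product of the swept parameters and emitting one run configuration per combination. The sketch below shows that expansion in plain Python. The key names and values are hypothetical and do not reflect CADET's actual YAML schema.

```python
from itertools import product

# Hypothetical sweep specification; keys and values are illustrative only.
SWEEP = {
    "vehicle_density": [5, 10, 20],
    "delay_profile": ["p50", "p90", "p99"],
    "model": ["yolo11n", "yolo11x"],
}

def expand_sweep(sweep):
    """Return one flat config dict per combination of swept values."""
    keys = list(sweep)
    combos = product(*(sweep[k] for k in keys))
    return [dict(zip(keys, values)) for values in combos]

runs = expand_sweep(SWEEP)
for run in runs[:3]:
    print(run)
```

Each resulting dictionary could then be merged into a base configuration and dispatched as a separate batch job.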

Multi-Level Instrumentation

Synchronized metrics across model-level (mAP, inference latency), system-level (CPU/GPU utilization, memory, energy), and task-level (time-to-collision, braking efficiency). Timestamped logs with NTP synchronization.
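Once clocks are synchronized, records from the three levels can be interleaved into a single timeline by timestamp. The sketch below does this with the standard library's `heapq.merge`. The tuple layout `(timestamp_s, level, payload)` and every value in the sample logs are made up for illustration and are not CADET's log format or measured data.

```python
import heapq

# Hypothetical per-level logs, each already sorted by timestamp (seconds).
model_log = [(0.010, "model", "inference_ms=32"),
             (0.110, "model", "inference_ms=35")]
system_log = [(0.000, "system", "gpu_util=0.41"),
              (0.100, "system", "gpu_util=0.87")]
task_log = [(0.050, "task", "gap_m=14.2")]

def merge_logs(*logs):
    """Interleave sorted logs into one timeline ordered by timestamp."""
    return list(heapq.merge(*logs, key=lambda rec: rec[0]))

timeline = merge_logs(model_log, system_log, task_log)
for ts, level, payload in timeline:
    print(f"{ts:.3f}s [{level}] {payload}")
```

A merged timeline like this makes it possible to line up, for example, a GPU utilization spike with the inference that followed it and the task-level metric recorded in between.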

CADET Architecture

CADET implements a four-layer architecture that establishes clear abstraction boundaries while maintaining flexibility for diverse cooperative autonomy research scenarios.

CADET System Architecture — CADET implements a modular four-layer architecture allowing components to be evaluated in isolation or integrated for end-to-end cooperative autonomy experiments.

The Data Source Layer provides scenario generation via CARLA Simulator, dataset-driven evaluation, and live sensor integration. The Processing Core decomposes the AV pipeline into Perception and Cognition stages, each with independently configurable sub-stages deployable across the computational continuum. NetWaggle combines device heterogeneity with reproducible network emulation, supporting deployment from embedded vehicle-like devices to GPU-equipped RSUs and cloud servers. The Analysis Layer provides unified orchestration, synchronized multi-level instrumentation (model, system, task), and foundation model interfaces.

NetWaggle: Standalone or Integrated

CADET components can be used together as an end-to-end platform or individually for targeted research. NetWaggle, CADET's network emulation layer, is available as a standalone tool for distributed systems research beyond autonomous vehicles.

Standalone NetWaggle Use

Network researchers can evaluate protocols under realistic V2X conditions without AV simulation
Systems researchers can benchmark distributed inference across heterogeneous devices
ML researchers can study model behavior under network-induced delays and jitter

Integrated CADET Use

NetWaggle seamlessly integrates with CADET's Processing Core and Data Sources to enable end-to-end cooperative autonomy experiments with synchronized multi-level instrumentation.

NetWaggle Architecture — NetWaggle network emulation with device heterogeneity across cloud, edge, and embedded platforms.

Evaluation

Distributed deployment choices fundamentally shape cooperative autonomy safety outcomes

V2V Cooperative Braking

Two-vehicle platoon scenario where the leader executes emergency braking and the follower must maintain safe inter-vehicle distance. Safety threshold set to 10m (approximately two car lengths). We compare V2V intent communication (DSRC, 20ms baseline, 50ms tail) against cloud-hosted YOLO11x perception (5G, 30ms baseline, 100ms tail).

Result: V2V intent packets remain collision-free across all delay profiles. Cloud perception fails at p90 and p99 due to compounding delays (30-50ms inference + 100ms tail latency). Larger models impose higher energy costs (up to 1.66J per inference) without improving safety under adverse network conditions.
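A back-of-the-envelope way to see why the delays compound: the follower keeps travelling at its current speed until the braking decision arrives, so each millisecond of network or inference delay consumes part of the gap. The sketch below uses the delay figures quoted above. The 25 m/s follower speed is an assumed value, and the calculation ignores braking dynamics, so it is not the simulation's own computation.

```python
def reaction_distance_m(speed_mps, network_ms, inference_ms=0.0):
    """Distance covered before a braking decision arrives, assuming
    constant speed during the network and inference delays."""
    return speed_mps * (network_ms + inference_ms) / 1000.0

SPEED_MPS = 25.0  # assumed follower speed, not taken from the experiments

# V2V intent packet at the 50ms DSRC tail, with no perception inference.
v2v = reaction_distance_m(SPEED_MPS, network_ms=50)

# Cloud perception at the 100ms 5G tail plus 30-50ms of inference.
cloud_low = reaction_distance_m(SPEED_MPS, network_ms=100, inference_ms=30)
cloud_high = reaction_distance_m(SPEED_MPS, network_ms=100, inference_ms=50)

print(f"V2V: {v2v:.2f} m, cloud: {cloud_low:.2f}-{cloud_high:.2f} m")
```

Under these assumptions the cloud path gives up roughly three times as much of the gap as the V2V path before braking can begin.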

V2V Gap Analysis — Leader-follower gap over time at different speeds under perception-only and V2V-based braking. Safety violation occurs when gap ≤10m.

Collision Outcomes Across Network Delay Profiles

Policy	Model	Energy per inference (J)	p50	p80	p90	p99
V2V	—	—	✓	✓	✓	✓
Perception (Cloud)	YOLO11n	0.73	✓	✓	✗	✗
Perception (Cloud)	YOLO11s	0.76	✓	✓	✗	✗
Perception (Cloud)	YOLO11m	0.92	✓	✓	✗	✗
Perception (Cloud)	YOLO11l	1.27	✓	✓	✗	✗
Perception (Cloud)	YOLO11x	1.66	✓	✓	✗	✗

✓ No Collision ✗ Collision

V2I-Assisted Braking

Ego vehicle travels at 20mph when a pedestrian emerges from behind an occluding truck. Safety margin defined as the difference between time-to-collision (TTC) and time-to-event (TTE). We evaluate four configurations: onboard perception, cloud-hosted perception, RSU-local V2I, and RSU-cloud V2I under varying visibility conditions and concurrent client loads.
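The safety margin defined above can be written as a small function. In the sketch below, TTC is computed under a constant-closing-speed assumption, and TTE is passed in as a measured input because the text does not specify how it is obtained. The 25 m gap and 1.5 s TTE are hypothetical values chosen only to exercise the formula.

```python
MPH_TO_MPS = 0.44704  # exact conversion factor from miles per hour

def time_to_collision(gap_m, closing_speed_mps):
    """Seconds until the gap closes at constant closing speed.
    Returns infinity when the vehicle is not closing on the obstacle."""
    if closing_speed_mps <= 0:
        return float("inf")
    return gap_m / closing_speed_mps

def safety_margin(ttc_s, tte_s):
    """Margin = TTC - TTE; positive values indicate safe operation."""
    return ttc_s - tte_s

ego_speed = 20 * MPH_TO_MPS  # 20 mph, as in the scenario
ttc = time_to_collision(gap_m=25.0, closing_speed_mps=ego_speed)  # assumed gap
margin = safety_margin(ttc, tte_s=1.5)                            # assumed TTE
print(f"TTC={ttc:.2f}s margin={margin:.2f}s")
```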

Result: Perception-only policies achieve sufficient safety margins under clear visibility but deteriorate rapidly as conditions worsen. RSU-local and RSU-cloud V2I maintain positive margins across all visibility settings. RSU inference latency exceeds 100ms with 5-10 extra clients. Cloud p99 latencies exceed 200ms with 10 clients due to queuing under concurrent load.

V2I Latency — End-to-end latency distribution under increasing concurrent load (0, 5, 10 extra vehicles) for PC and RSU deployments.

Safety Analysis — Safety margin (TTC-TTE) across visibility conditions. Positive values indicate safe operation.

Dataset Mode with Foundation Models

CADET's dataset mode enables evaluation of vision-language models independently of full vehicle simulation. We evaluate ten VLMs across three GPU servers (NVIDIA A6000, A100, A16) using the VQA2 dataset as a proxy for cooperative autonomy scenarios where sensor data is queried in natural language.

Result: Inference latencies exceed 200ms even for smaller models, far beyond the deadlines of safety-critical coordination. 13B-scale models require >25GB GPU memory and deliver throughput below 10 tokens/s, while 3B-7B models sustain >12 tokens/s with manageable memory footprints. Although semantic queries are promising for higher-level coordination, current models are too slow and resource-intensive for low-level safety decisions.

VLM Accuracy-Latency — Accuracy-latency tradeoffs for VLMs across heterogeneous GPU servers (A6000, A100, A16). Models and devices exhibit significant variation in the accuracy-latency tradeoff space.