
Research note · Research

Crazyflow Drone Simulator

An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX

Authors: Martin Schuck, Marcel P. Rath, Yufei Hua, Abhishek Goudar, Si Qi Zhou, and Angela P. Schoellig

Read the Crazyflow paper on arXiv

At first glance, Crazyflow is dominated by its performance numbers. The headline results are huge, so it can come across as a benchmark for raw simulator throughput. That misses the more useful idea: the simulator is trying to make high-volume simulation, analytical gradients, and real-drone transfer feel like one workflow.

The paper, "Crazyflow: An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX", describes a JAX-based simulator for aerial robotics. It supports detailed quadrotor dynamics, batched GPU execution, differentiable physics, Crazyflie-family models, custom drone identification, and large swarm experiments.

That mix is valuable because drone autonomy work often bottlenecks on iteration. Teams need to generate rollouts, tune controllers, stress-test risky behavior, train policies, and compare simulation against flight data. A simulator that can do those jobs quickly, while staying close to real hardware, changes the pace of development.

Paper details

Quick read

Crazyflow compiles drone physics, controllers, and learning loops into optimized JAX computation graphs. The simulator is built around batching, so it can run many virtual worlds, many drones, or many drone swarms in parallel.

The paper reports roughly 700 million simulation steps per second at one million parallel worlds on a consumer GPU. It also reports scaling to 4.2 million drones and more than 900 million steps per second. Those are post-compilation throughput numbers, so they should be read as optimized execution performance rather than end-to-end script startup time.

The simulator is also differentiable. The authors use that to train policies through backpropagation through time, run sampling-based control at high rate, and demonstrate real-world transfer on Crazyflie-family drones. The most memorable experiment is a thrown-drone recovery: a physical drone is thrown, a stabilization policy is trained from scratch in 0.38 seconds, the policy is transferred in about 48 ms, and the drone recovers before it falls.

Real flight testing still decides whether the model is good. Crazyflow's argument is that simulation can become much closer to the live engineering loop.

Why simulator speed is starting to matter

For a long time, simulation in drone work was mostly a risk-reduction tool. Write a controller, try it in simulation, fix the obvious failures, then move to hardware.

That workflow is still useful. Modern autonomy adds heavier demands. Reinforcement learning needs rollout volume. Sampling-based control needs many candidate futures before the next control tick. Swarm planning needs many agents in many scenarios. Automatic tuning needs repeated evaluation. Sim-to-real work needs logs, model updates, and validation runs.

Crazyflow is built for that world. It treats parallelism as a first-class design constraint rather than an afterthought. A single experiment might need one drone across a million reset states, a few thousand drones in one swarm, or many copies of the same swarm under different conditions.

When that kind of simulation becomes cheap, the questions get better:

  • How many candidate trajectories can we test before the next control update?
  • Can we tune a policy for the actual state we are seeing right now?
  • Can a swarm plan be checked against thousands of perturbed worlds?
  • Can a developer iterate on control logic in minutes instead of overnight?
  • Can a simulator become a synthetic data engine for drone autonomy?

Other projects are working on pieces of this problem. Crazyflow's value is the attempt to keep speed, differentiability, and sim-to-real accuracy in the same system.

What Crazyflow is

Crazyflow is a GPU-accelerated differentiable drone simulator written around JAX and XLA compilation. The implementation idea is direct: express the simulator as array operations, compile the full step function, and batch the state.

In the paper, simulation state is organized around two dimensions:

  • M worlds: parallel environments or rollout copies
  • N drones: agents inside each world

The same code path can handle one drone in many parallel environments, one large swarm, or many swarms at once. The authors call out this tensor layout because it explains much of the scaling behavior.
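As a rough illustration of the M-by-N layout, the sketch below applies `jax.vmap` twice: once over the drones inside a world and once over the worlds. The point-mass dynamics, the step size `DT`, and all function names are assumptions made for this example. They are not Crazyflow's API.

```python
import jax
import jax.numpy as jnp

DT = 0.002  # hypothetical step size in seconds
GRAVITY = jnp.array([0.0, 0.0, -9.81])

def step_one_drone(pos, vel, accel_cmd):
    """Semi-implicit Euler step for one point-mass drone (toy model)."""
    vel = vel + DT * (accel_cmd + GRAVITY)
    pos = pos + DT * vel
    return pos, vel

# Inner vmap maps over the N drones of one world: inputs have shape (N, 3).
step_world = jax.vmap(step_one_drone)
# Outer vmap maps over the M worlds: inputs have shape (M, N, 3).
step_batch = jax.jit(jax.vmap(step_world))

M, N = 4, 3  # worlds, drones per world
pos = jnp.zeros((M, N, 3))
vel = jnp.zeros((M, N, 3))
hover = jnp.broadcast_to(-GRAVITY, (M, N, 3))  # command that cancels gravity

pos, vel = step_batch(pos, vel, hover)
print(pos.shape, vel.shape)
```

Setting `M = 1` gives one swarm, setting `N = 1` gives many single-drone worlds, and the step function itself does not change.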

The simulator offers multiple levels of abstraction. The detailed model includes rigid body dynamics, motor dynamics, thrust dynamics, drag, and onboard controllers. The lighter model is identified from flight data and hides more of the inner control loop. A researcher can work at the motor level for one experiment and at the attitude or position interface for another.

The paper also provides CasADi versions of the physics models for optimization-based control. That keeps Crazyflow relevant for teams working with nonlinear model predictive control, trajectory optimization, symbolic dynamics, and neural policies.

Why JAX matters here

A lot of simulator performance comes down to execution model. Crazyflow's use of JAX is a core design choice.

In eager frameworks, operations execute step by step. That is flexible, but it can create overhead when a simulation loop contains many small operations. JAX traces the computation first. XLA then compiles the traced function into optimized kernels. In a robotics simulator, physics updates, controller logic, reward computation, and parts of the training loop can be fused into a tighter execution path.
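A minimal sketch of that trace-then-compile pattern, assuming a toy 1-D altitude model: the controller, the physics update, and the reward all live inside one `jax.lax.scan` loop wrapped in `jax.jit`, so the whole rollout becomes a single compiled program. The gains, step size, and function names are hypothetical and are not taken from Crazyflow.

```python
import jax
import jax.numpy as jnp

DT = 0.01    # hypothetical step size in seconds
Z_REF = 1.0  # target height in metres
G = 9.81

def controller(z, vz):
    """Toy PD altitude controller with gravity feedforward."""
    return 20.0 * (Z_REF - z) - 6.0 * vz + G

def step(carry, _):
    z, vz = carry
    acc = controller(z, vz) - G   # controller output minus gravity
    vz = vz + DT * acc
    z = z + DT * vz
    reward = -(z - Z_REF) ** 2    # reward computed inside the same trace
    return (z, vz), reward

@jax.jit
def rollout(z0):
    """500 steps; lax.scan keeps the loop inside the compiled program."""
    init = (z0, jnp.zeros_like(z0))
    (z, _), rewards = jax.lax.scan(step, init, None, length=500)
    return z, rewards.sum()

z0 = jnp.asarray(0.0, dtype=jnp.float32)
z_final, total_reward = rollout(z0)  # first call traces and compiles
z_final, total_reward = rollout(z0)  # later calls reuse the compiled code
print(float(z_final), float(total_reward))
```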

For drone learning, the payoff is straightforward.

The first payoff is lower overhead. The simulator can run millions of worlds without repeatedly bouncing between Python and the accelerator.

The second payoff is analytical gradients. Instead of estimating every signal through sampling, an optimizer can differentiate through simulated dynamics and controller logic. The paper reports nine million gradients per second for one million environments, each differentiating through ten simulation steps.
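To make "differentiating through ten simulation steps" concrete, here is a small sketch that combines `jax.grad` with `jax.vmap` to get one gradient per environment. The 1-D point mass, the loss, and the batch size of 1,024 are assumptions chosen for the example. The paper's benchmark uses the full drone model.

```python
import jax
import jax.numpy as jnp

DT = 0.01
N_STEPS = 10

def final_height_error(thrust_accel, z0):
    """Roll a 1-D point mass forward ten steps; return squared height error."""
    def step(carry, _):
        z, vz = carry
        vz = vz + DT * (thrust_accel - 9.81)
        z = z + DT * vz
        return (z, vz), None
    init = (z0, jnp.zeros_like(z0))
    (z, _), _ = jax.lax.scan(step, init, None, length=N_STEPS)
    return (z - 1.0) ** 2

# grad differentiates w.r.t. thrust_accel; vmap gives one gradient per environment.
grad_fn = jax.jit(jax.vmap(jax.grad(final_height_error), in_axes=(0, 0)))

thrust = jnp.full((1024,), 9.81)       # hover command in every environment
z0 = jnp.linspace(0.0, 2.0, 1024)      # different start heights
grads = grad_fn(thrust, z0)
print(grads.shape)
```

For this toy model the result can be checked by hand: after ten semi-implicit Euler steps the height is `z0 + 0.0055 * (thrust_accel - 9.81)`, so at hover thrust the gradient is `0.011 * (z0 - 1.0)`.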

That gradient path is one of the paper's strongest technical points. The simulator can run forward quickly, and optimization code can also see through the dynamics.

The core architecture

Crazyflow splits its modeling into two practical tracks.

The first is a first-principles model. This is the high-fidelity path. It models the quadrotor as a six-degree-of-freedom rigid body and includes rotor dynamics, thrust curves, motor effects, drag, and a reimplementation of Crazyflie-style onboard control in JAX. The paper uses this path for the tightest Crazyflie sim-to-real results.

The second is an abstracted model. It hides some of the inner control loop behind fitted dynamics at a higher interface. The paper describes a lightweight system identification process that fits the model from a few minutes of flight data. This is the route for custom drones and experiments where full motor-level modeling would be too much setup.
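The article does not spell out the identification procedure, so the following is only a sketch of the general idea: fit the parameters of a simple input-to-acceleration model by least squares. The model structure, the parameter names, and the data are all assumptions. The "flight log" here is synthetic and generated inside the script.

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
k_cmd, k_vel, k_noise = jax.random.split(key, 3)

# Values used only to synthesize data; a real log would replace this section.
TRUE_GAIN, TRUE_DRAG = 18.0, -0.35
cmd = jax.random.uniform(k_cmd, (2000,), minval=0.3, maxval=0.8)  # normalized thrust command
vz = jax.random.normal(k_vel, (2000,))                            # vertical velocity, m/s
noise = 0.05 * jax.random.normal(k_noise, (2000,))
acc_z = TRUE_GAIN * cmd + TRUE_DRAG * vz - 9.81 + noise           # "measured" acceleration

# Assumed model: acc_z + g = gain * cmd + drag * vz, which is linear in the unknowns.
features = jnp.stack([cmd, vz], axis=1)
targets = acc_z + 9.81
params, *_ = jnp.linalg.lstsq(features, targets)
gain_hat, drag_hat = params[0], params[1]
print(float(gain_hat), float(drag_hat))
```

A fitted model like this would still need validation against held-out flights before it is trusted for control design.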

That split feels practical. Full fidelity is useful when the physical details matter. An abstracted model is easier to fit and often enough for planning or mid-level control. Crazyflow gives researchers a way to choose instead of forcing every experiment into the same model.

The simulator also connects to MJX for geometry-related work such as collision checks, ray casting, depth sensing, and visualization. It is mainly a free-flight dynamics simulator, with enough environment support for obstacle and sensing experiments.

Benchmark numbers to notice

The paper reports many benchmarks. These are the results that matter most for people evaluating Crazyflow as a drone simulator.

| Area | Result reported in the paper | Why it is useful |
| --- | --- | --- |
| Parallel simulation | Around 700 million steps per second at one million worlds | Large-scale RL and parameter sweeps become much cheaper |
| Maximum scaling | Up to 4.2 million drones and over 900 million steps per second | The simulator was built around batching from the start |
| Swarm simulation | Thousands of drones across thousands of worlds | Swarm planning, choreography, and multi-agent RL can be tested at larger scale |
| Differentiation | Nine million gradients per second through ten-step simulations | Gradient-based policy optimization becomes more practical |
| Depth rendering | 350k frames per second at 64x64 resolution across 1,024 environments | Obstacle and sensing experiments can stay batched |
| MPPI control | 500k sampled trajectories at 50 Hz | Sampling-based control can use the full nonlinear model |
| Throw recovery | Policy trained in 0.38 s, then transferred in about 48 ms | A real-time constraint becomes part of the experiment |

These numbers need the usual JIT caveat. The paper measures execution after warmup. That is the right way to evaluate compiled throughput, but teams should still account for compilation time and memory use in their own scripts.

Even with that caveat, the gains are large. In several comparisons, the paper reports order-of-magnitude or larger improvements over tools such as gym_pybullet_drones, Aerial Gym, and DiffAero.
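For readers unfamiliar with the MPPI benchmark, the sketch below shows the core of one sampling-based update: perturb a nominal control sequence, roll out every sample in parallel with `jax.vmap`, and average the perturbations with softmax weights on the negative cost. The 2-D double integrator, the cost, and all constants are illustrative assumptions. The paper runs this kind of controller on the full nonlinear drone model.

```python
import jax
import jax.numpy as jnp

DT, HORIZON, N_SAMPLES, TEMPERATURE = 0.02, 25, 4096, 0.05  # illustrative values
TARGET = jnp.array([1.0, 0.5])

def rollout_cost(state, controls):
    """Cost of one control sequence on a 2-D double integrator (toy model)."""
    def step(carry, u):
        pos, vel = carry
        vel = vel + DT * u
        pos = pos + DT * vel
        cost = jnp.sum((pos - TARGET) ** 2) + 1e-3 * jnp.sum(u ** 2)
        return (pos, vel), cost
    _, costs = jax.lax.scan(step, state, controls)
    return costs.sum()

@jax.jit
def mppi_update(key, state, nominal):
    """One MPPI iteration: sample, evaluate in parallel, reweight."""
    noise = 2.0 * jax.random.normal(key, (N_SAMPLES,) + nominal.shape)
    costs = jax.vmap(rollout_cost, in_axes=(None, 0))(state, nominal + noise)
    weights = jax.nn.softmax(-costs / TEMPERATURE)  # low cost -> high weight
    return nominal + jnp.einsum("k,kij->ij", weights, noise)

state = (jnp.zeros(2), jnp.zeros(2))   # position and velocity
nominal = jnp.zeros((HORIZON, 2))      # initial control sequence
key = jax.random.PRNGKey(0)

cost_before = rollout_cost(state, nominal)
for _ in range(5):
    key, sub = jax.random.split(key)
    nominal = mppi_update(sub, state, nominal)
cost_after = rollout_cost(state, nominal)
print(float(cost_before), float(cost_after))
```

In a closed-loop controller, the first action of `nominal` would be applied and the sequence shifted forward at every control tick. That receding-horizon part is omitted here.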

Sim-to-real is the hard part

Fast simulation is only valuable if the simulated drone teaches the right lesson.

Crazyflow puts real-to-sim modeling close to the center of the paper. The authors validate the simulator against real Crazyflie-family drones and compare the sim-to-real gap with gym_pybullet_drones and CrazySim. For the Crazyflie 2.1 first-principles model, the paper reports centimeter-level root mean square position error on circle and Lissajous trajectories. On the 10-second Lissajous task, Crazyflow reports 10.7 +/- 2.7 mm error, compared with 20.4 +/- 1.5 mm for gym_pybullet_drones and 62.1 +/- 1.3 mm for CrazySim.
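One common way to compute a root mean square position error between a simulated and a recorded trajectory is sketched below. The article does not say how the paper aligns trajectories or derives the +/- values, so this definition is an assumption. The data is synthetic: the "real" trace is the simulated one shifted by 10 mm.

```python
import jax.numpy as jnp

def rms_position_error(p_sim, p_real):
    """RMS of the Euclidean position error between two (T, 3) trajectories, in metres."""
    err = jnp.linalg.norm(p_sim - p_real, axis=1)
    return jnp.sqrt(jnp.mean(err ** 2))

t = jnp.linspace(0.0, 10.0, 1001)
# Synthetic Lissajous-like path at a constant height of 1 m.
p_sim = jnp.stack([jnp.sin(t), jnp.sin(2.0 * t), jnp.ones_like(t)], axis=1)
# Stand-in for a recorded flight: the same path offset by 10 mm along x.
p_real = p_sim + jnp.array([0.010, 0.0, 0.0])

rmse_mm = 1000.0 * rms_position_error(p_sim, p_real)
print(f"RMS position error: {float(rmse_mm):.1f} mm")
```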

The abstracted model also performs well. The paper reports sim-to-real error in the centimeter range across Crazyflie variants and a custom 660 g drone, with the model fit from a short flight dataset.

Many drone simulators quietly break down at this point. They look plausible, but a controller trained inside them can learn details that do not exist on the physical aircraft. Crazyflow's sim-to-real results are what make the speed claims useful rather than merely impressive.

Why differentiability changes the workflow

Differentiable simulation is often described in abstract terms, but the practical value is straightforward -- gradients make some searches less wasteful.

A sampling-only optimizer has to discover which changes helped by trying many rollouts. That approach is useful, especially for discontinuous costs or messy environments. A differentiable simulator adds another tool: exact sensitivity information. The optimizer can see how a small change in policy, action, or model parameter affects future behavior.

Crazyflow uses that capability in backpropagation through time. The paper trains trajectory-tracking policies with analytical gradients and reports real-world deployment without domain randomization. On the slow Lissajous tracking task, a BPTT policy reaches 8.7 +/- 0.1 mm tracking error after 1.56 seconds of training. PPO also performs well, reaching 15.9 +/- 0.2 mm after 1.61 seconds.
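The mechanics of backpropagation through time can be sketched on a much smaller problem: a two-parameter feedback policy, a toy 1-D model with gravity assumed to be compensated, and plain gradient descent on a loss accumulated over the rollout. None of the names, gains, or hyperparameters come from the paper, and the paper's policies and training setup are not reproduced here.

```python
import jax
import jax.numpy as jnp

DT, HORIZON, LEARNING_RATE = 0.02, 100, 0.3  # illustrative values

def tracking_loss(gains, z0):
    """Mean squared height error of a linear policy rolled out for HORIZON steps."""
    def step(carry, _):
        z, vz = carry
        u = gains[0] * (1.0 - z) - gains[1] * vz  # policy: PD-style feedback
        vz = vz + DT * u
        z = z + DT * vz
        return (z, vz), (z - 1.0) ** 2
    init = (z0, jnp.zeros_like(z0))
    _, errs = jax.lax.scan(step, init, None, length=HORIZON)
    return errs.mean()

def batch_loss(gains, z0_batch):
    """Average the rollout loss over a batch of start heights."""
    return jax.vmap(tracking_loss, in_axes=(None, 0))(gains, z0_batch).mean()

@jax.jit
def train_step(gains, z0_batch):
    # value_and_grad backpropagates through every step of every rollout.
    loss, grad = jax.value_and_grad(batch_loss)(gains, z0_batch)
    return gains - LEARNING_RATE * grad, loss

gains = jnp.array([1.0, 1.0])
z0_batch = jnp.linspace(0.0, 0.5, 64)
initial_loss = batch_loss(gains, z0_batch)
for _ in range(300):
    gains, loss = train_step(gains, z0_batch)
final_loss = batch_loss(gains, z0_batch)
print(float(initial_loss), float(final_loss), gains)
```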

The result should stay in context. It does not mean every drone policy can be trained in seconds. The task, platform, model, and lab setup matter. It does show how quickly the workflow changes when simulation speed and gradient access arrive together.

The 0.38 second throw demo

The thrown-drone recovery is the experiment people will remember.

The authors throw a physical drone, estimate the takeover state, train a motor-level stabilization policy in Crazyflow using BPTT, transfer the policy, and stabilize the drone before it hits the ground. Training takes about 0.38 seconds. Policy transfer takes about 48 ms.

The demo puts pressure on the usual train-then-deploy workflow. In a normal robotics setup, training happens ahead of time and deployment tests whether the policy generalizes. Here, the simulator trains for a specific observed state while the event is unfolding.

There are caveats. The paper notes that success is constrained by the physical room and throw height. The policy is trained from the predicted takeover state, not from a magical full-world understanding. It is still a controlled demonstration.
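A back-of-the-envelope calculation shows why throw height matters. The sketch assumes the clock starts at the top of the throw with zero vertical velocity and ignores drag. Both are simplifying assumptions made here, not details from the paper. Only the 0.38 s and 48 ms figures are the reported values.

```python
G = 9.81            # m/s^2
T_TRAIN = 0.38      # s, training time reported in the paper
T_TRANSFER = 0.048  # s, policy transfer time reported in the paper

def fall_distance(t, v0_down=0.0):
    """Distance fallen after t seconds, ignoring drag (assumption)."""
    return v0_down * t + 0.5 * G * t * t

latency = T_TRAIN + T_TRANSFER
drop = fall_distance(latency)
print(f"latency {latency:.3f} s -> about {drop:.2f} m of free fall from rest")
# prints: latency 0.428 s -> about 0.90 m of free fall from rest
```

Under these assumptions the drone would lose close to a metre of height before the new policy starts acting, and it still needs room below that to arrest the fall.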

The implication is still sharp. An accurate, differentiable, fast simulator can become part of online recovery, adaptation, or replanning.

Where this fits with Nimbus and DroneForge

For DroneForge builders, Crazyflow points toward a tighter autonomy workflow.

Nimbus and DF1 are about making real drone behavior accessible: video, telemetry, route planning, object tracking, control logic, and repeatable field tests. A simulator like Crazyflow does not replace that hardware loop. It can make the loop tighter.

A practical developer workflow might look like this:

  1. Prototype a controller or policy in simulation.
  2. Stress-test it across many randomized worlds.
  3. Use real flight logs to identify or refine a model.
  4. Re-run the simulation with the updated dynamics.
  5. Deploy the candidate behavior through the real drone stack.
  6. Compare telemetry, video, and route performance against the simulated expectation.

The bridge is the feedback loop. Simulation gives speed and scale. Hardware gives truth. The better the loop, the faster a developer can move from an idea to tested flight behavior.

Crazyflow is also relevant for swarms. DroneForge users may not all be flying thousand-drone formations, but the same scaling ideas apply to search patterns, coverage planning, repeated inspection routes, and multi-agent task allocation. If a simulator can cheaply test many drones across many worlds, developers can explore more ambitious behavior before taking it outside.

What Crazyflow is not

Crazyflow is not a complete answer to every drone simulation problem.

The paper is focused on dynamics, control, differentiability, and large-scale parallelism. It includes depth rendering and collision-related capabilities through MJX, but photorealistic camera simulation is listed as a future direction. If a project needs high-end visual realism, weather, RF modeling, battery aging, detailed sensor artifacts, or full autopilot stack emulation, Crazyflow may need to be paired with other tools.

The strongest results are also tied to the platforms and setups studied in the paper. Crazyflie-family support is a major strength. Custom drones are supported through system identification, but that still requires good data, a tuned real platform, and careful validation.

There is also a practical limitation around compilation. JAX can be extremely fast once a function is compiled, but compilation time and memory behavior matter in real projects. The paper's raw throughput numbers are best understood as optimized execution performance, not as a guarantee that every developer script will feel instant.
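The gap between first-call and steady-state time is easy to measure in one's own scripts. The sketch below times a toy jitted function twice, using `block_until_ready()` so that asynchronous dispatch does not hide the real cost. The function and array sizes are placeholders, and the absolute numbers will vary by machine.

```python
import time
import jax
import jax.numpy as jnp

@jax.jit
def simulate(x):
    """Placeholder workload: 1000 cheap elementwise steps inside lax.scan."""
    def step(carry, _):
        return carry + 0.01 * jnp.sin(carry), None
    out, _ = jax.lax.scan(step, x, None, length=1000)
    return out

x = jnp.ones((10_000,))

t0 = time.perf_counter()
first = simulate(x).block_until_ready()   # includes tracing and XLA compilation
t1 = time.perf_counter()
second = simulate(x).block_until_ready()  # reuses the compiled executable
t2 = time.perf_counter()

print(f"first call  {t1 - t0:.4f} s (compile + run)")
print(f"second call {t2 - t1:.4f} s (run only)")
```

Calling the function with a new input shape or dtype triggers another compilation, which is worth keeping in mind when batch sizes change between experiments.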

Those limits keep the claim in the right category. Crazyflow is a serious simulator for drone dynamics, learning, control, and swarms. It is not a universal digital twin for every aerial robotics condition.

What builders should watch

The next question is adoption.

If Crazyflow stays easy to install, easy to extend, and easy to connect to real flight logs, it could become a strong default for drone RL and differentiable control research. The open-source angle matters. The paper says the project contributes back to SciPy, Crazyflie firmware, Gymnasium, and array API tooling. Simulator ecosystems live or die by maintenance, examples, and community use.

The highest-leverage future additions would be broader controller support, especially PX4, Betaflight, and ArduPilot-style pipelines; richer camera simulation; better examples for custom drone identification; and clear recipes for moving from logs to validated models.

For teams building drone autonomy products, the lesson is already clear. The simulator is no longer just a mock environment. It is becoming part of the development engine.

Bottom line

Crazyflow is worth reading because it connects several trends that usually get discussed separately: GPU batching, differentiable programming, sim-to-real accuracy, open-source drone models, reinforcement learning, model predictive control, and swarms.

The paper's best contribution is the practical combination. It shows a drone simulator with enough throughput for massive rollouts, enough fidelity for real deployment on Crazyflie-class hardware, and enough structure for gradient-based training and optimization.

For drone autonomy work, that combination changes what feels reasonable to try. Experiments that used to be too slow, too expensive, or too risky can move earlier into the development cycle.

Research context

The DroneForge research section collects practical notes for builders who want to connect drone autonomy ideas to real hardware. Topics may include perception, tracking, mission planning, route replay, benchmarks, datasets, and lessons from operating Nimbus with DF1 in repeatable field workflows.

These notes are written for developers who need more than abstract robotics theory. The goal is to connect papers, experiments, and field observations to concrete Nimbus App and Python Library workflows that can be tested with video, telemetry, commands, and route planning tools.

As this section grows, each research entry will point builders toward the assumptions, constraints, and practical tradeoffs behind real autonomy experiments. That context helps teams decide what to prototype, what to measure, and how to evaluate progress.
