FPGA Acceleration of Likelihood Model based Adaptive Monte-Carlo Localization

Adaptive Monte Carlo Localization (AMCL) is a widely used algorithm in robotics for estimating a robot’s position within its environment. It relies on a probabilistic approach, utilizing a set of particles to represent different pose hypotheses. However, AMCL's computational demands make real-time execution challenging, especially on embedded systems.

In this project, I explored FPGA acceleration to improve the efficiency of AMCL’s particle filtering algorithm, particularly the likelihood model-based weight computation. The goal was to achieve better performance while ensuring compatibility with the ROS 2 framework.

The Problem

AMCL's computational complexity arises from continuously updating particle weights based on sensor data. This process involves:

On embedded platforms, such as those used in mobile robots, CPU-based execution often fails to meet real-time constraints. To overcome this, hardware acceleration using FPGAs offers a promising solution by leveraging parallelism and pipelining techniques.
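To make the cost concrete, here is a minimal C++ sketch of a per-particle weight update under a textbook likelihood-field model. It is not the project's kernel: the types (`Pose`, `Beam`, `DistanceMap`, `SensorParams`), the function names, and the parameter names are hypothetical and chosen for illustration. It assumes a precomputed distance-to-nearest-obstacle grid and accumulates weights in log space.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not taken from the project's code.
struct Pose { double x, y, theta; };      // particle pose in the map frame
struct Beam { double range, bearing; };   // bearing is relative to the robot heading

// Assumed sensor-model parameters (names chosen for this sketch).
struct SensorParams {
    double z_hit;      // weight of the Gaussian "hit" component
    double z_rand;     // weight of the uniform random component
    double sigma_hit;  // standard deviation of the hit component, in metres
    double range_max;  // maximum sensor range, in metres
};

// Precomputed distance to the nearest obstacle for every cell (row-major).
struct DistanceMap {
    std::size_t width, height;
    double resolution;           // metres per cell
    double max_dist;             // distance reported outside the map
    std::vector<double> cells;

    double lookup(double wx, double wy) const {
        if (wx < 0.0 || wy < 0.0) return max_dist;
        const auto cx = static_cast<std::size_t>(wx / resolution);
        const auto cy = static_cast<std::size_t>(wy / resolution);
        if (cx >= width || cy >= height) return max_dist;
        return cells[cy * width + cx];
    }
};

// Log-weight of a single particle: sum of per-beam log-likelihoods.
double log_weight(const Pose& p, const std::vector<Beam>& scan,
                  const DistanceMap& map, const SensorParams& s) {
    double log_w = 0.0;
    // Iterations are independent except for the accumulation into log_w,
    // which is what makes this loop a natural candidate for pipelining.
    for (const Beam& b : scan) {
        if (b.range >= s.range_max) continue;  // ignore max-range readings
        // Project the beam endpoint into the map frame.
        const double ex = p.x + b.range * std::cos(p.theta + b.bearing);
        const double ey = p.y + b.range * std::sin(p.theta + b.bearing);
        const double d = map.lookup(ex, ey);
        // Gaussian hit term plus a uniform floor; z_rand > 0 keeps pz positive.
        const double pz = s.z_hit * std::exp(-(d * d) / (2.0 * s.sigma_hit * s.sigma_hit))
                        + s.z_rand / s.range_max;
        log_w += std::log(pz);
    }
    return log_w;
}

// Log-weights for the whole particle set: particles x beams evaluations.
std::vector<double> compute_log_weights(const std::vector<Pose>& particles,
                                        const std::vector<Beam>& scan,
                                        const DistanceMap& map,
                                        const SensorParams& s) {
    std::vector<double> out;
    out.reserve(particles.size());
    for (const Pose& p : particles) out.push_back(log_weight(p, scan, map, s));
    return out;
}
```

The work grows with the number of particles times the number of beams, and every evaluation involves two trigonometric calls, a map lookup, and an exponential. On a CPU these run largely one after another, which is where the real-time budget tends to go.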

To accelerate AMCL, I implemented a pipelined architecture for likelihood-based weight computation on the Kria KV260 FPGA platform. Using High-Level Synthesis (HLS), I designed a hardware kernel optimized for:

Results

Benchmarking the FPGA implementation against a CPU-based version of AMCL revealed a significant performance boost:

These results indicate that FPGA acceleration can enhance real-time performance for robotics localization tasks, making it a viable option for embedded systems with strict computational constraints.

My experience with HLS

This project was by far my most extensive experience with High-Level Synthesis (HLS). I found it to be a powerful tool for designing custom hardware accelerators, especially for computationally intensive tasks like AMCL. HLS allowed me to:

Of course, no technology is perfect. Here are some of the pain points I encountered while using HLS:

All in all, it was a fun experience. I loved being able to prototype designs with ease, but RTL is still the way to go for more complex designs. I’d love to see an RTL workflow that is as streamlined as HLS for initial kernel setup.

Read the whole thing

If you are interested in the details of the implementation, the design choices, and the results, you can read the full paper here. Feel free to reach out to me if you have any questions or suggestions regarding the project. I am always open to feedback and discussions. Thanks to the team at Acceleration Robotics for their support and guidance throughout this project.