From Centralized Brains to Edge Intelligence: Rethinking Compute Architectures for Autonomous Mobile Robots
Autonomous mobile robots (AMRs) are self-navigating robotic systems built to operate without human intervention in environments such as warehouse logistics. They rely entirely on onboard sensors, real-time processing, and AI to interpret their surroundings and make decisions autonomously – and their compute needs have outgrown the traditional ‘one big CPU’ mindset.
For decades, the brains of AMRs followed a familiar pattern: route all sensor data – camera feeds, laser range readings, inertial measurements, and more – into one powerful central processor. That processor would handle everything from SLAM (Simultaneous Localization and Mapping), which builds a map of the robot’s environment while simultaneously estimating the robot’s position within it, to obstacle avoidance and motion control.
This model worked well in lab prototypes and early deployments, but as AMRs scaled into fleets and made their way into the real world, the single-CPU approach started to show its limitations. High latency, inefficient power use, and compute bottlenecks are all symptoms of a centralized design trying to do too much.
Today’s AMRs are navigating unpredictable warehouse aisles, adapting to real-time sensor feedback, and running machine learning inference on the fly – meaning developers can ill afford inefficient systems. This is why developers are increasingly moving away from centralized architectures towards distributed compute architectures that push perception and control functions closer to the sensors themselves.
Moving on from centralized architectures
In a centralized model, real-time tasks must compete for processing time on the same CPU. This not only increases latency but also reduces determinism – an issue for tasks like motor control that require sub-millisecond responsiveness.
Scaling also becomes inefficient. Doubling the number of sensors or actuators often means doubling compute load in one place. Over-provisioning for worst-case scenarios bloats system cost and power use, while under-provisioning risks performance drops under load. These trade-offs are especially acute in battery-powered platforms where every watt counts.
Moreover, centralized designs are harder to modularize. Adding a new sensor or updating a subsystem often requires requalifying the central firmware stack and rebalancing compute resources.
The rise of edge intelligence
By embracing distributed compute architectures and offloading preprocessing and inference tasks to embedded processors or microcontrollers within subsystems – such as vision modules, LiDAR arrays, and motor controllers – robots gain speed, efficiency, and modular scalability.
This evolution isn't just about swapping CPUs. It’s a fundamental rethink of system architecture that aligns hardware and software more closely with real-world demands. Perception is just one area that could benefit. A traditional AMR might stream high-resolution RGB or depth images to a central processor, which then runs object recognition or SLAM algorithms. This incurs heavy data transmission costs and latency.
Edge compute nodes can handle feature extraction, depth estimation, and sometimes even AI inference locally, delivering compact semantic data rather than raw imagery. This shift means the central processor doesn’t need to be a power-hungry beast. It can focus on high-level planning and coordination, while edge nodes perform time-critical, localized operations. The result? Faster reactions, lower energy consumption, and fewer thermal constraints.
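To make the bandwidth argument concrete, the sketch below compares the size of one raw RGB-D frame against the compact detection records an edge vision node might publish instead. The struct layout, resolution, and detection count are illustrative assumptions, not figures from any particular platform.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical compact detection record an edge vision node might emit
// instead of a raw frame (names and fields are illustrative).
struct Detection {
    uint16_t class_id;     // object class index
    float    confidence;   // detection score, 0..1
    float    x, y, z;      // object position in the camera frame, metres
};

int main() {
    // A 1280x720 RGB-D frame: 3 bytes of colour + 2 bytes of depth per pixel.
    const std::size_t raw_frame_bytes = 1280u * 720u * (3u + 2u);

    // A typical scene summary: a handful of detections per frame.
    std::vector<Detection> detections(8);
    const std::size_t semantic_bytes = detections.size() * sizeof(Detection);

    std::printf("raw frame:       %zu bytes\n", raw_frame_bytes);
    std::printf("semantic output: %zu bytes (~%.4f%% of raw)\n",
                semantic_bytes,
                100.0 * semantic_bytes / raw_frame_bytes);
}
```

Even with generous headroom for metadata, the semantic output is several orders of magnitude smaller than the raw frame it summarizes, which is what frees the central processor and the interconnect.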
Distributed compute in practice
Distributed architectures are built around autonomy at the edge. In a typical setup, the AMR’s vision subsystem might include a stereo camera with an onboard processor that performs depth estimation and object detection locally. LiDAR (Light Detection and Ranging) systems might preprocess point clouds before sharing with the navigation engine. Inertial Measurement Units (IMUs) feed real-time pose updates directly to control loops, bypassing the central path entirely.
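As an example of the kind of point-cloud preprocessing a LiDAR subsystem might perform before handing data to the navigation engine, the sketch below implements a simple voxel-grid downsampling filter. The function name, key packing, and voxel size are illustrative; a production module would typically use an optimized library rather than this minimal version.

```cpp
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Point { float x, y, z; };

// Voxel-grid downsampling: keep one representative point (the cell centroid)
// per cubic cell of side `voxel`, shrinking the cloud before transmission.
std::vector<Point> voxel_downsample(const std::vector<Point>& cloud, float voxel) {
    struct Acc { float x = 0, y = 0, z = 0; uint32_t n = 0; };
    std::unordered_map<uint64_t, Acc> cells;

    for (const Point& p : cloud) {
        // Quantize coordinates to a cell index and pack into one 64-bit key.
        const auto ix = static_cast<int64_t>(std::floor(p.x / voxel));
        const auto iy = static_cast<int64_t>(std::floor(p.y / voxel));
        const auto iz = static_cast<int64_t>(std::floor(p.z / voxel));
        const uint64_t key = (static_cast<uint64_t>(ix & 0x1FFFFF) << 42) |
                             (static_cast<uint64_t>(iy & 0x1FFFFF) << 21) |
                              static_cast<uint64_t>(iz & 0x1FFFFF);

        Acc& a = cells[key];
        a.x += p.x; a.y += p.y; a.z += p.z; ++a.n;
    }

    std::vector<Point> out;
    out.reserve(cells.size());
    for (const auto& [key, a] : cells)
        out.push_back({a.x / a.n, a.y / a.n, a.z / a.n});
    return out;
}
```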
Key workloads such as SLAM, navigation planning, and obstacle avoidance still require integration, but they increasingly draw on already-processed data. This reduces central processing needs and simplifies integration across diverse hardware configurations.
Microcontrollers also play a bigger role. In motor control, for example, closed-loop feedback must be processed with microsecond-level determinism. By embedding real-time control in dedicated MCUs near the drive systems – often using industrial protocols like EtherCAT – developers gain tight motion precision without loading the main processor.
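To illustrate the kind of loop that stays on the MCU, the sketch below runs a fixed-rate PI velocity controller against a crude first-order motor model so it is self-contained on a desktop. The gains, control rate, and plant parameters are placeholders; on real hardware the loop body would be driven by a timer interrupt and talk to the motor driver, not to a simulated plant.

```cpp
#include <cstdio>

int main() {
    constexpr float dt       = 0.0001f; // 10 kHz control period, seconds
    constexpr float kp       = 2.0f;    // proportional gain (tuning placeholder)
    constexpr float ki       = 40.0f;   // integral gain (tuning placeholder)
    constexpr float tau      = 0.05f;   // assumed motor time constant, seconds
    constexpr float gain     = 100.0f;  // assumed rad/s at full duty
    constexpr float setpoint = 30.0f;   // commanded shaft velocity, rad/s

    float velocity = 0.0f;   // plant state: measured shaft velocity
    float integral = 0.0f;   // controller state

    for (int step = 0; step < 5000; ++step) {           // 0.5 s of simulated time
        // This is the deterministic work the MCU repeats every control tick.
        const float error = setpoint - velocity;
        integral += error * dt;
        float duty = (kp * error + ki * integral) / gain;
        if (duty > 1.0f)  duty = 1.0f;                   // clamp to actuator limits
        if (duty < -1.0f) duty = -1.0f;

        // First-order plant update standing in for the real motor and driver.
        velocity += (gain * duty - velocity) * (dt / tau);

        if (step % 1000 == 0)
            std::printf("t=%.2fs  velocity=%.2f rad/s\n", step * dt, velocity);
    }
}
```

The point of keeping this loop on a dedicated MCU is that nothing running on the main processor – a SLAM update, a planner replan, a logging burst – can delay the next control tick.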
Benefits beyond performance
Distributed compute offers several advantages beyond speed and responsiveness. Power efficiency is one. By activating only the processors needed for a task and allowing idle units to sleep, AMRs can significantly extend operational uptime, easing demands on charging infrastructure, stretching maintenance cycles, and lowering total cost of ownership.
Reliability is another benefit. Subsystems with local intelligence can continue operating even if the central processor is busy or faulty. A battery management system with its own MCU, for example, can still protect the battery pack during a software crash elsewhere. Likewise, safety-critical functions such as collision avoidance can also be isolated.
Distributed architectures also enable modularity. Developers can swap out subsystems with minimal impact on the rest of the stack. This accelerates iteration, simplifies reuse across platforms, and enables AMRs to be tailored for niche environments or tasks. Modularity also makes it easier to build more complex systems on top of the AMR platform, such as dual-arm robots.
The software challenge
Of course, edge intelligence isn’t without cost, and it introduces integration complexity. Distributing tasks means synchronizing multiple processors, managing interconnect latency, and maintaining software consistency across heterogeneous hardware.
Here, middleware like ROS 2 (Robot Operating System) provides a critical abstraction layer. ROS 2’s publish-subscribe model supports distributed systems natively, enabling sensor nodes, control units, and AI modules to communicate without tightly coupling their implementations.
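As a minimal illustration of that decoupling, the rclcpp node below subscribes to a hypothetical obstacle topic published by an edge vision node. The node name, topic name, and use of a plain string message are placeholders; a real system would define structured message types for its semantic data.

```cpp
#include <memory>
#include "rclcpp/rclcpp.hpp"
#include "std_msgs/msg/string.hpp"

// The subscriber only knows the topic name and message type, not which
// edge processor produced the data or how it was computed.
class ObstacleListener : public rclcpp::Node {
public:
  ObstacleListener() : Node("obstacle_listener") {
    subscription_ = create_subscription<std_msgs::msg::String>(
        "/perception/obstacles", rclcpp::SensorDataQoS(),
        [this](const std_msgs::msg::String::SharedPtr msg) {
          // React to semantic data published by an edge vision node.
          RCLCPP_INFO(get_logger(), "Obstacle update: '%s'", msg->data.c_str());
        });
  }

private:
  rclcpp::Subscription<std_msgs::msg::String>::SharedPtr subscription_;
};

int main(int argc, char** argv) {
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<ObstacleListener>());
  rclcpp::shutdown();
  return 0;
}
```

Because publishers and subscribers only agree on topics and message types, a vision module can be swapped for a different sensor head without touching the planner that consumes its output.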
To make it all work, software stacks must be co-designed with hardware constraints in mind. Developers must optimize inference models for resource-limited MCUs or NPUs, carefully manage thermal budgets, and ensure that latency-sensitive tasks remain deterministic.
Laying the foundations for more efficient robots
With AMRs designed to complete a huge range of tasks in differing environments, there’s naturally no one-size-fits-all answer. Centralized designs still have a role in low-complexity AMRs or cost-constrained deployments. A robot with minimal sensing or pre-programmed routes may benefit from a single, integrated compute platform, for example.
But for scalable, responsive AMRs operating in dynamic environments, the future is unmistakably at the edge. Engineers must weigh design trade-offs between latency, power, scalability, and integration complexity – choices that will be shaped by application and scale.
What is clear is that yesterday’s compute architectures, however elegant in their simplicity, no longer meet the demands of modern autonomy. From perception to actuation, distributed intelligence is becoming the foundation for agile, efficient, and mission-ready mobile robots.
Dr.-Ing. Nicolas Lehment leads NXP’s robotics team and oversees the industrial system innovation board. Before joining NXP, he designed cutting-edge computer vision and robotics systems for ABB and Smartray. He has collaborated on research papers on topics ranging from ML-driven video classification and human pose tracking to collaborative robotics. This academic work earned him a doctoral degree from the Technische Universität München.