The Gap Between Screen Space and Physical Space
To understand why spatial AI matters, consider what it actually means for a machine to understand physical space. Humans navigate physical environments effortlessly because we have evolved sensory systems — vision, proprioception, depth perception, spatial memory — that give us a continuous, three-dimensional understanding of our surroundings. We know without thinking where we are in a room, how far we are from objects, what's behind us, and how to get from here to there.
For a robot or AI system, none of this is given. A camera provides a 2D image. LiDAR provides a point cloud. Other sensors provide raw readings. Turning all of that into a coherent, actionable understanding of "I am at position X in this known space, the item I need is at position Y, and the optimal path between them is Z" requires a spatial intelligence layer that most AI systems currently lack or handle very poorly in novel environments.
The consequence is that even the most capable AI models, the ones that can write code, analyze complex documents, and reason through difficult problems, cannot directly tell a robot to "go to aisle 7, shelf 3, third item from the left" in a physical space they haven't been specifically programmed for. Spatial AI aims to supply that capability.
What Is Spatial AI?
Spatial AI encompasses the technologies and systems that give machines the ability to understand, represent, and reason about physical space in three dimensions. It draws on several component capabilities: localization (knowing precisely where you are in a space), mapping (building a representation of the space), object recognition (identifying what things are and where they are), and spatial reasoning (understanding relationships between locations and planning paths through space).
SLAM (Simultaneous Localization and Mapping) has been the traditional robotics approach: a robot builds a map of its environment as it moves through it while simultaneously estimating its own position within that map. This works, but it is slow (every robot in every new environment starts from scratch), computationally expensive, and produces maps that aren't easily shared between robots or systems.
The newer generation of spatial AI takes a different approach: building persistent, shareable spatial representations that any compatible device can access — the equivalent of downloading a map rather than generating one from scratch. This is the core insight behind digital twins and spatial computing platforms like Auki's posemesh.
Auki Labs and the Posemesh: A New Nervous System for Physical Space
Auki Labs, founded in 2019 and headquartered in Hong Kong, has built what it describes as the "posemesh", a decentralized network designed to function as a shared spatial intelligence layer for the physical world. The name combines "pose" (a device's position and orientation in space, that is, its 3D coordinates plus the direction it is facing) with "mesh" (a network connecting many nodes).
The posemesh works by allowing devices (smartphones, robots, AR headsets, IoT sensors) to collectively build and share precise 3D spatial maps of physical environments. These maps, called "domains," are created using data captured by smartphone sensors (a deliberate design choice that makes map creation accessible without specialized hardware) and stored in a precise X, Y, Z coordinate system that any compatible device can reference.
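To make the idea of a pose concrete, here is a minimal Python sketch. It is not Auki's data format: the Pose class, its field names, and the to_map_frame method are hypothetical, and orientation is reduced to a single heading angle (yaw), which is a simplification that only suits a device moving on a flat floor. The point is simply that position plus orientation is what lets a device turn "something is 1 m in front of me" into coordinates in a shared map.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """Position (metres) plus heading (yaw, radians) in a shared map frame.

    Simplification: a full 3D pose has three rotational degrees of freedom;
    this sketch keeps only yaw.
    """
    x: float
    y: float
    z: float
    yaw: float

    def to_map_frame(self, local_point):
        """Convert a point expressed in the device's own frame into map coordinates."""
        lx, ly, lz = local_point
        cos_yaw, sin_yaw = math.cos(self.yaw), math.sin(self.yaw)
        return (
            self.x + cos_yaw * lx - sin_yaw * ly,
            self.y + sin_yaw * lx + cos_yaw * ly,
            self.z + lz,
        )


# A robot at (2, 3, 0) facing along +y (yaw = 90 degrees) sees a box 1 m straight ahead.
robot = Pose(x=2.0, y=3.0, z=0.0, yaw=math.pi / 2)
box_in_map = robot.to_map_frame((1.0, 0.0, 0.0))
print(tuple(round(c, 6) for c in box_in_map))  # (2.0, 4.0, 0.0)
```

Without the orientation part, the same observation could place the box anywhere on a 1 m circle around the robot, which is why position alone is not enough.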
The practical applications are immediately apparent. A robot entering a warehouse for the first time can download the warehouse's domain — a pre-built 3D map with the coordinates of every shelf, aisle, and pick location — and immediately navigate it with precision, rather than spending time mapping the environment from scratch. An AR-equipped warehouse worker receives navigation instructions based on precise spatial coordinates rather than approximate verbal directions. A fleet of robots can share spatial awareness with each other in real time, coordinating movement without collision.
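As a rough illustration of why a pre-built map helps, the Python sketch below plans a route over a small occupancy grid that stands in for a downloaded domain. The grid, the shortest_path function, and the cell coordinates are all assumptions made for this example; a real domain would be a far richer 3D representation, and breadth-first search is just one simple planner among many.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search over a 2D occupancy grid (0 = free, 1 = blocked).

    Returns the list of (row, col) cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


# Hypothetical pre-built map: two shelving rows (1s) with aisles around them.
warehouse_map = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
route = shortest_path(warehouse_map, start=(0, 0), goal=(4, 4))
print(len(route) - 1, "moves")  # 8 moves
```

Because the map already exists, planning can start the moment the robot knows its own cell; nothing has to be explored first.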
Auki has described this as transforming smartphones into "robots without legs" — devices that can navigate, reason about, and interact with real-world physical layouts in ways that weren't possible before. In retail, they've demonstrated systems where shelf locations are translated into X, Y, Z spatial coordinates, so a robot or AR worker can navigate directly to a specific item without any ambiguity about where it is.
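A small sketch shows what translating a shelf location into coordinates might look like. Everything here is hypothetical: the ShelfLayout class, its dimensions, and the assumption of a perfectly regular store are ours, not Auki's. In a real deployment the coordinates would come from the mapped space rather than from a formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShelfLayout:
    """Hypothetical, perfectly regular store layout; all dimensions in metres."""
    aisle_spacing: float = 3.0   # distance between neighbouring aisles along x
    slot_width: float = 0.25     # width of one item facing along y
    shelf_height: float = 0.4    # vertical distance between shelf levels

    def locate(self, aisle: int, shelf: int, slot: int):
        """Map a 1-based (aisle, shelf, slot) address to the slot's centre (x, y, z)."""
        if min(aisle, shelf, slot) < 1:
            raise ValueError("aisle, shelf and slot are 1-based")
        x = (aisle - 1) * self.aisle_spacing
        y = (slot - 0.5) * self.slot_width   # centre of the slot, counted from the left
        z = (shelf - 1) * self.shelf_height
        return (x, y, z)


layout = ShelfLayout()
# "Aisle 7, shelf 3, third item from the left" becomes a single point in space.
print(layout.locate(aisle=7, shelf=3, slot=3))
```

Once an address resolves to a point like this, a robot and an AR headset can both be sent to the same spot with no verbal description in between.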
The Decentralization Advantage
Auki's choice to build on a decentralized architecture rather than a centralized cloud service is deliberate and significant. Centralized spatial mapping (the approach taken by Google Maps, for instance) typically involves uploading camera data to central servers, which creates privacy concerns, bandwidth requirements, and latency issues that can make it unsuitable for many real-time applications.
The posemesh allows devices to exchange spatial data directly with each other, in encrypted form, without sharing camera feeds with centralized entities. A warehouse can build and maintain a spatial map of its facility without that spatial data leaving the facility's network. A hospital can have precise spatial mapping of its floor plans without uploading that data to external servers. This privacy-by-design architecture makes spatial AI deployable in environments where centralized cloud approaches would face prohibitive security or compliance barriers.
There's also a latency advantage: real-time robot navigation and AR applications require responses in milliseconds, not the seconds that cloud round-trips can introduce. Edge-based spatial computing sharply reduces that latency by keeping computation close to the devices that need it.
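A quick back-of-the-envelope calculation shows why this matters. The speed and latency figures in the Python snippet below are illustrative assumptions, not measurements of any particular system; the function name is ours.

```python
def distance_travelled_blind(speed_m_per_s: float, latency_ms: float) -> float:
    """Distance (metres) a robot covers while waiting for one positioning response."""
    return speed_m_per_s * (latency_ms / 1000.0)


# Illustrative numbers only: a 1.5 m/s robot with a 10 ms local response
# versus a 500 ms remote round trip.
for label, latency_ms in (("local", 10), ("remote", 500)):
    metres = distance_travelled_blind(1.5, latency_ms)
    print(f"{label}: {metres * 100:.1f} cm travelled before the answer arrives")
```

Under these assumptions the robot moves 1.5 cm in the local case and 75 cm in the remote one, which is the difference between stopping at a shelf and driving past it.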
Digital Twins: The Parallel Approach
Alongside Auki's network approach, digital twin technology represents another path to spatial AI. A digital twin is a precise virtual replica of a physical environment — a 3D model that reflects the real-world space in enough detail that simulations, planning, and operational decisions made in the digital twin accurately reflect what would happen in the physical space.
Digital twins have been used in industrial settings for years — aerospace companies use them to simulate manufacturing processes, energy companies use them to model power plant operations, urban planners use them to model city systems. What's changing in 2026 is the combination of digital twins with AI that can reason about and interact with the spatial model.
Rather than just visualizing a factory layout in 3D, an AI system connected to a factory's digital twin can optimize robot routing in real time, identify bottlenecks before they occur, simulate the effects of layout changes before physically moving equipment, and coordinate multiple autonomous systems across the space. NVIDIA's Omniverse platform is one of the most prominent attempts to build an industrial-grade digital twin infrastructure: a physics-accurate simulation environment that robotics companies use to train robot behaviors before deploying them in the real world.
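The "simulate a layout change before moving equipment" idea can be sketched in a few lines of Python. This is a toy model, not how Omniverse or any real digital twin works: the station names, coordinates, visit rates, and the expected_trip_length function are all invented for illustration, and travel is approximated with Manhattan distance.

```python
def expected_trip_length(dock, stations, visits_per_hour):
    """Total metres per hour for round trips from the dock to each station.

    Uses Manhattan distance, a rough approximation for aisle-based floors.
    """
    total = 0.0
    for name, (x, y) in stations.items():
        one_way = abs(x - dock[0]) + abs(y - dock[1])
        total += 2 * one_way * visits_per_hour[name]
    return total


dock = (0.0, 0.0)
visits_per_hour = {"press": 12, "paint": 3}  # hypothetical traffic

current_layout = {"press": (40.0, 10.0), "paint": (5.0, 5.0)}
proposed_layout = {"press": (5.0, 5.0), "paint": (40.0, 10.0)}  # swap the two stations

for label, layout in (("current", current_layout), ("proposed", proposed_layout)):
    print(label, expected_trip_length(dock, layout, visits_per_hour), "m/hour")
```

With these made-up numbers, moving the busy station closer to the dock cuts hourly travel from 1260 m to 540 m, and that answer is available before anyone touches a forklift.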
The Connection to Embodied AI
Spatial AI and embodied AI are deeply connected. In fact, spatial intelligence is one of the core missing capabilities that keeps today's embodied AI from achieving reliable real-world deployment. A robot that can't precisely understand where it is in space, where objects are relative to it, and how to navigate to them is limited to environments it has been explicitly programmed for.
Spatial computing infrastructure like Auki's posemesh is effectively addressing one of the key challenges in the embodied AI stack — not by making robots smarter, but by making the environment machine-readable. If every physical space has a precise, downloadable spatial map, robots don't need to spend computational resources solving the localization and mapping problem from scratch every time they enter a new environment. They can focus those resources on the actual task.
This is why Auki appears in our overview of the embodied AI ecosystem as an infrastructure enabler rather than a robot manufacturer — it's building the spatial layer that all the robot companies need, regardless of which physical form their robots take.
Real-World Deployments: Where This Is Already Working
Healthcare robotics navigation. Hospitals are deploying robots for medication delivery, equipment transport, and supply runs. These robots need to navigate complex, dynamic environments: floors that change configuration, humans moving unpredictably, elevators, access-controlled areas. Precise spatial mapping can substantially reduce the navigation failures that made early hospital robot deployments frustrating to maintain.
Retail and warehouse operations. The combination of AR-equipped workers with spatial mapping can produce measurable efficiency gains in pick-and-pack operations, inventory management, and restocking. Workers receive precise navigation to the exact location of items rather than approximate directions, reducing pick errors and training time for new employees.
Construction and facility management. Digital twin-based spatial intelligence is being used to track equipment location, optimize worker routing, monitor construction progress against plans, and manage the complex logistics of large construction sites where dozens of teams and hundreds of pieces of equipment operate simultaneously.
Augmented reality applications. Precise spatial anchoring — the ability to place digital information at exact physical locations that remain stable as the user moves — requires exactly the kind of persistent spatial mapping that technologies like Auki's posemesh provide. Without it, AR overlays drift and fail to remain attached to their intended physical reference points.
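To get a feel for how sensitive anchoring is to pose accuracy, the Python sketch below estimates how far an overlay moves when the device's heading estimate is slightly off. The function name and the example numbers are assumptions for illustration; the geometry is the standard chord length for a rotation about the viewer, and it ignores position error entirely.

```python
import math


def overlay_drift_m(distance_to_anchor_m: float, yaw_error_deg: float) -> float:
    """Displacement of a rendered anchor caused by a heading (yaw) error.

    Rotating the view by an angle t about the device moves a point at distance d
    along a chord of length 2 * d * sin(t / 2).
    """
    half_angle = math.radians(yaw_error_deg) / 2
    return 2 * distance_to_anchor_m * abs(math.sin(half_angle))


for yaw_error in (0.0, 1.0, 5.0):
    drift_cm = overlay_drift_m(5.0, yaw_error) * 100
    print(f"{yaw_error} deg heading error at 5 m -> {drift_cm:.1f} cm of drift")
```

Even a one-degree heading error shifts a label 5 m away by nearly 9 cm, enough to pin it to the wrong product on a shelf.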
The Long-Term Vision
Auki describes its goal as becoming "the decentralized nervous system of AI in the physical world, providing collaborative spatial reasoning for the next 100 billion devices on Earth and beyond." That's an ambitious statement, but the underlying logic is sound: as AI-enabled physical devices proliferate — robots, autonomous vehicles, AR systems, smart sensors — the need for a shared spatial intelligence infrastructure becomes more acute, not less.
The analogy to GPS is instructive. Before GPS, navigation required individual effort — paper maps, asking for directions, local knowledge. GPS created a shared spatial reference system that any device could access, transforming navigation from a skill into an infrastructure service. Spatial AI is attempting something similar for the machine-readable understanding of physical space — moving from each robot solving the spatial understanding problem individually to a shared infrastructure that any device can use.
Whether Auki's posemesh, NVIDIA's Omniverse, or some other approach ultimately becomes the dominant spatial AI infrastructure remains to be seen. But the need these approaches address is real and growing, and the companies working on it are tackling one of the most consequential technical challenges in the entire AI landscape.
For more on the companies racing to solve the physical AI challenge, see our full overview of the embodied AI landscape. For the broader context of specialized AI beyond the physical world, see our showcase of the most impressive industry-specific AI tools.