Autonomous Rover
experimental
A mecanum-wheeled rover built to test one question: can an LLM fuse multiple camera feeds to navigate an indoor space and skip the usual SLAM and LIDAR stack entirely?
My rover is a 4WD robot I designed and built — but really it’s a test bench for a bigger question: how much of the traditional autonomy stack can an LLM replace?
Most indoor robots navigate with SLAM (simultaneous localization and mapping) on ROS2. I want to find out if I can skip that path entirely and go straight to autonomous navigation — using cameras, an LLM, and the spatial reasoning that already lives in CorTex.
The inspiration
I saw a robot in a hospital making a scheduled drop-off of samples to the lab. Hospitals are busy, chaotic places, with hallways full of people, carts, and commotion, and this thing was threading through all of it, safely and on schedule. It stuck with me: how does it pull that off?
The question
Indoor robots usually lean on SLAM. But building a map of a space isn’t the same as understanding it. Self-driving has split into two camps chasing the harder version of this problem: LIDAR (measure the world in 3D) and vision (read the world the way a person does). Tesla famously bet on vision; others bet on LIDAR; some use both.
So here’s my version: can a rover skip the typical ROS2 / SLAM path and go straight to autonomous navigation — with an LLM as the thing that ties the sensors together?
The experiment
To test it without sinking months into it, I’m building a controlled experiment in my shop:
- A taped-off box on the floor — a known, bounded world to start from.
- An overhead camera piped through CorTex, which has vision capabilities — a stationary, god’s-eye view of the rover and the space.
- The rover’s forward-facing camera — the first-person view.
- (Planned) a LIDAR module for object detection, to put vision and LIDAR head to head.
The core test: can an LLM stitch together the forward-facing and overhead camera feeds to control the rover more precisely than either view could alone?
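To make the idea concrete, here is a rough sketch of what one fusion step could look like in Python. It is not how CorTex works, and it makes no LLM call at all: it only shows the two ends of the loop, building one prompt from both views and turning the model's reply into a bounded motion command. Every name here (`OverheadFix`, `build_prompt`, `parse_command`), the move vocabulary, the JSON reply format, and the duration cap are assumptions made up for illustration.

```python
import json
from dataclasses import dataclass

# Everything below is hypothetical: names, move vocabulary, reply format.
ALLOWED_MOVES = {"forward", "backward", "strafe_left", "strafe_right",
                 "rotate_left", "rotate_right", "stop"}
MAX_DURATION_S = 2.0  # assumed cap so one bad reply cannot send the rover far


@dataclass
class OverheadFix:
    """Rover pose as estimated from the overhead camera (assumed units)."""
    x_m: float
    y_m: float
    heading_deg: float


def build_prompt(fix: OverheadFix, forward_summary: str,
                 box_w_m: float, box_h_m: float, goal: str) -> str:
    """Combine the overhead pose and the first-person view into one prompt."""
    return (
        f"The rover is inside a taped-off box, {box_w_m:.1f} m by {box_h_m:.1f} m.\n"
        f"Overhead view: rover at x={fix.x_m:.2f} m, y={fix.y_m:.2f} m, "
        f"heading {fix.heading_deg:.0f} degrees.\n"
        f"Forward view: {forward_summary}\n"
        f"Goal: {goal}\n"
        'Reply with JSON only, like {"move": "forward", "duration_s": 0.5}. '
        f"Allowed moves: {', '.join(sorted(ALLOWED_MOVES))}."
    )


def parse_command(reply_text: str) -> tuple[str, float]:
    """Turn the model's reply into (move, duration); stop on anything odd."""
    try:
        data = json.loads(reply_text)
        move = data["move"]
        duration = float(data["duration_s"])
    except (ValueError, KeyError, TypeError):
        return ("stop", 0.0)
    if not isinstance(move, str) or move not in ALLOWED_MOVES:
        return ("stop", 0.0)
    if not (0.0 < duration <= MAX_DURATION_S):
        return ("stop", 0.0)
    return (move, duration)
```

The choice that matters most in this sketch is that the parser fails closed: malformed JSON, an unknown move, or an out-of-range duration all collapse to `("stop", 0.0)`, so a confused reply halts the rover instead of moving it.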
The hardware: mecanum wheels
I designed the rover with mecanum wheels, which give it some unusual moves — it can strafe sideways and crawl on a diagonal, not just drive and turn.
That freedom comes with a real tradeoff. In motion, mecanum wheels produce so much vibration that computer vision (CV) can't run reliably on the onboard camera feeds. It works when the rover is parked, but not while it's navigating. That limitation is exactly why the overhead camera earns its place: it's stationary, so it stays sharp while the rover moves.
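For anyone curious how strafing and diagonal crawling fall out of four fixed wheels, here is a minimal sketch of the standard mecanum inverse kinematics in Python. It is the textbook formula, not the rover's actual firmware. It assumes x points forward, y points left, positive rotation is counterclockwise, and the rollers are in the common "X" arrangement seen from above. The wheel radius and chassis dimensions are placeholder values, not measurements of this rover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WheelSpeeds:
    """Angular speed of each wheel in rad/s (positive = rolling forward)."""
    front_left: float
    front_right: float
    rear_left: float
    rear_right: float


def mecanum_wheel_speeds(vx: float, vy: float, wz: float,
                         wheel_radius: float = 0.05,   # placeholder, metres
                         half_length: float = 0.10,    # placeholder, metres
                         half_width: float = 0.10      # placeholder, metres
                         ) -> WheelSpeeds:
    """Map a body velocity to wheel speeds.

    vx: forward speed (m/s), vy: leftward speed (m/s),
    wz: counterclockwise rotation rate (rad/s).
    """
    k = half_length + half_width  # lever arm for the rotation term
    return WheelSpeeds(
        front_left=(vx - vy - k * wz) / wheel_radius,
        front_right=(vx + vy + k * wz) / wheel_radius,
        rear_left=(vx + vy - k * wz) / wheel_radius,
        rear_right=(vx - vy + k * wz) / wheel_radius,
    )


# Pure strafe left: diagonal wheel pairs spin in opposite directions.
strafe = mecanum_wheel_speeds(vx=0.0, vy=0.1, wz=0.0)

# Diagonal crawl (forward-left): front-left and rear-right stay still.
diagonal = mecanum_wheel_speeds(vx=0.1, vy=0.1, wz=0.0)
```

In a pure strafe, the front-left and rear-right wheels run backward while the other two run forward, so the forward components cancel and only the sideways push from the angled rollers remains. On a real build the signs depend on motor wiring and roller orientation, so some of them may need flipping.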
Building it
I designed and built the rover from the chassis up — wheels, wiring, compute, and all. Here’s a bit of that process, including a moment that didn’t go entirely to plan.

Not every test drive ends with all four wheels still attached.
What’s next
The fun part will be testing the forward and overhead views in tandem in a real indoor environment.
I don’t have a LIDAR module on the rover yet. Maybe I’ll add one — or maybe vision and an LLM turn out to be enough, and I skip SLAM completely. That’s what the experiment is for.