The Internet of Moving Things

Overview

I came across a video discussing how platforms like Uber handle massive streams of real-time data simultaneously. It sparked a deep curiosity: how do these systems scale across different industries?

For most users, the ‘blue dot’ moving across a map is something taken for granted. However, from an engineering perspective, implementing this without the right architecture is a recipe for disaster. Without a strategy for spatial indexing and data ingestion, you risk overloading your backend or, just as critically, bottlenecking your front-end performance.

The Scale of the Problem

All these systems need to ingest data from various telemetry sources and yet somehow be able to show the data to users in real time. It's typically expected that this data is frequently written, and It's typically expected that this data is frequently written—often every 4 to 6 seconds per active entity, and simultaneously read by millions of "eyeball" services. This creates a classic distributed systems bottleneck: a high-velocity, write-heavy stream that must remain consistent enough for dispatching, yet fast enough to render on a mobile UI without jitter.

Consider the math: If a platform has 1 million active drivers updating their location every 5 seconds, the system must handle 200,000 writes per second. Standard relational databases aren't built for this level of IOPS (Input/Output Operations Per Second) without significant sharding and overhead.

Beyond the raw numbers, we have the "Urban Canyon" effect. GPS data isn't a clean stream of truth; it’s noisy. Signals bounce off skyscrapers, making a car on a highway look like it just jumped into a river. To show a smooth "blue dot" to a user, the system must not only ingest this data but also "snap" it to a known road network in real-time—a process known as Map Matching.

Evolution from Monolith to DOMA: Scaling the Search

Uber started off as a monolith which was written in Python and backed by a PostgreSQL database, which would be sufficient for a single city, but would struggle for regions, countries, and onwards.

This system had to evolve into a Service-Oriented Architecture (SOA), and afterwards a Domain-Oriented Microservice Architecture (DOMA).

As part of this architecture, searching is not invoked on a single service, but a collaboration of different services.

Supply Domain: Manages the state and location of assets.
Demand Domain: Manages user intent and session state.
Matching Domain: The algorithmic brain that pairs supply and demand.
Maps Domain: Provides routing, ETA, and geospatial context.

This separation of concerns allows for independent scaling. The Supply service can handle the write-heavy load of GPS ingestion without blocking the read-heavy Matching service. These domains communicate via high-performance Remote Procedure Calls (RPC), specifically utilizing gRPC and TChannel to minimize overhead.

The Geometry of Search: Hierarchical Spatial Indexing

At the core is the ability to answer one question instantly: "Who is near me?" Traditionally, you might think of querying a database for latitude and longitude. However, raw coordinates are computationally expensive. Calculating the distance between two points on a sphere (the Haversine formula) for millions of entities is an O(N) operation that would melt most servers at scale.

The Failure of Squares and Quadtrees

Early spatial systems used Quadtrees (dividing the world into squares) or Geohashes. While simpler to visualize, squares have two major geometric flaws for moving assets:

Shape Distortion: Because the Earth is a sphere, square cells near the poles are shaped differently than those at the equator.
Neighbour Asymmetry: A square has two types of neighbors: those sharing an edge and those sharing a corner. The distance to a diagonal neighbor is roughly 2 times further than an edge neighbor. This makes radial searches (e.g., "find all drivers within 2km") jagged and inefficient.

H3: The Hexagonal Solution

To solve the distortion and asymmetry of squares, Uber developed H3, a hexagonal hierarchical spatial index. If squares are the standard building blocks of digital maps, hexagons are the "Goldilocks" of geometry for moving objects.

Why hexagons? It comes down to Uniform Adjacency. A hexagon has exactly six neighbours, and every neighbour shares an edge of identical length. The distance from the centre of one hexagon to the centre of any of its neighbours is constant.

This property transforms a complex 2D geometric search into a fast, 1D bitwise lookup. By "binning" every GPS coordinate into a unique 64-bit integer H3 index, the query "find all drivers near me" becomes a simple: SELECT drivers WHERE h3_index IN (neighbour_list)

This operation is O(1) in terms of geometric complexity, allowing the system to scale regardless of how many millions of drivers are on the road.

Telemetry Ingestion: Cleaning the Noise

Having a fast spatial index is useless if the underlying data is unreliable. In practice, raw GPS streams are highly noisy due to the “Urban Canyon” effect, where signals reflect off buildings and produce jittery, inconsistent positions.

To correct this, the ingestion pipeline doesn’t treat each GPS point independently. Instead, it performs Map Matching — the process of reconstructing a vehicle’s true path on the road network from a sequence of uncertain observations.

This is where Hidden Markov Models (HMMs) come in. Rather than assuming each GPS point is correct, the system models the true road segment as a “hidden state” and the observed GPS points as noisy measurements. By evaluating both spatial proximity (how close a point is to a road) and temporal consistency (how plausible movement is between roads over time), the model reconstructs the most likely continuous path a vehicle took. This prevents physically impossible transitions, such as snapping between parallel roads or jumping across overpasses.

At this scale, this computation cannot be done in isolation or batch mode — it must operate on a continuous stream of events. This is where the streaming infrastructure becomes critical.

Apache Kafka acts as the ingestion backbone, buffering and partitioning millions of incoming telemetry events from mobile devices. It decouples producers (drivers sending GPS pings) from consumers (downstream systems performing matching and analytics), ensuring that bursts of traffic don’t overwhelm the system.

Downstream, Apache Flink consumes these streams and performs stateful, real-time computation over sequences of events. This is essential for map matching, since the HMM requires context — previous locations, time deltas, and movement history — rather than isolated points. Flink maintains this state efficiently while continuously updating each vehicle’s inferred position.

Together, this pipeline transforms raw, noisy GPS telemetry into a clean, semantically meaningful stream of vehicle positions that can be safely consumed by downstream systems such as dispatch, ETA prediction, and spatial indexing.

Algorithmic Matching: From Greedy to Global

Once the telemetry is cleaned and indexed, the system faces its ultimate test: The Matching Engine. Early dispatch systems used Greedy Matching: when a request arrived, the system simply grabbed the nearest available driver. While fast, this leads to "local optima." If Rider A requests a car and takes the nearest driver, Rider B (who might be 15 minutes from anyone else) is left waiting indefinitely. The greedy approach prioritizes the first request, not the best outcome for the entire network.

Modern platforms shift this to Batched Matching. Instead of reacting instantly to every request, the system collects demand over a short window—typically 2 to 5 seconds—and models the problem as a Bipartite Matching problem.

At the end of each window, the system builds a graph where nodes are riders and drivers, and edges represent the "cost" (e.g., ETA, distance, or a surge-based price). The algorithm then solves for the configuration that minimizes the Global Aggregate Wait Time. This allows the system to make "sub-optimal" moves for individuals (e.g., a driver passes by a close rider) to ensure the system is globally optimized (no one is left stranded).

Conclusion: The Future of the Physical Cloud

The query of how platforms track and dispatch millions of moving entities in real-time is a masterclass in modern systems design. It is not a single, magical algorithm, but a vertically integrated stack:

H3 provides the geometric efficiency for O(1) lookups.
Kafka and Flink provide the streaming backbone to turn raw noise into truth.
Bipartite Matching provides the economic optimization that balances supply and demand.

As we look toward the future, with autonomous vehicle (AV) fleets and drone deliveries, the "matching" problem will only grow in complexity. We are moving toward a future where these platforms function less like simple dispatchers and more like an "Air Traffic Control" for the entire ground transport network. We aren't just moving people anymore. If anything, we are orchestrating the flow of the entire physical world.