How Uber Built a Real-Time Push System for Millions of Location Updates | EP: 4 Behind The Screen


This is the 4th episode of my series Behind The Screen, where I discuss, as simply as I can, how the technology behind our daily lives actually works.

You can check out the other episodes of this series too, in case something there catches your interest.

Recently I was reading about how Uber’s backend handles so many real-time location events, and I found the system behind it quite interesting, so I’m sharing what I learned in this blog post. All the information in this post is taken from the Uber Engineering Blog.

At first, Uber used a polling-based mechanism where the mobile app was responsible for requesting data. The app would ask the server for a new location every few seconds, and if an update was available, the server sent it back to the app.
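To make the old approach concrete, here is a minimal client-side polling sketch in Python. The endpoint, parameters and interval are purely illustrative, not Uber's actual API:

```python
import time
import requests  # pip install requests

POLL_INTERVAL_SECONDS = 4  # "every few seconds" -- illustrative value
LOCATION_URL = "https://api.example.com/driver/location"  # hypothetical endpoint


def poll_driver_location(trip_id: str) -> None:
    """Keep asking the server for the latest driver location."""
    last_seen = None
    while True:
        resp = requests.get(LOCATION_URL, params={"trip_id": trip_id}, timeout=5)
        if resp.ok:
            location = resp.json()
            if location != last_seen:  # most responses carry nothing new
                last_seen = location
                print("new location:", location)
        time.sleep(POLL_INTERVAL_SECONDS)  # every iteration is a full HTTP round trip
```

Every iteration costs a full round trip whether or not anything has changed, which is exactly where the problems below come from.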

  • Aggressive polling was required to keep the app responsive, but it caused high resource utilisation on the server side.

  • Faster battery drain was another issue. The app kept sending location requests to the server even when no new location was available.

  • App cold start time increased. When the Uber app was opened, it had to poll multiple APIs to show the latest state in the UI, and the UI couldn’t render until the most critical of those calls responded.

  • At one point, 80% of the requests made to the Uber backend were location-polling calls.

This was when Uber realised the system needed a better alternative, so they built RAMEN (Realtime Asynchronous MEssaging Network). Instead of the app requesting new data, Uber moved to a push-based mechanism: now the Uber backend decides when a new location update should be sent to the app.

This raises 3 new questions:

  • When to push?

  • What to push?

  • How to push?

[Image from the Uber Engineering Blog]

Uber built a microservice named Fireball, responsible for deciding when to push data. It listens to all kinds of events and decides whether a push is actually worth sending. These events can be:

  • User requesting ride.

  • Driver accepting ride.

  • Change in driver or user location.

  • and so on.

Since not every little location change needs to reach the client, Fireball checks whether it is really necessary to send data to the app. Once it decides a push should happen, it forwards that data to the API gateway.
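Uber hasn’t published Fireball’s exact rules, but conceptually it is an event-driven filter. Here is a hypothetical sketch of that idea; the event shape, threshold and helper names are all made up for illustration:

```python
import math
from dataclasses import dataclass


@dataclass
class Event:
    kind: str      # e.g. "ride_requested", "ride_accepted", "location_changed"
    user_id: str
    payload: dict  # e.g. {"lat": ..., "lon": ...} for location events


SIGNIFICANT_MOVE_METERS = 30  # made-up threshold


def distance_meters(a: dict, b: dict) -> float:
    """Rough equirectangular distance between two {lat, lon} points."""
    dlat = math.radians(a["lat"] - b["lat"])
    dlon = math.radians(a["lon"] - b["lon"]) * math.cos(math.radians(a["lat"]))
    return 6371000 * math.hypot(dlat, dlon)


def should_push(event: Event, last_pushed: dict | None) -> bool:
    """Decide whether this event is worth pushing to the app at all."""
    if event.kind in ("ride_requested", "ride_accepted"):
        return True  # trip state changes always matter
    if event.kind == "location_changed":
        # Skip tiny movements the rider would never notice on the map.
        return last_pushed is None or distance_meters(event.payload, last_pushed) > SIGNIFICANT_MOVE_METERS
    return False
```

When `should_push` says yes, Fireball hands a small message to the API gateway, which is where the next step happens.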

Fireball itself provides only a small part of the payload. The API gateway then gathers everything else required for the push, which can include the user’s locale, OS, app version and other user-related attributes. The complete message is then forwarded to RAMEN.
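Again as a hedged sketch (the real gateway is internal to Uber): the gateway’s job here is simply to hydrate the small Fireball message with the user and device attributes RAMEN needs.

```python
# Hypothetical enrichment step inside the API gateway; store names are illustrative.
def enrich_for_push(fireball_msg: dict, user_store: dict, device_store: dict) -> dict:
    user = user_store[fireball_msg["user_id"]]      # locale and other user attributes
    device = device_store[fireball_msg["user_id"]]  # OS, app version, etc.
    return {
        **fireball_msg,
        "locale": user["locale"],
        "os": device["os"],
        "app_version": device["app_version"],
    }
```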

Once RAMEN gets all this data, it is RAMEN’s responsibility to deliver it to the app. RAMEN was originally built on top of the TCP protocol. For the application protocol, Uber had three options: HTTP long polling, WebSockets and Server-Sent Events (SSE). They chose Server-Sent Events because of considerations like security, mobile SDK support and binary size.
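To make the choice concrete, here is a small sketch (not Uber’s code) of how a server frames a message on an SSE stream. One natural way to carry the sequence number is SSE’s `id:` field, though that detail is an assumption:

```python
import json


def sse_event(seq: int, message: dict) -> str:
    """Frame one message in Server-Sent Events wire format."""
    # An SSE event is plain text: an optional "id:" line, "data:" line(s), then a blank line.
    return f"id: {seq}\ndata: {json.dumps(message)}\n\n"


print(sse_event(7, {"type": "driver_location", "lat": 12.97, "lon": 77.59}), end="")
# id: 7
# data: {"type": "driver_location", "lat": 12.97, "lon": 77.59}
```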

But SSE is a unidirectional protocol: it can only send events from the server to the app, while Uber needed to guarantee at-least-once delivery of each message, which requires the mobile app to acknowledge what it received. This is how it was solved (a rough client-side sketch follows the steps below).

[Image from the Uber Engineering Blog]
  • The client starts the connection by sending an HTTP request to /ramen/receive?seq=0 with a sequence number of 0.

  • The server responds with HTTP 200 and ‘Content-Type: text/event-stream’ for maintaining the SSE connection.

  • The server then streams the pending messages, each carrying an incrementing sequence number.

  • Since the stream runs over TCP, if a message (say the one with sequence number 3) fails to be delivered, the TCP connection itself closes.

  • On the next connection, the client sends an HTTP request to /ramen/receive?seq=2, the last sequence number it successfully received. This tells the server to resend messages starting from sequence 3.

  • To check whether the connection is alive, the server sends a single-byte heartbeat every 4 seconds.

  • If no heartbeat or message is received for 7 seconds, the connection is assumed broken and a new one is established.

  • Whenever a client reconnects to /ramen/receive with a sequence number higher than 0, it also acts as an acknowledgement: the server can flush all messages up to that sequence number.

  • On a good network, however, a client may stay connected for several minutes without reconnecting, so older messages pile up on the server. To solve this, the app calls /ramen/ack?seq=N every 30 seconds regardless of connection quality.
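Putting those steps together, here is a rough client-side sketch of the sequencing, heartbeat-timeout and ack logic. It uses Python and the requests library purely for illustration; the host name, HTTP methods and SSE parsing details are assumptions, and the real logic lives inside Uber’s mobile SDKs:

```python
import json
import time
import requests  # pip install requests

BASE = "https://ramen.example.com"  # hypothetical host
HEARTBEAT_TIMEOUT = 7               # assume the connection is broken after 7 silent seconds
ACK_INTERVAL = 30                   # explicit /ramen/ack even on a healthy connection


def receive_forever(handle_message) -> None:
    last_seq = 0                    # highest sequence number successfully processed
    last_ack = time.monotonic()

    while True:                     # reconnect loop
        try:
            # Reconnecting with seq=N doubles as "I already have everything up to N".
            resp = requests.get(
                f"{BASE}/ramen/receive",
                params={"seq": last_seq},
                stream=True,
                timeout=(5, HEARTBEAT_TIMEOUT),  # read timeout acts as the heartbeat watchdog
            )
            current_id = None
            for line in resp.iter_lines(decode_unicode=True):
                # Any traffic, including the 4-second heartbeat, keeps the read timeout from firing.
                if line.startswith("id:"):
                    current_id = int(line[len("id:"):].strip())
                elif line.startswith("data:") and current_id is not None:
                    handle_message(json.loads(line[len("data:"):].strip()))
                    last_seq = current_id        # TCP delivers in order

                # Periodic ack lets the server drop the messages it has been buffering.
                if time.monotonic() - last_ack >= ACK_INTERVAL:
                    requests.get(f"{BASE}/ramen/ack", params={"seq": last_seq}, timeout=5)
                    last_ack = time.monotonic()
        except requests.RequestException:
            pass                    # no heartbeat/message for 7 seconds, or the connection dropped
        # Loop around and reconnect, asking the server to resend everything after last_seq.
```

The read timeout doing double duty as the heartbeat watchdog is a sketch-level simplification; the important idea is the one in Uber’s design: sequence numbers plus periodic acks give at-least-once delivery over a one-way stream.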

This SSE-plus-sequencing design was the initial version of RAMEN. More recently, Uber upgraded RAMEN to use gRPC instead of SSE.

SSE solved the problems of HTTP polling, but over time it created issues of its own, and to tackle them Uber decided to move RAMEN to gRPC.

  • Reliability Issues

    The delivery state of a message was unknown for up to 30 seconds, because the old implementation only acknowledged every 30 seconds on a good network. Uber wanted an instant, real-time acknowledgement, but that was not possible because SSE is unidirectional.

  • Double Connections

    Uber needed to manage two connections: one for the SSE event stream and another for sequencing and acknowledgements.

  • Binary Data

    In SSE, data was sent as JSON text, so Uber couldn’t efficiently send binary payloads such as images or speech.

I’m building an Open Source Integration Engine to make SaaS integrations easy.

You can check it: Connective

Open source contributions are also welcome in this project.

So Uber shifted the older SSE implementation of RAMEN to the gRPC protocol. gRPC allows bidirectional streaming, so the mobile app and the server can both send data to each other over the same connection. gRPC also uses Protocol Buffers instead of JSON, and Protocol Buffers are faster than JSON.
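Uber hasn’t published RAMEN’s protobuf schema, so the following is only a conceptual sketch of what a bidirectional gRPC stream looks like from a Python client. The RamenService, its Stream RPC, the Ack message type and the generated ramen_pb2 / ramen_pb2_grpc modules are all hypothetical:

```python
import queue
import grpc  # pip install grpcio

# These modules would be generated by protoc from a hypothetical ramen.proto.
import ramen_pb2
import ramen_pb2_grpc


def run(handle_message) -> None:
    channel = grpc.secure_channel("ramen.example.com:443", grpc.ssl_channel_credentials())
    stub = ramen_pb2_grpc.RamenServiceStub(channel)

    acks: queue.Queue = queue.Queue()  # client -> server messages (acknowledgements)

    def outgoing():
        while True:
            yield acks.get()           # block until there is an ack to send upstream

    # One connection, both directions: the server streams pushes down,
    # while the client streams acks up on the same call.
    for push in stub.Stream(outgoing()):
        handle_message(push)
        acks.put(ramen_pb2.Ack(seq=push.seq))  # acknowledge immediately, not every 30 seconds
```

Because the acknowledgements ride on the same connection as the pushes, delivery status is known immediately instead of at the next 30-second ack. The results Uber reported: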

  • Real-time acknowledgement was achieved.

  • gRPC Connect Latency (p95) was improved by 45%.

  • Push success rate improved by 1-2%.

If you want to read about this in more depth, you can check the original posts on the Uber Engineering Blog.


🔒 Why subscribe?

I’m documenting my journey of learning how computers and systems actually work.

If you want:

  • to write more efficient backend systems

  • to understand why performance breaks (not just how to fix it)

  • to think beyond frameworks and APIs

then this newsletter is for you.

👉 Subscribe to get practical explanations, book breakdowns, and real engineering insights — without hype or shallow tutorials.

(No spam. Unsubscribe anytime.)

Github | LinkedIn | Twitter