How AfterHour built an ultra-scalable chat service in one month with Rama

"I built my first chat system in 1999 on the IRC protocol, cofounded a startup for companies to self-host chat rooms, and have built real time multiplayer games. I am intimately familiar with the challenges inherent in running real time messaging, presence, and pub-sub software. At AfterHour we knew that chat was going to be core to our product and, as a general engineering philosophy, do not outsource core functionality to vendors and no open source chat servers are able to handle consumer app usage levels.

Rama provides exactly the type of persistence, routing, and data locality for processing that is required to handle chat at scale. We would have easily spent 3-6 months building an inferior backing to our chat system if it were not for Rama. We are running many rooms with thousands of concurrent users in each and do not see any reason why it won’t continue to scale as we add more users, more features, and more rooms."

JD Conley, Head of Engineering at AfterHour

AfterHour is a social platform for investors. The app offers transparent trading data, live chat rooms, and real-time signals to help users collaborate and learn from each other. With $400M+ in connected portfolios and 150K+ users, the app blends transparency with community to help people make money together.

The backend of the chat portion of AfterHour is built with Rama. It implements the API defined in the Matrix spec, an open standard for real-time communication. The service took one month to build, including the time to learn to use Rama, and has been running flawlessly in production since March 2024.

Rama is both a storage and computation platform for building entire backends end-to-end. It does much more than what a database does, since you program your business logic into your Rama applications. All Rama applications are event sourced with all new data coming into unindexed logs called “depots”, and the application code reacts to new depot records to materialize any number of indexed stores called “PStates”.
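To make the depot-to-PState flow concrete, here is a minimal sketch in plain Java. It is only an analogy: `EventSourcingSketch`, `RoomCreated`, and every method name are made up for illustration and are not part of Rama’s API, and a real Rama module is distributed and durable where this is a single in-memory object.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java analogy of the depot -> PState flow. None of these names are Rama APIs.
final class EventSourcingSketch {
    // Hypothetical event: a chat room was created.
    record RoomCreated(String roomId, String roomName) {}

    // "Depot": an append-only, unindexed log of everything that happened.
    private final List<RoomCreated> depot = new ArrayList<>();
    // "PState": an indexed view that is derived entirely from the log.
    private final Map<String, String> roomNames = new HashMap<>();

    void append(RoomCreated event) {
        depot.add(event);                                 // record the event first
        roomNames.put(event.roomName(), event.roomId());  // then react to it
    }

    String roomIdFor(String roomName) {
        return roomNames.get(roomName);
    }

    // Because the view is derived, replaying the log rebuilds it from scratch.
    Map<String, String> rebuildFromDepot() {
        Map<String, String> rebuilt = new HashMap<>();
        for (RoomCreated event : depot) {
            rebuilt.put(event.roomName(), event.roomId());
        }
        return rebuilt;
    }
}
```

The point of the sketch is the direction of data flow: writes only ever go to the log, and every indexed view is a function of that log.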

AfterHour’s chat implementation handles users, rooms, messages, presence notifications, and typing notifications. It’s a single Rama module coded with Rama’s Java API. The module contains:

PStates

Indexed datastores in Rama are called PStates (“partitioned state”). The PStates for AfterHour’s module have a variety of shapes in order to precisely match the indexing needs of the use cases they support. One of the simplest is $$roomNames, which maps chat room names to chat room IDs. It’s equivalent to a key/value database:

s.pstate("$$roomNames", PState.mapSchema(String.class, String.class));

PStates are defined as data structures, and they’re distributed, durable, and incrementally replicated. Because they are defined as compositions of data structures, PStates can represent a practically unlimited range of data models, and the operations a PState handles efficiently are the same ones a regular data structure of that shape handles efficiently.

This PState is a simple map with string keys and values. The queries it can do quickly (less than one millisecond) are lookups by key.
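As a rough in-memory analogy (plain Java, not Rama’s API; `RoomNamesSketch` and its methods are hypothetical names), the shape of the structure decides which questions are cheap. A name-to-ID map answers "which ID does this name have?" with one lookup, while the reverse question has to scan every entry:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// In-memory stand-in for a name -> ID map. Hypothetical names, not Rama APIs.
final class RoomNamesSketch {
    private final Map<String, String> nameToId = new HashMap<>();

    void register(String roomName, String roomId) {
        nameToId.put(roomName, roomId);
    }

    // Matches the shape: a single lookup by key.
    Optional<String> idForName(String roomName) {
        return Optional.ofNullable(nameToId.get(roomName));
    }

    // Works against the shape: has to scan every entry.
    Optional<String> nameForId(String roomId) {
        return nameToId.entrySet().stream()
                .filter(e -> e.getValue().equals(roomId))
                .map(Map.Entry::getKey)
                .findFirst();
    }
}
```

If the reverse lookup mattered, the usual answer would be a second structure keyed by ID rather than a scan.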

Another PState, $$roomInfo, tracks properties for each chat room ID. It’s equivalent to a document database:

s.pstate("$$roomInfo",
         PState.mapSchema(String.class,
                          PState.fixedKeysSchema("roomType", RoomType.class,
                                                 "joinRule", JoinRule.class,
                                                 "name", String.class,
                                                 "creatorId", String.class)));

The values for the top-level map are another map with a predefined set of keys. Each key in the inner map defines its own schema. Some of the queries this PState supports in less than a millisecond are looking up the inner map for a room ID, looking up one particular attribute for a room ID, or fetching a subset of attributes for a room ID.
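The three query styles just described can be sketched against an ordinary nested map. This is a plain-Java analogy with hypothetical names (`RoomInfoSketch`, `set`, `info`, `attribute`, `attributes`), not Rama’s query API, and it uses `Object` values, so it does not enforce the per-key types that the real schema declares:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// In-memory stand-in for roomId -> fixed-keys document. Hypothetical names, not Rama APIs.
final class RoomInfoSketch {
    // The same four keys as the schema above; any other key is rejected.
    static final Set<String> KEYS = Set.of("roomType", "joinRule", "name", "creatorId");

    private final Map<String, Map<String, Object>> rooms = new HashMap<>();

    void set(String roomId, String key, Object value) {
        if (!KEYS.contains(key)) {
            throw new IllegalArgumentException("unknown key: " + key);
        }
        rooms.computeIfAbsent(roomId, id -> new HashMap<>()).put(key, value);
    }

    // Query 1: the whole inner map for a room.
    Map<String, Object> info(String roomId) {
        return rooms.getOrDefault(roomId, Map.of());
    }

    // Query 2: one attribute of a room.
    Object attribute(String roomId, String key) {
        return info(roomId).get(key);
    }

    // Query 3: a subset of attributes, in the order requested.
    Map<String, Object> attributes(String roomId, List<String> keys) {
        Map<String, Object> doc = info(roomId);
        Map<String, Object> out = new LinkedHashMap<>();
        for (String key : keys) {
            if (doc.containsKey(key)) {
                out.put(key, doc.get(key));
            }
        }
        return out;
    }
}
```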

The PState $$roomMembers tracks which users have joined a room and which ones are moderators (“power level”). It’s equivalent to a column-oriented database:

s.pstate("$$roomMembers",
         PState.mapSchema(String.class,
                          PState.mapSchema(String.class, PowerLevel.class).subindexed()));

The inner map in this case is “subindexed”, which means it indexes each of its elements individually instead of serializing as one value. This means the inner map can contain billions of elements and still be read/written quickly. This PState can support many queries in less than a millisecond: look up the number of users in a room, fetch a range of users and their power levels in a room, look up a particular user’s power level in a room, and check if a particular user is in a room.
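The four queries listed above map naturally onto a sorted inner map. The sketch below uses a `TreeMap` as an in-memory stand-in, on the assumption that a subindexed map behaves like a sorted map that supports range reads. All names are hypothetical rather than Rama APIs, and the `PowerLevel` values are invented because the source does not list them:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

// In-memory stand-in for roomId -> (userId -> power level). Not Rama APIs.
final class RoomMembersSketch {
    // Hypothetical values; the source does not list what PowerLevel contains.
    enum PowerLevel { USER, MODERATOR }

    // The inner map is kept sorted by userId so ranges can be read without a full scan.
    private final Map<String, NavigableMap<String, PowerLevel>> members = new HashMap<>();

    void join(String roomId, String userId, PowerLevel level) {
        members.computeIfAbsent(roomId, id -> new TreeMap<>()).put(userId, level);
    }

    private NavigableMap<String, PowerLevel> room(String roomId) {
        return members.getOrDefault(roomId, Collections.emptyNavigableMap());
    }

    int memberCount(String roomId) {
        return room(roomId).size();
    }

    boolean isMember(String roomId, String userId) {
        return room(roomId).containsKey(userId);
    }

    PowerLevel powerLevel(String roomId, String userId) {
        return room(roomId).get(userId);
    }

    // One "page" of members: userIds in [fromInclusive, toExclusive), in sorted order.
    SortedMap<String, PowerLevel> membersBetween(String roomId, String fromInclusive, String toExclusive) {
        return room(roomId).subMap(fromInclusive, toExclusive);
    }
}
```

A range read like `membersBetween` is what makes paging through a very large room practical, since only the requested slice is touched.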

$$memberRooms tracks which chat rooms each member has joined, which is the opposite of the previous PState:

s.pstate("$$memberRooms",
         PState.mapSchema(String.class,
                          PState.setSchema(String.class).subindexed()));

There are no additional properties that need to be tracked for chat rooms in this context, so the inner data structure is a set instead of a map. The kinds of queries this PState supports in less than a millisecond are similar to the previous PState.
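Keeping two views of the same relationship means every join or leave has to update both. The sketch below shows that idea with two in-memory maps of sorted sets. It is a simplification (the room-side view here drops the power levels) and every name in it is hypothetical, not a Rama API:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeSet;

// Two in-memory views of one relationship: room -> members and member -> rooms.
// Hypothetical names, not Rama APIs.
final class MembershipIndexSketch {
    private final Map<String, NavigableSet<String>> roomToMembers = new HashMap<>();
    private final Map<String, NavigableSet<String>> memberToRooms = new HashMap<>();

    // A join updates both views so either question can be answered with one lookup.
    void join(String userId, String roomId) {
        roomToMembers.computeIfAbsent(roomId, k -> new TreeSet<>()).add(userId);
        memberToRooms.computeIfAbsent(userId, k -> new TreeSet<>()).add(roomId);
    }

    // A leave removes the pair from both views.
    void leave(String userId, String roomId) {
        roomToMembers.getOrDefault(roomId, new TreeSet<>()).remove(userId);
        memberToRooms.getOrDefault(userId, new TreeSet<>()).remove(roomId);
    }

    NavigableSet<String> roomsOf(String userId) {
        return Collections.unmodifiableNavigableSet(
                memberToRooms.getOrDefault(userId, new TreeSet<>()));
    }

    NavigableSet<String> membersOf(String roomId) {
        return Collections.unmodifiableNavigableSet(
                roomToMembers.getOrDefault(roomId, new TreeSet<>()));
    }
}
```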

These are just a few examples of PStates in AfterHour’s module, demonstrating how each PState is tuned for exactly what the use cases need. Without Rama, either multiple databases would need to be used, which adds a tremendous amount of integration and deployment complexity, or everything would need to be forced into a non-optimal data model, which creates impedance mismatches.

Topologies

Topologies in Rama define how to react to incoming data appended to depots and materialize PStates from that data. There are two types of topologies. Stream topologies process data as soon as it arrives and complete processing with single-digit millisecond latency. A stream topology has either at-least-once or at-most-once processing semantics, depending on how it’s configured. Microbatch topologies process batches of data across the whole cluster and have update latency on the order of a few hundred milliseconds. They have exactly-once processing semantics: failures and retries are handled so that the resulting PStates are as if each record had been processed exactly one time.
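The difference between these guarantees shows up when a retry happens. The sketch below is a generic illustration in plain Java and does not describe Rama’s internals: it assumes a hypothetical per-room message counter and shows one common way to get exactly-once results, which is to store the ID of the last applied batch alongside the state and ignore any batch that was already applied.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Generic illustration of retry behavior. Hypothetical names, not Rama APIs or internals.
final class ProcessingSemanticsSketch {

    // At-least-once: a record that is retried is simply applied again.
    static final class AtLeastOnceCounter {
        final Map<String, Integer> counts = new HashMap<>();

        void process(String roomId) {
            counts.merge(roomId, 1, Integer::sum);
        }
    }

    // Exactly-once results: the state remembers which batch it last applied.
    static final class BatchCounter {
        final Map<String, Integer> counts = new HashMap<>();
        long lastAppliedBatch = -1;

        // Returns false when the batch was already applied, so the retry is a no-op.
        boolean applyBatch(long batchId, List<String> roomIds) {
            if (batchId <= lastAppliedBatch) {
                return false;
            }
            for (String roomId : roomIds) {
                counts.merge(roomId, 1, Integer::sum);
            }
            lastAppliedBatch = batchId;
            return true;
        }
    }
}
```

With the first counter, retrying one message leaves the count at 2. With the second, retrying batch 0 leaves the count at 1, which is what "as if processed exactly one time" means in practice.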

Both stream and microbatch topologies have high throughput, but microbatch topologies have even higher throughput since there’s no per-record overhead. The greater throughput and stronger fault-tolerance semantics mean microbatch topologies are preferred unless the use case needs very low latency.

Most of the functionality of AfterHour’s chat service needs that low latency, such as posting messages and joining rooms, so it is implemented in the module’s single stream topology. The two microbatch topologies handle the use cases that are fine with a few hundred milliseconds of latency: presence notifications (whether a user is idle or active on their computer) and typing notifications.

Summary

Rama helped AfterHour build a major component of their application in a short period of time. Other than upgrading their cluster every now and then to keep up with the latest Rama versions, their cluster has required no maintenance whatsoever. Their application is also fully scalable to any read/write load, showing that there doesn’t need to be a tradeoff between ease of building and scalability.

You can get in touch with us at consult@redplanetlabs.com to schedule a free consultation to talk about your application and/or pair program on it. Rama is free for production clusters for up to two nodes and can be downloaded at this page.