The infrastructure behind modern ranking systems (serving, data, MLOps)

shaped.ai

1 point by semi_sentient 2 months ago · 2 comments

semi_sentient (OP) 2 months ago

Follow-up posts for context (same series):

Part 2 – Data Layer (feature store to prevent online/offline skew; vector DB choices and pre- vs post-filtering): https://www.shaped.ai/blog/the-infrastructure-of-modern-rank...

Part 3 – MLOps Backbone (training pipelines, registry, GitOps deployment, monitoring/drift/A-B): https://www.shaped.ai/blog/the-infrastructure-of-modern-rank...
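The pre- vs post-filtering trade-off from Part 2 can be sketched roughly as follows. This is a toy brute-force search, not any particular vector DB's API; `knn`, `pre_filter_search`, and `post_filter_search` are illustrative names:

```python
def knn(query, items, k):
    """Return the k nearest item ids by squared L2 distance (brute force)."""
    scored = sorted(
        items,
        key=lambda it: sum((q - v) ** 2 for q, v in zip(query, it["vec"])),
    )
    return [it["id"] for it in scored[:k]]

def pre_filter_search(query, items, k, allowed):
    # Filter first, then search: returns k results whenever
    # at least k items satisfy the filter.
    return knn(query, [it for it in items if it["id"] in allowed], k)

def post_filter_search(query, items, k, allowed):
    # Search first, then filter: can return far fewer than k results
    # when the nearest neighbors fail the filter.
    return [i for i in knn(query, items, k) if i in allowed]
```

With a restrictive filter, post-filtering can even return an empty result set, which is why strict filters usually push you toward pre-filtering (or over-fetching and re-filtering).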

Happy to share more detail (autoscaling policies, index swaps, point-in-time joins, GPU batching) if helpful.
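For the point-in-time joins mentioned above, the core idea is: when building training rows, attach the latest feature value logged at or before each event's timestamp, never a future one, so offline features match what the online store would have served. A minimal in-memory sketch (the function name and dict-of-lists layout are illustrative, not a real feature-store API):

```python
from bisect import bisect_right

def point_in_time_join(events, feature_log):
    """For each (user, ts) event, attach the most recent feature value
    logged at or before ts — never a later value (no label leakage).

    feature_log: {user: [(log_ts, value), ...]} sorted by log_ts.
    """
    rows = []
    for user, ts in events:
        log = feature_log.get(user, [])
        # Index of the last log entry with log_ts <= ts.
        i = bisect_right(log, (ts, float("inf")))
        rows.append((user, ts, log[i - 1][1] if i else None))
    return rows
```

Events that predate any logged value get `None`, which mirrors the cold-start case a real online store would hit.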

semi_sentient (OP) 2 months ago

Modern ranking systems (feeds, search, recommendations) have strict latency budgets, often under 200 ms at p99. This write-up describes how we designed a production system using a decoupled microservice architecture for serving, a feature + vector store data layer, and an automated MLOps pipeline from training to deployment. It is less about modeling and more about the infrastructure that keeps it all running.
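To make the "under 200 ms at p99" budget concrete, here is how you'd check it against a batch of measured request latencies using the nearest-rank percentile method (the sample values are made up for illustration):

```python
import math

def nearest_rank_percentile(samples, p):
    """Smallest sample value with at least p% of samples at or below it."""
    s = sorted(samples)
    rank = math.ceil(p / 100 * len(s))  # nearest-rank method, 1-indexed
    return s[rank - 1]

# Hypothetical end-to-end request latencies in milliseconds.
latencies_ms = [30, 40, 50, 60, 180, 90, 70, 250, 55, 45]

p99 = nearest_rank_percentile(latencies_ms, 99)
within_budget = p99 <= 200
```

Note that with small samples the p99 collapses to the max, so a single slow outlier (here 250 ms) blows the budget even though the median is comfortable; this is why tail latency, not average latency, drives the serving-layer design.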
