Scaling Observability: Why TiDB Moved from Prometheus to VictoriaMetrics

pingcap.com

2 points by dengolius 5 months ago · 1 comment

dengolius (OP) 5 months ago

This article discusses why TiDB, a distributed SQL database, migrated its observability platform from Prometheus to VictoriaMetrics.

The Problem with Prometheus

At scale, Prometheus started showing limitations, especially for large enterprise customers like Pinterest. The main issues were:

- High resource consumption: Prometheus used a lot of CPU and memory, leading to frequent out-of-memory (OOM) crashes.
- Long recovery times: After a crash, Prometheus needed a long time to recover, sometimes failing altogether.
- Limited query performance: Large queries would often fail or be very slow.

The Solution: VictoriaMetrics

TiDB switched to VictoriaMetrics and saw significant improvements:

- Better resource utilization: CPU and memory usage dropped significantly, eliminating OOM crashes.
- Improved query performance: Large queries that previously failed in Prometheus now run efficiently in VictoriaMetrics.
- Lower costs: Reduced resource consumption and better storage efficiency led to lower operational costs.
