Webinar - Approaching 1 billion documents with MongoDB


  • 1.

    Approaching 1 Billion Documents in MongoDB. David Mytton, david@boxedice.com / www.mytton.net

  • 2.
  • 3.

    db.stats()
    Documents:   981,289,332
    Collections: 47,962
    Indexes:     39,684
    Data size:   369GB
    Index size:  241GB
    As of 25th Apr 2010.

  • 4.
  • 5.

    Initial Setup: master/slave replication. Master in DC1 (8GB RAM) replicating to a slave in DC2 (8GB RAM).

  • 6.

    Vertical Scaling: the master in DC1 upgraded to 72GB RAM, still replicating to the slave in DC2 (8GB RAM).

  • 7.

    Tip #1: Keep your indexes in memory at all times. Check sizes with db.stats().
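Tip #1 can be sanity-checked against the deck's own numbers: slide 3 reports 241GB of indexes, and slide 6's vertically scaled master has 72GB RAM. A minimal sketch of that check, with both figures hard-coded from the slides:

```shell
#!/bin/sh
# Figures taken from slide 3 (db.stats) and slide 6 (vertical scaling).
index_size_gb=241   # total index size across all databases
ram_gb=72           # the largest single machine in the deck

if [ "$index_size_gb" -le "$ram_gb" ]; then
  echo "indexes fit in RAM on one node"
else
  echo "indexes do NOT fit in RAM on one node"
fi
```

No single machine here can hold the working set of indexes, which is what motivates the manual partitioning on the next slides.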

  • 8.

    Manual Partitioning: two independent replicated pairs. Pair A: Master A in DC1, Slave A in DC2. Pair B: Master B in DC1, Slave B in DC2. 16GB RAM on every node.

  • 9.

    Databases vs collections
    • Many databases = many data files (start small, but quickly get large).
    • Many collections = watch the namespace limit.
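The "quickly get large" behaviour comes from how MongoDB of this era preallocated data files: each successive file doubles in size until a 2048MB cap. A quick sketch of that sequence:

```shell
#!/bin/sh
# Data-file preallocation pattern in mmap-era MongoDB:
# each new file doubles in size until the 2048MB cap is reached.
size_mb=64
while [ "$size_mb" -le 2048 ]; do
  echo "${size_mb}MB"
  size_mb=$((size_mb * 2))
done
```

This prints 64MB, 128MB, 256MB, 512MB, 1024MB, 2048MB: a mostly empty database still reserves the small files, but a growing one reaches 2GB-per-file allocations quickly.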

  • 10.
  • 11.

    Tip #2: Monitor the 24,000 namespace limit.

  • 12.
  • 13.

    Console: db.system.namespaces.count()
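Each collection and each index consumes one namespace, so the totals from slide 3 show why this deployment could never live in a single database. A quick tally, using the slide 3 figures:

```shell
#!/bin/sh
# Namespaces consumed = collections + indexes (figures from slide 3).
collections=47962
indexes=39684
echo $((collections + indexes))   # 87646 namespaces in total
```

That is well over three times the ~24,000 limit from Tip #2, hence the split across many databases.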

  • 14.

    Replica Pairs = failover. Replica pair A: Master A in DC1, Slave A in DC2. Replica pair B: Master B in DC1, Slave B in DC2. 16GB RAM on every node.

  • 15.

    Tip #3: Pre-provision your oplog files.

  • 16.

    A shell script to generate 75GB of oplog files:

    for i in {0..40}
    do
      echo $i
      head -c 2146435072 /dev/zero > local.$i
    done
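Each file the loop writes uses a size constant just under the 2GB per-file cap; a quick check of what that constant works out to:

```shell
#!/bin/sh
# 2146435072 bytes per oplog file, expressed in MiB.
echo $((2146435072 / 1024 / 1024))   # prints 2047, just under the 2048MiB cap
```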

  • 17.

    Tip #4: Expect slower performance during initial replica sync.

  • 18.

    Tip #5: You can rotate your log files from the console (the logRotate command).

  • 19.
  • 20.

    Tip #6: Index creation blocks by default. Use background indexing if necessary. MongoDB Manual: http://bit.ly/mongobgindex

  • 21.

    Tip #7: Increase your OS file descriptor limit + use persistent connections.

  • 22.

    Too many open files!

    /etc/security/limits.conf (format: user type limit):
      mongo hard nofile 10000
      mongo soft nofile 10000

    /etc/ssh/sshd_config:
      UsePAM yes
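Before editing limits.conf, it is worth checking what the current limits actually are; a low soft limit is what produces the "too many open files" error under load:

```shell
#!/bin/sh
# Inspect the current file-descriptor limits for this shell/session.
ulimit -Sn   # soft limit: what processes get by default
ulimit -Hn   # hard limit: the ceiling the soft limit can be raised to
```

The limits.conf entries above raise both values for the mongo user; UsePAM ensures the limits are applied to sessions started over SSH.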

  • 23.
  • 24.

    Tip #8: 10gen commercial support is worth paying for.

  • 25.

    Summary
    1. Keep indexes in memory.
    2. Monitor the 24k namespace limit.
    3. Pre-provision oplog files.
    4. Expect slower performance on replica sync.
    5. Rotate logs from the console.
    6. Index creation blocks by default.
    7. OS file descriptor limit + persistent connections.
    8. Commercial support is worth it.