How We Scaled Pinterest From Zero Users To A $2 Billion Valuation

Ben Silbermann, CEO at Pinterest (Flickr/Pinterest HQ)

Pinterest engineers Yashwanth Nelapati and Marty Weiner recently shared some insights and lessons learned while developing and scaling the company.

Pinterest is a site for collecting and sharing photos of interesting objects around the web.

In a little over three years, Pinterest has grown from zero page views a day to billions per month. The most recent tally: 3.4 billion monthly page views from its 25 million members worldwide.

Today, Pinterest is valued at $2.5 billion following a $200 million financing round earlier this year. 

This slideshow talks about the software and hardware Pinterest used to get where it is today. Warning: It's for geeks only!

Here we go!

InfoQ

You can follow pins and boards from people you know.

Here, you can see the pins from everyone you follow.

So users have boards and relationships.

Here's Pinterest's page view count at the beginning.

In March 2010, the team wasn't working with much from an infrastructure point of view.

Nine months later...

The product and architecture evolved.

Pinterest started doubling page views every month and a half, but everything was breaking.

So they ended up with five major technologies just for the data alone.

So they started dropping off technologies and did a massive restructuring of the architecture.

Here's what they changed the architecture to.

Pinterest's web traffic continued to increase.

Pinterest started to put more resources into its architecture to handle its growth.

Pinterest uses Amazon EC2/S3 for a few reasons. The main one: you can have new instances ready in a matter of seconds.

But there is limited choice.

The open source database MySQL has proven to be a solid choice for Pinterest. It's incredibly mature and you can hire for it, as lots of engineers know MySQL.

Memcache is also incredibly mature, and it never crashes.

Redis isn't very mature, but it's simple.

Pinterest realized that during its rapid growth, it needed to spread the data evenly to handle the load. So they defined a spectrum of options between clustering and sharding.

With clustering, everything is automatic.

But sharding is a completely manual data placement process. It's used to separate databases into smaller, faster, and more manageable data pieces called shards.
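
As a rough illustration of what manual placement means, the sketch below keeps an explicit, hand-edited table from shard-number ranges to database hosts and looks a shard up in it. The host names, the range sizes, and the `host_for_shard` helper are hypothetical and are not taken from the slides.

```python
# Hypothetical shard map: each entry assigns an inclusive range of shard
# numbers to one database host. Nothing here is automatic; an engineer
# edits this table by hand to move or split a range.
SHARD_MAP = [
    (0, 511, "db-a.example.internal"),
    (512, 1023, "db-b.example.internal"),
]


def host_for_shard(shard_id: int) -> str:
    """Return the host that owns shard_id according to the table above."""
    for low, high, host in SHARD_MAP:
        if low <= shard_id <= high:
            return host
    raise KeyError(f"no host configured for shard {shard_id}")
```

Because the table is plain data, moving a range of shards to a new machine is a one-line change that a person makes deliberately, rather than something a cluster manager decides.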

With clustering, if there's a massive bug, it will impact every single node. A single point of failure (SPOF) brought down Pinterest four times.

With sharding, everything is manual. And that's a good thing.

If your project has a few terabytes of data, you should shard as soon as possible. When Pinterest's Pin table reached one billion rows, the indexes ran out of memory. That's when the company decided to shard.

So Pinterest froze some of its features to start the transition from clustering to sharding.

The less data you move, the more stable your architecture will be.

Since they wanted to shard on MySQL, they projected growth for the next five years.

Pinterest initially put their databases on 8 physical servers.

For high availability, Pinterest ran MySQL in multi-master replication mode.

With an increased load on a database, Pinterest replicated a server to handle some of the data nodes.

Since Pinterest is on AWS and MySQL queries took about 3 milliseconds, they decided to build the location into the ID.
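
One way to build the location into an ID is to pack a shard number and a per-shard local ID into a single integer, so the shard can be read straight back out of the ID. The sketch below does this with bit shifts. The bit widths and the function names are illustrative assumptions, not the layout shown in the slides.

```python
from typing import Tuple

SHARD_BITS = 16  # assumed width of the shard number
LOCAL_BITS = 36  # assumed width of the per-shard local ID


def make_id(shard_id: int, local_id: int) -> int:
    """Pack a shard number and a local ID into one integer."""
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id out of range")
    if not 0 <= local_id < (1 << LOCAL_BITS):
        raise ValueError("local_id out of range")
    return (shard_id << LOCAL_BITS) | local_id


def split_id(packed: int) -> Tuple[int, int]:
    """Recover (shard_id, local_id) from a packed ID."""
    return packed >> LOCAL_BITS, packed & ((1 << LOCAL_BITS) - 1)
```

With a scheme like this, any service holding an ID can work out which shard to query using arithmetic alone.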

If Pinterest has 50 IDs, for example, they split them up and run them in parallel. This is what Pinterest's lookup/rendering structure looks like.
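
The split-and-parallelize step could look something like the sketch below: group the IDs by the shard encoded in each one, then query each shard concurrently and reassemble the results in the caller's order. `shard_of` and `fetch_from_shard` are hypothetical stand-ins (the latter fakes a database call), the bit width is assumed, and a thread pool is just one possible way to run the groups in parallel.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

LOCAL_BITS = 36  # assumed: low bits hold the local ID, high bits the shard


def shard_of(packed_id):
    """Read the shard number out of the high bits of an ID."""
    return packed_id >> LOCAL_BITS


def fetch_from_shard(shard_id, ids):
    """Stand-in for one batched query against a single shard."""
    mask = (1 << LOCAL_BITS) - 1
    return {i: f"row {i & mask} on shard {shard_id}" for i in ids}


def fetch_many(ids):
    """Group IDs by shard, query the shards in parallel, keep input order."""
    by_shard = defaultdict(list)
    for i in ids:
        by_shard[shard_of(i)].append(i)

    results = {}
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [
            pool.submit(fetch_from_shard, shard, group)
            for shard, group in by_shard.items()
        ]
        for future in futures:
            results.update(future.result())
    return [results[i] for i in ids]
```

Each shard receives one batched request regardless of how many of the IDs it owns, so the total wait is roughly that of the slowest shard rather than the sum of all of them.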

All of Pinterest's data falls into two categories: objects or mappings.
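
To make the distinction concrete, here is a small sketch in which an object is a bag of attributes keyed by its own ID, and a mapping records that one ID relates to another (a board having pins, for instance). The class names, the fields, and the in-memory list are hypothetical and do not reflect Pinterest's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredObject:
    """An object: one ID plus its attributes (e.g. a pin, board, or user)."""
    id: int
    data: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Mapping:
    """A mapping: a relationship from one ID to another."""
    from_id: int
    to_id: int
    sequence: int  # hypothetical ordering column


def related_ids(mappings: List[Mapping], from_id: int) -> List[int]:
    """Return the IDs that from_id maps to, ordered by sequence."""
    rows = [m for m in mappings if m.from_id == from_id]
    return [m.to_id for m in sorted(rows, key=lambda m: m.sequence)]
```

A page would then be assembled in two steps: read the mapping rows to get a list of IDs, then fetch the objects those IDs name.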

Here's how Pinterest brings up a user profile. Most of the calls are served from the cache (Memcache or Redis).
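
A cache-first lookup of this kind can be sketched as follows: check the cache, and only on a miss read from the database and store the result for next time. The `CachedStore` class is hypothetical, plain dictionaries stand in for MySQL and for Memcache or Redis, and the exact caching policy is an assumption since the slides do not spell it out.

```python
class CachedStore:
    """Serve reads from a cache, falling back to the database on a miss."""

    def __init__(self, database):
        self.database = database  # dict standing in for MySQL
        self.cache = {}           # dict standing in for Memcache or Redis
        self.db_reads = 0         # counts how often the database was hit

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the database and remember the value.
        self.db_reads += 1
        value = self.database[key]
        self.cache[key] = value
        return value
```

Repeated requests for the same profile then cost one database read in total, which is the effect the slide describes.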

Pinterest built a huge scripting farm to move 500 million Pins and 1.6 billion follower rows. Scripting is what happens when you need to move from the old, unsharded system to the sharded one.

Megan Rose Dickey formerly covered tech startups focused on the shared economy and music industry. Dickey previously wrote for LAUNCH Media, where she covered the startup industry and organized tech conferences. She graduated from the University of Southern California in 2011 with a degree in Broadcast and Digital Journalism.