PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story

  • 1.

    PHP at 5000 Requests / Sec: Hootsuite’s Scaling Story. Bill Monkman, Lead Technical Engineer - Platform, @bmonkman

  • 3.

    Overview - Selected Current Architecture
    ● Users
    ● Nginx load balancers: lb1, lb2, lb3, ...
    ● Nginx web servers with PHP-FPM: web1, web2, web3, ...
    ● Memcached cluster: mem1, ...
    ● MySQL cluster: master, slave
    ● MongoDB cluster: shard1 (master, slave), shard2 (master, slave)
    ● Gearman cluster: geard1, geard2, worker1, ...
    ● Services

  • 4.
  • 5.
  • 6.
  • 7.

    Solution - Caching Memcached. ● Distributed cache, cluster of boxes with lots of RAM, trivial to scale ● Cache as much as possible, invalidate only when necessary ● Use cache instead of DB ● No joins - decouple entities (collection caching) ● Twemproxy!

  • 8.

    “There are only two hard things in Computer Science: cache invalidation and naming things.” - Phil Karlton

  • 9.

    Solution - Caching MvcModelBaseCaching MvcModelBase MvcModelMysql SocialNetwork

  • 10.

    Solution - Caching
    SELECT * FROM member WHERE org_id=888
    Set individual cache records: member_1 {data}, member_5 {data}, member_9 {data}
    Set collection cache: member_org_888 [1,5,9]
    Automatic invalidation of collection cache
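
The record-plus-collection pattern on this slide can be sketched as follows. This is a minimal illustration, not Hootsuite's implementation: `ArrayCache` is a hypothetical in-memory stand-in for a Memcached client, `MemberRepository` and its `$rows` array stand in for the model layer and the MySQL `member` table, and only the key names (`member_<id>`, `member_org_<orgId>`) come from the slide.

```php
<?php
// Hypothetical in-memory stand-in for a Memcached client (get/set/delete only).
final class ArrayCache
{
    private $items = [];

    public function get(string $key)
    {
        return $this->items[$key] ?? null;
    }

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

// Hypothetical repository; $rows stands in for the MySQL member table, keyed by id.
final class MemberRepository
{
    public $dbQueries = 0; // counts simulated MySQL queries
    private $cache;
    private $rows;

    public function __construct(ArrayCache $cache, array $rows)
    {
        $this->cache = $cache;
        $this->rows = $rows;
    }

    public function findByOrg(int $orgId): array
    {
        $collectionKey = "member_org_{$orgId}";
        $ids = $this->cache->get($collectionKey);
        if ($ids === null) {
            // Collection miss: SELECT * FROM member WHERE org_id = :orgId
            $this->dbQueries++;
            $ids = [];
            foreach ($this->rows as $row) {
                if ($row['org_id'] === $orgId) {
                    $this->cache->set("member_{$row['id']}", $row); // individual record
                    $ids[] = $row['id'];
                }
            }
            $this->cache->set($collectionKey, $ids); // collection holds ids only
        }
        $members = [];
        foreach ($ids as $id) {
            $members[] = $this->cache->get("member_{$id}") ?? $this->loadOne($id);
        }
        return $members;
    }

    private function loadOne(int $id): array
    {
        // Record miss (e.g. evicted): SELECT * FROM member WHERE id = :id
        $this->dbQueries++;
        $this->cache->set("member_{$id}", $this->rows[$id]);
        return $this->rows[$id];
    }

    public function save(array $member): void
    {
        $old = $this->rows[$member['id']] ?? null;
        $this->rows[$member['id']] = $member;
        $this->cache->set("member_{$member['id']}", $member);
        // Invalidate every collection whose membership may have changed.
        $this->cache->delete("member_org_{$member['org_id']}");
        if ($old !== null && $old['org_id'] !== $member['org_id']) {
            $this->cache->delete("member_org_{$old['org_id']}");
        }
    }
}

$repo = new MemberRepository(new ArrayCache(), [
    1 => ['id' => 1, 'org_id' => 888, 'name' => 'Ann'],
    5 => ['id' => 5, 'org_id' => 888, 'name' => 'Bob'],
    9 => ['id' => 9, 'org_id' => 888, 'name' => 'Cy'],
    7 => ['id' => 7, 'org_id' => 999, 'name' => 'Dee'],
]);
$first = $repo->findByOrg(888);  // goes to the simulated database once
$second = $repo->findByOrg(888); // served entirely from cache
```

Because the collection key stores only ids, updating one member rewrites a single record key, and the collection is dropped only when membership may have changed.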

  • 11.

    Solution - Caching It’s hard to scale MySQL horizontally. Now: ● No need to scale MySQL ● Able to serve the whole site on 1 MySQL server ● 500 MySQL SELECTs per second. 50,000 Memcached GETs. ● 99+% hit rate

  • 12.
  • 13.

    Problem Need a way to perform asynchronous, distributed tasks using a single-threaded language.

  • 14.

    Solution - Gearman Gearman. ● Distribute work to other servers to handle (workers also using PHP, same codebase) ● Precursor to SOA where everything is truly distributed ● Many other solutions, queueing systems.

  • 15.
  • 16.

    Solution - Gearman Need a way to perform asynchronous, distributed tasks using a single-threaded language. Now: ● Moved key tasks to Gearman ● Another cluster, scalable separately from web ● Discrete tasks, callable sync or async
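
The idea of "discrete tasks, callable sync or async" can be sketched without a job server. The `TaskQueue` class below is a hypothetical in-process stand-in, not the Gearman API: in a real deployment the pending list would live on the Gearman servers and `work()` would run on separate worker machines. The task name `send_email` is also made up for illustration.

```php
<?php
// Hypothetical in-process stand-in for a job server; this is NOT the Gearman API.
final class TaskQueue
{
    private $handlers = []; // task name => callable
    private $pending = [];  // list of [task name, JSON payload]

    public function register(string $name, callable $handler): void
    {
        $this->handlers[$name] = $handler;
    }

    // Synchronous call: the caller blocks and receives the task's result.
    public function runSync(string $name, array $payload)
    {
        return ($this->handlers[$name])($payload);
    }

    // Asynchronous call: the payload is serialized and queued; nothing is returned.
    public function runAsync(string $name, array $payload): void
    {
        $this->pending[] = [$name, json_encode($payload)];
    }

    // Worker loop: drain the queue and report how many jobs were processed.
    public function work(): int
    {
        $processed = 0;
        while ($job = array_shift($this->pending)) {
            [$name, $json] = $job;
            ($this->handlers[$name])(json_decode($json, true));
            $processed++;
        }
        return $processed;
    }
}

$queue = new TaskQueue();
$sent = [];
$queue->register('send_email', function (array $job) use (&$sent): string {
    $sent[] = $job['to'];
    return 'sent to ' . $job['to'];
});

$result = $queue->runSync('send_email', ['to' => 'a@example.com']);
$queue->runAsync('send_email', ['to' => 'b@example.com']);
$sentBeforeWork = count($sent); // the async job has not run yet
$processed = $queue->work();    // a worker picks it up later
```

The same handler serves both call styles, which is what lets the web tier decide per call site whether it needs the result immediately.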

  • 17.
  • 18.

    Problem Need to store data with the potential to grow too big to handle effectively with MySQL.

  • 19.

    Solution - MongoDB MongoDB. ● Certain data did not need to be highly relational ● NoSQL DB, many other solutions these days ● Mongo can be a pain, lots of moving parts ● Had to make our own sequencer where auto-incremented ids were necessary

  • 20.

    Solution - MongoDB Need to store data with the potential to grow too big to handle effectively with MySQL. Now: ● Multiple clusters containing amounts of data that likely would have crushed MySQL ● Billions of rows per collection, many TB of data on disk

  • 21.
  • 22.
  • 23.

    Problem With a codebase and an engineering team increasing in size, how do we keep up the pace of development and maintain control of the system? (SVN, big branches, merge hell)

  • 24.

    Solution - Dark Launching Dark Launching. ● Wrap code in a block with a specific name ● That name will appear in a management page ● Can control whether or not that block is executed by modifying its value ● Boolean, random percentage, session-based, member list, organization list, etc.

  • 25.

    Solution - Dark Launching
    if (In_Feature::isEnabled('TWITTER_ADS')) {
        // execute new code
    } else {
        // execute old code
    }
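
How a check like `In_Feature::isEnabled()` might evaluate the rule types listed earlier can be sketched as below. The slides do not show the real implementation, so the `FeatureFlags` class, the rule array layout, and the `member_id` / `organization_id` context keys are all assumptions. One deliberate swap: the slides mention a random percentage, while this sketch hashes the member id into a bucket so that a given member gets a stable answer.

```php
<?php
// Hypothetical flag evaluator; rule layout and context keys are assumptions.
final class FeatureFlags
{
    private $rules;

    public function __construct(array $rules)
    {
        $this->rules = $rules;
    }

    public function isEnabled(string $name, array $context = []): bool
    {
        // Unknown flags default to off, so new code stays dark until enabled.
        $rule = $this->rules[$name] ?? ['type' => 'boolean', 'value' => false];

        switch ($rule['type']) {
            case 'boolean':
                return (bool) $rule['value'];
            case 'percentage':
                // Deterministic bucket 0-99 per flag and member (not truly random).
                $bucket = abs(crc32($name . ':' . ($context['member_id'] ?? 0))) % 100;
                return $bucket < $rule['value'];
            case 'member_list':
                return in_array($context['member_id'] ?? null, $rule['value'], true);
            case 'organization_list':
                return in_array($context['organization_id'] ?? null, $rule['value'], true);
            default:
                return false;
        }
    }
}

$flags = new FeatureFlags([
    'TWITTER_ADS'  => ['type' => 'member_list', 'value' => [5, 9]],
    'NEW_COMPOSER' => ['type' => 'percentage', 'value' => 100],
    'OLD_WIDGET'   => ['type' => 'percentage', 'value' => 0],
    'ORG_REPORTS'  => ['type' => 'organization_list', 'value' => [888]],
]);

if ($flags->isEnabled('TWITTER_ADS', ['member_id' => 5])) {
    $path = 'new code';
} else {
    $path = 'old code';
}
```

In this sketch, changing a rule in the array plays the role of the management page: the calling code never changes, only the value behind the name.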

  • 26.

    Dark Launching - Reasons • Control your code • Limit risk -> raise confidence -> speed up pace of releases • “Branching in Production” • Learning happens in Production

  • 27.

    Solution - Dark Launching With a codebase and an engineering team increasing in size, how do we keep up the pace of development and maintain control of the system? Now: ● Work fast with more confidence ● Huge amount of control over production systems ● Typically 10+ code releases to production per day ● Push-based distribution with Consul

  • 28.
  • 29.

    Problem With a rapidly increasing codebase and number of users / amount of traffic, how do we keep visibility into the performance of the code?

  • 30.

    Solution - Monitoring Statsd / Graphite. Logstash / Elasticsearch / Kibana. Sensu ● Statsd for metrics ● Logstash for log events ● Sensu for monitoring / alerting

  • 31.

    Solution - Monitoring Statsd::timing('apiCall.facebookGraph', microtime(true) - $startTime);
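
A wrapper in the spirit of the `Statsd::timing` call above might look like the sketch below. The helper names `statsdTimingLine` and `timed` are hypothetical and the slides do not show Hootsuite's `Statsd` class. The only external fact relied on is the StatsD wire format for timers, `<bucket>:<value>|ms`, with the value in milliseconds.

```php
<?php
// Build a StatsD timer line: "<bucket>:<milliseconds>|ms".
function statsdTimingLine(string $bucket, float $seconds): string
{
    return sprintf('%s:%d|ms', $bucket, (int) round($seconds * 1000));
}

// Hypothetical helper: run $fn, then hand the timing line to $emit even if $fn throws.
function timed(string $bucket, callable $fn, callable $emit)
{
    $startTime = microtime(true);
    try {
        return $fn();
    } finally {
        $emit(statsdTimingLine($bucket, microtime(true) - $startTime));
    }
}

// For illustration the emitter collects lines in memory; in production it would
// typically write each line to a UDP socket (StatsD listens on port 8125 by default).
$lines = [];
$result = timed(
    'apiCall.facebookGraph',
    function () {
        return 'ok'; // stand-in for the real API call
    },
    function (string $line) use (&$lines): void {
        $lines[] = $line;
    }
);
```

Passing the emitter in as a callable keeps the timing logic testable without a running StatsD daemon.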

  • 32.

    Solution - Monitoring Logger::event('user liked from in-stream', In_Log::CATEGORY_UX, $logData);

  • 33.

    Solution - Monitoring • Visibility into the performance and behaviour of your application • Iterate upon your code, measure results • Pairs well with dark launching • Also systems like New Relic

  • 34.

    Solution - Monitoring With a rapidly increasing codebase and number of users / amount of traffic, how do we keep visibility into the performance of the code? Now: ● Able to watch performance / behaviour in real time ● Able to view important events both in aggregate and at a very granular level ● Able to control the system and watch the effect of changes

  • 35.
  • 36.
  • 39.

    Optimizations - Push work to users • Within reason, push work up to users • Make your users into a distributed processing grid • e.g. Stream rendering

  • 40.

    Optimizations - Performance / Risks • Performance is more important than clean code or business requirements (in the instances where they may be mutually exclusive) • Fine line between future proofing and premature optimization • Don’t add burdensome processes, but make it easy for your team to do things the right way • Know your weak spots, protect against abuse

  • 42.

    Technologies Linux, Nginx, ElasticSearch, Varnish, PHP-FPM, MySQL, Jenkins, Scala, MongoDB, Consul, Gearman, Redis, Akka, Python, Memcached, HAProxy, jQuery, ZeroMQ, Backbone, RabbitMQ, EC2, Zend, Docker, Cloudfront CDN, Logstash, Zookeeper, Kibana, Statsd/Graphite, Packer, Vagrant, Nagios, VirtualBox, Spark/Shark, Sensu, Symfony, Riak, Composer, Websockets, Comet, Hadoop, Ansible, Git, Webpack, Redshift

  • 43.

    Problem With a huge and growing monolithic codebase and over 80 engineers, how to keep scaling in a manageable way?

  • 44.

    Solution - SOA SOA. ● Split up the system into independent services which communicate only via APIs ● Teams can work on their own services with encapsulated business logic and have their own deployment schedules. ● We chose to use Scala/Akka for services, communicating via ZeroMQ ● SOA transition made easier by the “no joins” philosophy ● Tons of work

  • 45.

    Solution - SOA SOM. ● “Service Oriented Monolith” ● When splitting up a monolithic codebase, dependencies are what kill you ● Fulfill dependencies by writing interim services using existing PHP code ● Maintain the contract and future Scala services will be drop-in replacements
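
The "maintain the contract" point can be sketched with a PHP interface. Everything named here is hypothetical (`MemberService`, `LegacyPhpMemberService`, `RemoteMemberService`, the `getMember` method, and the transport callable); the slides say the real services communicate via ZeroMQ, which this sketch replaces with a plain callable that returns JSON.

```php
<?php
// The contract that callers depend on; all names here are hypothetical.
interface MemberService
{
    public function getMember(int $id): array;
}

// Interim service: fulfils the contract by wrapping existing monolith code.
final class LegacyPhpMemberService implements MemberService
{
    private $lookup; // callable into the existing PHP codebase

    public function __construct(callable $lookup)
    {
        $this->lookup = $lookup;
    }

    public function getMember(int $id): array
    {
        return ($this->lookup)($id);
    }
}

// Later replacement: same contract, but the work happens in a remote service.
final class RemoteMemberService implements MemberService
{
    private $transport; // callable(string $method, array $params): string (JSON)

    public function __construct(callable $transport)
    {
        $this->transport = $transport;
    }

    public function getMember(int $id): array
    {
        return json_decode(($this->transport)('getMember', ['id' => $id]), true);
    }
}

// Calling code is written against the interface only.
function displayName(MemberService $members, int $id): string
{
    return $members->getMember($id)['name'];
}

$legacy = new LegacyPhpMemberService(function (int $id): array {
    return ['id' => $id, 'name' => 'Ann']; // stand-in for existing model code
});
$remote = new RemoteMemberService(function (string $method, array $params): string {
    return json_encode(['id' => $params['id'], 'name' => 'Ann']); // stand-in for a network call
});

$fromLegacy = displayName($legacy, 1);
$fromRemote = displayName($remote, 1);
```

Since `displayName` depends only on the interface, swapping the interim implementation for the remote one requires no change at the call site.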

  • 46.

    Solution - SOA With a huge and growing monolithic codebase and over 130 engineers, how to keep scaling in a manageable way? Today: ● Transitioning to Scala SOA ● PHP will still be used as the Façade, a thin layer built on top of the business logic of the services it interacts with.

  • 47.
  • 48.

    Thank You! Bill Monkman @bmonkman More Info: code.hootsuite.com