Server for 100,000+ users on site at once
How many servers do I need to power a site with 100,000+ users online at once? Where would you recommend I purchase said server/servers from?

It depends a lot on what they are doing. Are they all watching video or playing an interactive game? Are they commenting on an active blog or forum? Are they browsing product listings on an ecommerce site, or just reading articles on a news site? If the traffic is write-heavy, like a forum or blog with active commenting, you'll need a beefy database to handle all of the concurrent writes. Same for the interactive game. For read-heavy sites like the video site, you would have to use a CDN to keep up that sustained video transfer rate and to reduce latency for users. For the product listings or the news site, you could get away with a lighter server setup and use static caching and CDN content mirroring to ease the load on your servers.

It would also depend on whether this traffic is regular throughout the year or spikes at predictable times (think of an Apple product blog after WWDC or a coupon blog on Black Friday). If you have regular, predictable traffic, you can buy or lease dedicated servers; for traffic that spikes, look into cloud-based servers like Amazon EC2, where you can scale up and down at will.

At my current job (a media company that operates 850 radio stations in the US), we just have a few load-balanced frontend servers with Squid and memcache and a few database servers behind Akamai/Edgecast CDN mirroring. All of the traffic is read-only (articles, transcripts, news), so no actual users ever hit our servers, except to sign up for newsletters. Show audio and video are streamed by Akamai or another provider.

They'll be watching videos (embedded from other sites), listening to music, uploading photos, and sharing data. Our test site is running very slowly right now, and it only loads 20 activities at a time.

At the very least, make sure you have a fairly modern machine (dedicated, not shared) of around 2 GHz, a 100 Mbit link to the net instead of 10 Mbit, and Nginx/php-fpm instead of the Apache webserver installed. That should take you a long way toward your minimum requirements. If it does not do the trick, you will probably have to cluster or cloud several machines. Clustering sits closer to the bare machine but is harder to install and fine-tune/maintain; clouding means installing virtual machine software, which is easier but somewhat costlier in resources. Also check your SQL: often a lot of CPU cycles are wasted on shoddy database interaction.

Thanks for the advice! Checking my SQL is a really good idea; I really do need to reorganize the database.

How is your site structured? 1) You want to move static data off your system onto another service like S3. 2) You want to do whatever you can asynchronously, i.e. if somebody shares data, you don't have to make it available immediately. Just place it in a queue and have a second system go through and deal with it (a rough sketch follows).
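A rough sketch of the queue idea in point 2, assuming a Redis instance and the redis-py Python client; the queue name and the share_item/process_share functions are placeholders, and a PHP stack would do the equivalent push with its own Redis client:

```python
# Sketch of "queue it now, deal with it later", assuming Redis + redis-py.
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def share_item(user_id, payload):
    # Called from the web request: enqueue and return immediately.
    r.rpush("share_queue", json.dumps({"user": user_id, "data": payload}))

def process_share(item):
    # Placeholder for the real work (thumbnailing, writing to the DB, etc.).
    print("processed item from user", item["user"])

def worker():
    # Runs as a separate process (or on a second machine) and drains the queue.
    while True:
        _key, raw = r.blpop("share_queue")  # blocks until an item arrives
        process_share(json.loads(raw))
```

The web request only does the cheap enqueue and returns; the worker can fall behind during a traffic spike without slowing down page loads.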
1. Yes
2. Yes

People submit data and it's displayed, 20 activities at a time, and if all of those activities are videos, it takes about 17 seconds to load.

Why is it taking 17 seconds to load if the page you are serving them merely contains links? What programming language/web server/cloud server is your test site using?

Staminus.net for hosting, and it's coded in PHP and MySQL.

How is your test suite running? Are you running 20 instances of your site on a single test machine? Is it possible that the machine is the bottleneck? While the machine is doing its 20 tests, can you load the site from another, unburdened machine?

I think part of the problem is that we are loading embedded videos instead of thumbnails; it's taking about 17 seconds to load the page with 20 videos now. We have a fairly basic machine through staminus.net, but I really want to get my own server that can handle 100,000+ users at a time. I hadn't considered putting the load on another, unburdened machine, but that's a really good idea.

Your software matters a heck of a lot more than your hardware. You pretty much have to be using a non-blocking, event-based server to achieve this level of concurrency efficiently; forking/threading processes will never perform anywhere near as well. Some non-blocking, event-based frameworks: Node.js, Twisted, Tornado, AnyEvent, libevent. You probably want to create a program that all clients connect to and that acts as a coordinator for the whole system; any actual heavy lifting can be done on other servers (CPU/IO-intensive tasks, etc.). Each web client can establish one connection to your "coordinator" server process and additionally make whatever HTTP requests are necessary to save/fetch data. This was historically known as "the C10K problem", as in 10,000 concurrent connections; modern hardware and epoll/kqueue make 10K pretty easy in many cases (see the sketch at the end).

It would really depend on what those users are doing. If they're reading a single HTML page, you could handle that fairly easily. If they're playing a game, it's another story.
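To make the coordinator idea concrete, here is a minimal sketch using Tornado, one of the frameworks named above; the /ws path, the port, and the simple fan-out behaviour are only illustrative, and Node.js or Twisted would look much the same:

```python
# Minimal non-blocking "coordinator" process: one event loop (epoll/kqueue
# underneath) holds every client connection, while CPU/IO-heavy work is
# handed off to other servers.
import tornado.ioloop
import tornado.web
import tornado.websocket

clients = set()  # one entry per connected web client

class CoordinatorSocket(tornado.websocket.WebSocketHandler):
    def open(self):
        clients.add(self)

    def on_message(self, message):
        # Fan lightweight events out to everyone; heavy lifting happens elsewhere.
        for client in clients:
            client.write_message(message)

    def on_close(self):
        clients.discard(self)

def make_app():
    return tornado.web.Application([(r"/ws", CoordinatorSocket)])

if __name__ == "__main__":
    make_app().listen(8888)
    tornado.ioloop.IOLoop.current().start()
```

Because the loop sits on epoll/kqueue, one such process can hold tens of thousands of mostly idle connections; the real limit is how much work you do per message, which is why the heavy lifting belongs on other servers.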