Google top 1000 sites: Interesting stats about them
blog.sucuri.net
Nginx with 14% of market share is indeed very interesting, very close to IIS (17%).
The stats can be misleading: nginx is very good at being a reverse proxy or software load balancer, and tends to be put to use in those contexts with pass-through to existing web servers.
Because the stats look at headers, what gets counted is the last server to touch the response before it hits the internet, and that will be the nginx cache rather than the web server behind it.
How can they detect the programming language in use other than by looking at .php, .aspx, .jsp, etc.? You won't see those on most professionally authored sites that use a router and RESTful URLs.
That's not what RESTful means. What your URLs look like has absolutely nothing to do with REST -- the whole point is that it treats URLs as opaque references to other similarly hypertextual resources.
I believe via HTTP headers. Sometimes (or maybe by default?) PHP installs will add something PHP-specific to the Server: line. I remember having to go in and disable that at one point...
Have a look at Wappalyzer if you're interested in usage statistics for web based applications. It also lists the most popular websites per app.
I'd like to see sites that use mootools, Prototype, YUI, Dojo, etc. Those are some simple statistics to compile I'd think (filename based, or just simple regex of the first 200 characters).
http://wappalyzer.com/stats/cat/JavaScript%20frameworks (click the links for more detail).
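The filename-based idea suggested above could be sketched roughly like this. It is only an illustration: the pattern table and the `detect_frameworks` name are made up for this example, the patterns are guesses at common file names, and this is not how Wappalyzer actually does it.

```python
import re

# Hypothetical patterns, matched against <script src="..."> values only.
# Real detection would need many more rules and would still miss bundled files.
FRAMEWORK_PATTERNS = {
    "MooTools": re.compile(r"mootools", re.I),
    "Prototype": re.compile(r"prototype(\.min)?\.js", re.I),
    "YUI": re.compile(r"\byui\b", re.I),
    "Dojo": re.compile(r"dojo(\.xd)?\.js", re.I),
}

# Pull the src attribute out of each script tag (a regex, not a real HTML parser).
SCRIPT_SRC = re.compile(r'<script[^>]+src=["\']([^"\']+)["\']', re.I)

def detect_frameworks(html):
    """Return a sorted list of framework names whose pattern matches a script URL."""
    found = set()
    for src in SCRIPT_SRC.findall(html):
        for name, pattern in FRAMEWORK_PATTERNS.items():
            if pattern.search(src):
                found.add(name)
    return sorted(found)
```

Sites that concatenate or rename their scripts would slip through, so any numbers from an approach like this are a lower bound.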
Where's google on that list?
Google said that the list excludes "adult sites, ad networks, domains that don't have publicly visible content or don't load properly, and certain Google sites".
See: http://www.google.com/support/adplanner/bin/answer.py?hl=en&...
2- Programming language in use: PHP: 15.3% ASP.net: 14.4% Java: 1.6%
How do they even tell, given that sites can avoid having .php/.asp/.jsp in their URLs if they want to?
We checked the extensions, the "Server:" option in the header and the "X-Powered-By" option. We tried our best :)
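For anyone curious what a check like that might look like, here is a minimal sketch. The lookup tables and the `guess_language` function are assumptions for illustration, not the survey's actual rules, and it works on a URL and an already-fetched header dict rather than making any requests.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical hint tables; the real survey's rules are not published here.
EXTENSION_HINTS = {".php": "PHP", ".aspx": "ASP.NET", ".jsp": "Java"}
HEADER_HINTS = [("php", "PHP"), ("asp.net", "ASP.NET"), ("jsp", "Java"), ("servlet", "Java")]

def guess_language(url, headers):
    """Return a best-guess language name, or None when nothing gives it away."""
    # 1. File extension in the URL path (query string is ignored).
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    if extension in EXTENSION_HINTS:
        return EXTENSION_HINTS[extension]

    # 2. X-Powered-By first, then Server; header names are case-insensitive.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    for header_name in ("x-powered-by", "server"):
        value = lowered.get(header_name, "")
        for needle, language in HEADER_HINTS:
            if needle in value:
                return language

    # 3. Clean URLs and stripped headers leave nothing to go on.
    return None
```

A site with clean URLs sitting behind a proxy that rewrites or drops both headers returns `None` here, which is the "unknown" bucket.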
So were the other 68.7% all unknown?
Well, given that those stats add up to a lot less than 100%, one assumes that they cannot in most cases.