Chris Pirillo’s CDN in a Box looks to make Web sites hum at ‘Google-like speeds’
geekwire.com

Seattle super geek Chris Pirillo has a loyal following who love his offbeat musings on technology. But sometimes Pirillo’s network of sites, including LockerGnome.com, gets overloaded with traffic. Faced with that challenge, Pirillo and his team have developed a new service called CDN in a Box that’s designed to handle huge traffic spikes and accelerate page loads to “Google-like speeds.”

At first, I thought, "Why would I want this? My blog (shameless plug: http://scottporad.com) never gets that much traffic." Then it occurred to me why: we all hope to get slashdotted at some point. That's how our blog readership grows, right? But when that happens, if my server goes down, then what's the point? So, basically, this is an insurance policy, and most likely well worth it.

It's WAY more than an insurance policy (though it's that too). Making your site faster is good for SEO, which can help you go from "never gets that much traffic" to "I get a fair amount of traffic."

Interesting point, but how is it different from CloudFlare?

We don't block the robots, so you don't get delisted by Google for using the service. That'd be a big difference.

CloudFlare doesn't block legit crawlers either. It does cache responses to crawlers, so if a page hasn't changed and Google crawls it again, the request doesn't burden the origin. What's interesting about CDN in a Box is they're serving off a single IP. The problem with this strategy is that Google classifies sites for crawl purposes by IP. That means if one site on CDN in a Box falters, all the other sites on CDN in a Box will suffer (e.g., Google turning down crawl velocity or completely removing them from the index). The same problem occurs if there's anything spammy or compromised by malware. At CloudFlare, we tried the CDN in a Box strategy when we launched more than a year ago. We quickly found it had serious negative impacts on site rankings.
We spent considerable time working directly with Google and the other search engine crawl teams on a solution. Today, sites on CloudFlare actually get the highest crawl velocity setting because of this work, which we've seen positively impact site rankings. I'm curious to hear more about CDN in a Box's plans, discussions with search engine crawler teams, and technologies they've developed to overcome this challenge.

I have personally had to get sites that use CloudFlare re-listed after being booted from AdSense or Google because CloudFlare served a different page to bots than to users and kicked off the Access Restricted Page. You keep saying we serve off of one IP; that is blatantly false. I'll put my crawl rate up against anyone's: we had to have a conversation with Google's team because their bot hit one of our sites for 1.2M crawled pages in 3 hours. Which is nice, but then they did it again the next day, and the next. So we are negotiating to not have to pay for Googlebot traffic. In Webmaster Tools you can't even change the setting of a CDN in a Box site, because Google assigns you a special crawl rate. Bing's bot loves us, because it will often crawl "all pages at once," so it will crawl 10k pages in 30 seconds and go on to the next site.

Also, I confirmed with Google: "Because of the propensity of shared hosting providers, the recycling of IP addresses by cloud services, and the ease at which IP addresses can be changed, we never use IP address as a factor for assigning a penalty to a website. In certain instances where malware is hosted on a site, we may display the 'this site may be harmful' warning against all results on an IP address. The rules for when this happens are not my expertise, but from the perspective of search, your site will never receive a penalty for sharing an IP with a blocked, banned, or delisted site."
Not to rain on anyone's parade here, but real CDN providers like Akamai, Cotendo and EdgeCast offer a far superior service for far less money. This thing is hosted on GAE, which is not fast, and doesn't have the global footprint of even the smallest CDN player.

GAE is fast for things Akamai isn't. GAE actually has a more global footprint, because it has more peering agreements than any provider on the planet, and while you would be right if this were a CDN for pushing 1 Gig files, it is a CDN for pushing your 60k images all over the place. The cost of a cache miss on Akamai for a small file is VERY high, and nearly non-existent on GAE. It's about building the right tool on the right platform for the right kind of job. (Oh, and pricing: you can't even get started with Akamai for less than $1000 a month, which goes a long way on that pricing thing. Plus, Akamai's 8-12 cents is not really 12 cents when you pay for storage and all of the other nickel-and-dime things.) CDN in a Box is just simple pricing and simple deployment. (Have you ever tried to write a blog post and then upload the images to Akamai?)

And how different from Akamai Dynamic Site Accelerator: