Varnish 4.0.2 released
My experience with Varnish was when we tried to use it for something that had relatively complex caching rules. The config/VCL became spaghetti: attaching values to req and restarting the flow. This appeared to be the normal way to write VCL.
We ended up writing our own system, with our own high-concurrent LRU cache, that was more tightly coupled with our application servers (and thus able to figure out what the cache key should be). It ended up being trivial to then add things like ESI, purging, grace and saint mode.
Point being, since then, I've had a hard time seeing where Varnish fits between proxy_cache for simple url+query caching, and rolling your own.
Agreed. Varnish is great if you have an existing web application that needs a quick-and-dirty caching layer built in front of it.
Varnish gives you enough configurability to do just about anything, but the end result will be a spaghetti mess of VCL, inline C and possibly custom varnish modules. The more sustainable route is to build your own caching front end, however I think it's a bit unrealistic to assume that all startups have the engineering talent (or time) in-house to implement that sort of thing.
Wow, this (and parent-post) really colors my understanding about how far one can/should go with Varnish before building something more tailored. Thanks.
I think your description might fit if performance and scalability is a non-issue for your site.
Your own caching will make your webapp (or API) scale (say, 2x faster, or serving 15x the number of users), but if you need your app to take a serious beating (say, 100x to 400x the number of users with minimal TTFB) because your site/app suddenly becomes flavour of the month on the Internet, then you will really need Varnish.
After seeing several Paywall and API deployments and all sorts of other advanced business logic, such as GeoIP detection or Mobile device classification, applied in the cache layer while keeping a sane and simple VCL, I am not sure your initial point is the case. The normal way to write VCL is to keep it simple.
(Yes, I am biased, but my point is still valid).
We were handling over 10K req / sec through that layer of our system and load tested the entire system to over 100K, testing various cache hit ratios (our 95th percentile uncached response time was ~5ms for our core service). At that scale, a sudden 400x spike isn't something you worry about -- our various network providers would null route us way before we ever got there.
When we moved away from Varnish towards our own integrated cache, our hit ratio went from 50% to 80% on average, and some individual routes hit over 95% (we proactively purged and refetched in the background through a queue).
So maybe you're right, but for the opposite reason that you state. Our scale and performance requirements were so great, a custom solution made more sense.
I think that the point with Varnish and VCL is that you can adapt it, tailor it and even extend it (the language with additional functionality from external libraries i.e. cURL or memcached) to the point that it fits your system just the way you need it.
Obviously, depending on what you were trying to do and to what extent Varnish was extensible then (VMODs were added in 3.0), it might not have been the best option there is.
Anyway, thank you for your reply. That was a very interesting insight.
Does anyone have a fairly basic Varnish config that deals with a) non-www to www redirects, b) works with https certificates (with the Cloudflare config), c) allows CORS for fonts on CloudFront and d) switches off caching for all /admin pages?
Varnish has proven to be very hard to use with nginx in the above setup. In particular, the SSL bit is proving really hard to understand.
Can it also work with spdy?
> d) switches off caching for all /admin pages?
Just get your /admin-system to send the correct Cache-Control headers, Varnish will respect the headers, and serve it correctly.
a) I just let Nginx do the redirect. Varnish can cache 302's so after the first one it won't hit Nginx again until the cache times out.
b) You have to terminate SSL connections before hitting Varnish. There is very little chance SSL termination will ever be part of Varnish. On AWS you can terminate with ELB.
c) Varnish can add HTTP headers and it will respect the headers your backend sets.
d) Varnish can disable caching for a specific path or again it will respect the No-Cache headers a backend sets.
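For d), a minimal sketch in VCL 4.0 (the backend host/port are placeholders; adjust to your app server):

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";   # placeholder app server
    .port = "8080";
}

sub vcl_recv {
    # d) never cache anything under /admin
    if (req.url ~ "^/admin") {
        return (pass);
    }
}
```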
For b), we don't use AWS - we are hosted on the SoftLayer cloud. Does that mean that if you use Varnish with SSL, you NEED to use HAProxy?
Or is it nginx (SSL termination) -> varnish -> nginx (server) -> Rails
Yes, I usually set up Nginx just to listen to port 443 and forward to Varnish on the same machine. You could do the same with HaProxy.
It is not as bad as it seems, since using Nginx to terminate SSL and forward is just a really simple config file. If you have a lot of cached resources then the path is mostly Nginx -> Varnish. Varnish is probably over 100 times faster than Rails at serving anything cacheable.
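A minimal sketch of the terminator side, assuming Varnish listens on its default 6081 and using placeholder domain and certificate paths:

```nginx
# nginx :443 -> varnish :6081 -> app server
server {
    listen 443 ssl;
    server_name www.example.com;               # placeholder domain
    ssl_certificate     /etc/ssl/example.crt;  # placeholder paths
    ssl_certificate_key /etc/ssl/example.key;

    location / {
        proxy_pass http://127.0.0.1:6081;      # Varnish's default listen port
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```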
Not generating the www redirect (IMO that's more of a web server / application layer job than a cache job); but you can make varnish cache www and non-www objects together:
    if (req.http.host == "www.foo.net") {
        set req.http.host = "foo.net";
    }

Is this ever a good idea? Doesn't making www.foo.net and foo.net separate but identical cause an indexing split (and index ranking split) at search engines? Why would you ever want that split for the same pages at two different domains, unless they're serving different content?
Now that you mention it, I forget why that specific snippet is in my config... it is part of a block of similar-but-more-useful things (specifically, having multiple caches serving the same static content from a variety of different domain names)
So, what's the answer? This is why Varnish is so confusing. I have my non-www to www redirects working perfectly on nginx.
should I do anything special in varnish, or should I trust it to cache the nginx redirect headers by itself ?
The answer is have nginx do the redirect, varnish will handle it correctly
Varnish left a sour taste in my mouth. We have used it on one of our high-traffic sites, and had some truly bizarre hard-to-reproduce problems with it.
The major bug it had was that it would work normally for hours, but then it would randomly let the flood of requests through basically killing our servers. It happened over and over and over again. We had multiple talented sysadmins look at it, and none of them could give us any explanation. The only solution was to restart it, and warm the cache all over again. We couldn't figure out what sets it off, it looked just so random.
That sounds like a "hit-for-pass" scenario: Something from your backend told Varnish "Don't cache this" and varnish stopped doing so.
This. The thing to remember is that you always have to set beresp.ttl in vcl_fetch in a hit-for-pass situation. Varnish caches the decision to hit-for-pass (or lookup or whatever), so if you do a hit-for-pass and your TTL is 1 hour, Varnish will hit-for-pass that cache key for the next hour without running your VCL logic again.
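A sketch of what that looks like in Varnish 3 VCL (the 120s cap is an arbitrary example value, and the Set-Cookie check is just one common trigger):

```vcl
sub vcl_fetch {
    if (beresp.http.Set-Cookie) {
        # Cap how long the hit-for-pass decision itself is cached;
        # otherwise the object's TTL applies to the pass decision too.
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
}
```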
Except we didn't have that. This was 99.9% content pages dynamically generated and nearly static. They all had roughly the same headers that haven't changed really, and they were not even aware of Varnish.
Response headers were the first thing we checked; there are only a few of them that affect Varnish.
Are you sure it wasn't an issue with cache evictions?
All pages at the same exact time?
Next time call me and I will make sure that the best minds doing cache invalidation in the industry have a look and fix your issue.
[Varnish Software sales hat on]
Varnish and caching in general can be a complete mind-freak the first few times you experience it, however after working with it nearly daily for the last 3 years, I've come to love it. It has allowed me to scale a number of WordPress websites far beyond where they had business being (34mm UV/180mm PV per month on a handful of servers). Super excited to see them continuing to build in great features.
I have tried, but failed miserably, to set up a Varnish front-end for a WordPress setup - especially with pretty URLs. The hard part was getting Varnish to work with the admin interface - especially image uploads - and comments.
Do you know of any good starting config file that can be used ?
A couple of resources for Varnish 3:
* https://github.com/mattiasgeniar/varnish-3.0-configuration-t... (fairly well maintained with CMS specific VCL examples)
* https://github.com/slashsBin/nuCache (library for different programming languages, not so CMS centric)
Both of these are listed in our utilities directory:
* https://www.varnish-cache.org/utilities
Other Varnish (VCL) extensions can be found in the VMOD directory:
* https://www.varnish-cache.org/vmods
If you have ideas on how to make both these resources more visible and accessible for our users, please let me know.
@ruben_varnish - this is great ! thanks.
I think it would be great if varnish had an official repo with different configuration templates for different stacks that could be used as "drop-in".
For example, my question about a fairly basic Rails config (https://news.ycombinator.com/item?id=8431927) had a couple of different answers disagreeing about stuff.
Also, things like SSL termination are something that the documentation skips over entirely - I understand completely that Varnish does not implement it, but Varnish necessitates the introduction of certain other players in the stack, which it should document.
For example, a canonical example of varnish setup as Nginx (SSL termination) -> Varnish -> Nginx (reverse proxy) -> App would be very useful.
It's not difficult. You have to exclude (pass or pipe) either POST and /wp-admin OR all requests with wordpress logged in cookies. That's the basics of it. Also the resources Ruben listed are super great.
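A Varnish 3 sketch of those basics (the cookie names match stock WordPress, but verify against your plugins):

```vcl
sub vcl_recv {
    # Never cache POSTs or the admin/login area
    if (req.request == "POST" || req.url ~ "^/wp-(admin|login)") {
        return (pass);
    }
    # Bypass the cache for logged-in users and commenters
    if (req.http.Cookie ~ "wordpress_logged_in_|comment_author_") {
        return (pass);
    }
}
```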
hit me on twitter @kentonjacobsen
>34mm UV/180mm PV per month on a handful of servers
Can you give more details about the hardware setup?
All bare-metal, 2 basic varnish, 4 beefier web heads, 2 really, really beefy DB + NFS.
hit me on twitter @kentonjacobsen
> If you are using Varnish Cache for business, consider attending the Varnish Summits
If anyone from Varnish AS is reading this (OP?), please consider having a Varnish Summit on the east coast of the US sometime.
Soon.
Please.
We had one in NYC in May. We might be coming back next year :-)
Good news, I'm told they are planning an east coast event in the spring.
Thank you very much!
I'll let the right people know. Thanks for the feedback.
Two questions: (1) Anyone use Varnish in front of Rails? Is it better to just set up some memcached and/or redis and let the app handle it? (2) To deal with session-specific stuff, how common is it to render some common plain html views, then fill in the particulars with JS/Ajax? If you need to support no-JS (heaven forbid) do people use iframes?
Check out ESI blocks: https://www.varnish-cache.org/trac/wiki/ESIfeatures
Varnish will very likely be an order of magnitude or two faster than sending requests to your app. You can get going with a simple config in a few minutes and without breaking anything (just have varnish listen on a different port and use :80 or your app server as the back-end), so I'd strongly suggest playing with it if you're curious. It's pretty impressively fast software, and VCL gives you an awesome abstraction for controlling how the cache handles different routes.
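For example, a minimal default.vcl for trying it side-by-side (the ports are assumptions; start varnishd on an alternate port with e.g. `-a :6081`):

```vcl
# Point Varnish at the app/web server you already run on :80;
# everything else can stay at the built-in defaults to start.
backend default {
    .host = "127.0.0.1";
    .port = "80";
}
```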
Answer to (1): Heroku used Varnish in front of EVERYTHING in their previous stack... So I guess that means that whatever Ruby (or RoR) app you are running you should look into using Varnish.
Has anyone documented how to convert a 3.x config to 4.x config yet? The documentation on the site assumes a level of understanding of the internal workings I don't have.
I'm still running 3.x because they broke a bunch of stuff. I discovered this when I updated a couple of dev boxes and varnish stopped working.
You can use these:
* Upgrading notes: https://www.varnish-cache.org/docs/4.0/whats-new/upgrading.h...
* A script by Fed that will do 70-90% of the job for you: https://github.com/fgsch/varnish3to4
Let us know how it went :-)
We've long been vacillating on using Varnish to power our e-commerce store, but from what we've read, Varnish is not a good solution in cases where cookies are part of most of the web traffic. Is this true?
Varnish is extremely flexible. Its primary design goal is to be a programmable reverse proxy cache -- so it's almost certainly a better choice for a challenging session-dependent use case such as e-commerce than repurposed web servers (such as Apache, nginx, etc).
If every single dynamic page of your site requires knowledge of the session, and must hit the backend, then you're kind of stuck. But there are lots of ways to design around that.
For example... Edge Side Includes, supported by Varnish, let you do front-end caching of page fragments, stitched together by the proxy rather than the web application. This can give you a huge boost (to performance and scale), even if you have to hit the web application for some part of the page.
Keep in mind, once you start adding front-end caching to your application, you need to think of your cache as part of your app. :-)
You might want to take a look at the Turpentine module for Magento, which tightly integrates Varnish with Magento. It's a good example of retrofitting front-end caching, cookie generation, and ESI fragment caching into an existing e-commerce application.
Every site has something that can be cached. Images, js, headers, popular product lists, etc.
If a specific page item can be cached is really up to if the backend application takes cookie state into account when the page item is made.
The Varnish default of not caching anything when cookies are present is because we have no idea if the backend writes "Welcome back, krat0sprakhar!" when it sees the username in an incoming cookie. Sending that item to every user would be unfortunate.
Other than that, I'd recommend you evaluate your needs and not just pick a technology. If your page response times are low enough already, and your backends/appservers scale to as many concurrent buyers you think you need, you're good without Varnish/caching.
I'd add that edge side includes (https://www.varnish-cache.org/trac/wiki/ESIfeatures) can be amazingly useful. Even if you've offloaded all of your static content to a CDN, the ESI features of varnish alone can be a huge win.
This is false. The default configuration does not cache any pages with cookies. You can configure Varnish to ignore cookies, but as others have stated, Varnish cannot magically guess which cookies to ignore and which will modify your content. If your site has /admin, you can tell Varnish not to cache that. The cleanest solution is to never set any cookies for URLs you want to cache. So if you set a login cookie, scope it to /admin.
If you want to know more about it, there is a funny thread about it https://www.varnish-cache.org/lists/pipermail/varnish-misc/2...
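For instance, a login cookie scoped so it never taints cacheable URLs (the names are illustrative):

```http
Set-Cookie: admin_session=abc123; Path=/admin; HttpOnly; Secure
```

With `Path=/admin`, the browser only sends the cookie on /admin requests, so the rest of the site stays cookie-free and cacheable by default.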
In addition to other answers - it depends what the cookies are for -- if your users only get a session cookie after logging in (before that they only have front-end cookies for things like google analytics / remembering which widget on the site is currently active / etc) then you can ignore front-end cookies, thus successfully caching for all not-logged-in users:
    if (req.http.Cookie) {
        # ignore front-end cookies (prefixed with "ui-")
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(ui-[a-z-]+)=[^;]*", "");
        # ignore google cookies
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
        # Remove a ";" prefix, if present.
        set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
        # Remove empty cookies.
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }

In my read-mostly use case, this allows ~90% of site traffic to be served by Varnish, even though the site is totally dynamic (for not-logged-in users, all the state that we care about is in the GET parameters).

This is the way to do it.
Another way of doing it is to exclude all cookies, and only care about session-cookies.
I just throw away the default config and write a vcl to do what I want.
You can cache while using cookies. If you use a cookie to change the page in any way add Cookie to you Vary headers. In Varnish I don't cache response with Vary ~ "Cookie". I also don't cache any response that sets a cookie.
Now if you have a backend that does modify the page using what is in a cookie but doesn't set the Vary header, then using Varnish can be quite a pain. This is hardly Varnish's fault that the backend is crappy.
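The policy described above might look something like this in Varnish 3 VCL (a sketch, not the poster's actual config):

```vcl
sub vcl_fetch {
    # Don't cache responses that vary on Cookie or that set one
    if (beresp.http.Vary ~ "(?i)Cookie" || beresp.http.Set-Cookie) {
        return (hit_for_pass);
    }
}
```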
This is false. I have personally configured Varnish to power blogs, marketplace websites, and a Magento powered E-commerce store.
If you've worked with Magento then you know it relies heavily on cookies and sessions. Even with all the difficulties of modifying Magento, I was able to configure Magento and Varnish to work together perfectly within a week.
I can point you to the Magento site running with Varnish if you send me an email.
You can use Edge Side Includes to get around this. Essentially, you write a page to Varnish containing ESI tags, and it caches that page. The ESI tags, otoh, are parsed every time the page is accessed.
So it's useful if some amount of your content is always the same, with other parts fetched on every request (or cached depending on cookies, or whatever else).
If your site is not completely dynamic there is no sane reason to hit the database or disk for every visitor. Serve it out of the cache. Improve your visitors' experience, and lower the load on your hardware to insignificant levels.
Nearly every site uses cookies in some way. Many of them (like mine!) are still able to make excellent use of Varnish.
I had to click three links to discover what Varnish is. So either I'm dumb or blind, or there is no single description of what the product does in the home page copy.
It's probably assumed to some degree that you know what it is at this point. It's the industry standard HTTP full-page cache daemon.
I love varnish announcements. Their messages seemingly have a 78% likelihood of being from the machines themselves.
And, frankly, I think the machines _are_ happy to make the announcement.