Going Crazy with Caching – HTTP Caching and Logged in Users

HTTP caching is an efficient way to make your application scalable and achieve great response times under heavy load. The basic assumption of HTTP caching is that, at least for some time, the same web request will lead to an identical response. As long as “same” simply means the same domain name and path, you will get many cache hits. When users are logged in, we have the opposite situation, where potentially everybody sees different content. Let's take a closer look to see where we can still find safe uses for HTTP caching even with logged in users.

Controlling the HTTP Cache Behaviour

An HTTP request is not only the URL, but also the headers. Some headers are only used for statistics or are not relevant to your application, but for some web applications there are headers that matter: the Accept-Language header can be used to decide on the content language, or, when building an API, the Accept header can be used to choose whether to encode the answer in JSON or XML.

HTTP responses can use the Vary header to declare which request headers lead to distinct responses for the same URL. An HTTP cache uses Vary to keep the variants of the same URL apart. This works well when there are few variants – you will still get frequent cache hits. However, if every request comes with a different value for a varied header, caching on the server side makes no sense anymore: there is no benefit in storing results in the cache that will rarely be reused. Even worse, it wastes resources that should be used for caching relevant data.
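
As an illustrative example (hostname and values are made up), a response that depends on the request language could look like this:

    GET /news HTTP/1.1
    Host: www.example.com
    Accept-Language: de

    HTTP/1.1 200 OK
    Content-Language: de
    Vary: Accept-Language
    Cache-Control: public, s-maxage=3600

A cache that respects Vary stores this response separately from the one it gets for Accept-Language: fr, and serves each variant only to matching requests.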

For this reason, caching proxies like Varnish will by default not attempt any caching as soon as there is an Authorization or Cookie header present in the request. Cookies are commonly used to track a session in the application, meaning the user might see a personalized page that cannot be shared with any other user. If you force caching despite cookies and have your application send a Vary: Cookie header, you end up in the situation described above, where you get no value out of your cache.
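
To illustrate, the relevant part of the default vcl_recv logic that ships with Varnish 3 looks roughly like this (simplified sketch, other default checks omitted):

    sub vcl_recv {
        # ... other default checks omitted ...
        if (req.http.Authorization || req.http.Cookie) {
            /* Not cacheable by default */
            return (pass);
        }
        return (lookup);
    }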

The rest of this article digs into various things you can do to still get some use out of HTTP caching:

  • Avoid Session Cookie, remove when no longer needed
  • Delegate to the frontend: “Cheating” with Javascript
  • Different cache rules for different parts
  • User Context: Cache by permission group



On the Migros API – Core principles and challenges

At Liip, we develop an API that provides product information and other data for Migros, the biggest retailer in Switzerland. Let us explain the core principles and challenges we had and how we solved them (or have not solved them yet).

The data of the API is stored in an ElasticSearch cluster. On top of that, we have a couple of application servers that run a Symfony2 application. All requests first go through a Varnish reverse proxy cache. The Symfony2 application also provides commands that are run by cronjobs to import data from a number of systems at Migros into the ElasticSearch cluster. The main data source is their Product Information Management System (PIM), where product information for the whole company is entered and managed.



Integrate Varnish and Nginx into PHP applications with FOSHttpCache

Earlier this week, I released version 1.0 of the caching proxy library FOSHttpCache and the Symfony2 FOSHttpCacheBundle. The library implements talking to caching proxies to invalidate cached content; the bundle integrates the library into Symfony2 and adds other caching-related features to Symfony2.

The library is all about talking to caching proxies to do active cache invalidation. It also provides the foundations for the User Context caching scheme, which makes it possible to share caches that depend not on individual credentials but on common roles or permissions of users.

The Symfony2 bundle FOSHttpCacheBundle integrates those features into Symfony2: it hooks into the configuration, provides annotations for invalidation, supports invalidating by Symfony2 routes, and ships a default user context implementation based on user roles. On top of this, the bundle lets you configure caching headers based on request patterns, similar to the Symfony2 security component, and adds support for cache tagging.
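
As a rough sketch of what such request-pattern rules can look like (the exact configuration keys depend on the bundle version, so treat this as an approximation rather than a copy-paste example):

    # app/config/config.yml
    fos_http_cache:
        cache_control:
            rules:
                -
                    match:
                        path: ^/api
                    headers:
                        cache_control:
                            public: true
                            max_age: 3600
                            s_maxage: 3600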

Both library and bundle are well documented and thoroughly tested. The testing setup includes integration tests running a Varnish or Nginx instance. The test cases can also be reused for functional testing in your applications.

The cache header rules concept has been ported over from the LiipCacheControlBundle which is now deprecated. A migration guide is provided to update your projects.

Development of this started when I met David de Boer from driebit at last year's SymfonyCon in Warsaw. He presented the DriebitHttpCacheBundle, which started from cache invalidation, while the LiipCacheControlBundle started from caching headers. But both bundles had already begun to overlap. We decided to build one common codebase that does both tasks. After more than half a year of writing better tests, refactoring and properly documenting, the code is now in a very good state. Besides David, a shoutout has to go to Joel Wurtz for contributing the user context implementation, Simone Fumagalli for the Nginx integration, Christophe Coevoet (stof) for his valuable feedback and reviews, and all the other contributors of the library and the bundle (some of them coming from the LiipCacheControlBundle git history).


Collecting performance data with varnish and statsd

One of our bigger current projects is going online soon, and one part of it is an API that delivers only JSON and XML to a lot of possible clients. To speed things up, we use varnish in front of the application (a lot of requests are easily cacheable, since they change once a day at most). We also use HHVM for some of the requests, for now just the ones that will have many cache misses (livesearch requests, for example). We don't yet dare to use HHVM for all requests; we'd like to gather some experience with it first.

One important task for such a project is to monitor the performance of the application. We usually use New Relic for that, but there’s no New Relic for HHVM (yet), so we had to look into other solutions.

Since varnish is the entry point for all requests, be it PHP or HHVM or another backend, we searched for possibilities there. We found two vmods, libvmod-timers and libvmod-statsd, which make that possible.

Collecting response time and number of requests

This needs a statsd instance and a backend for storing the collected data. We decided to try librato as the backend, as we have heard good things about it (by the way, if you have to configure many metrics in librato with the same attributes over and over, use the librato API: it is more powerful than their web frontend and you can script it for new metrics later).
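
A minimal statsd configuration pointing at librato might look roughly like this (the package name and keys follow the statsd librato backend documentation; email and token are placeholders):

    // config.js for statsd
    {
      port: 8125,
      backends: [ "statsd-librato-backend" ],
      librato: {
        email: "metrics@example.com",
        token: "LIBRATO_API_TOKEN"
      }
    }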

This setup allows us to collect detailed data about the performance of the different backends (and also to monitor, for example, whether there are too many 500 responses).

Compared to the examples in the blog posts about the vmods, I made some small changes to get the right data for our setup.

To be sure our API is “always” up, we use the fallback director of varnish, especially for the HHVM part (we have much more experience in running a PHP server than an HHVM one, that's why). It looks like the following:
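
A sketch of such a setup in Varnish 3 syntax; hosts, ports and probe settings are placeholder assumptions, and the director name matches what is referenced below:

    probe healthcheck {
        .url = "/";
        .interval = 5s;
        .timeout = 1s;
        .window = 5;
        .threshold = 3;
    }

    backend hhvm_api {
        .host = "127.0.0.1";
        .port = "8081";          # HHVM port, an assumption
        .probe = healthcheck;
    }

    backend php_api {
        .host = "127.0.0.1";
        .port = "8080";          # PHP port, an assumption
        .probe = healthcheck;
    }

    # try hhvm_api first, fall back to php_api when it is unhealthy
    director hhvm_api_fallback fallback {
        { .backend = hhvm_api; }
        { .backend = php_api; }
    }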

This means that varnish will normally use the hhvm_api backend, but if that is down, it will use the php_api backend. They run on the same server but listen on different ports (of course, the whole thing has a load balancer in front of it to avoid single points of failure).

As we only route some requests to the hhvm_api backend, we have the following in vcl_recv:
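
A hedged sketch of the idea (the URL pattern and the header value are assumptions, not our exact rules):

    sub vcl_recv {
        # send only selected requests (e.g. livesearch) to the HHVM backend
        if (req.url ~ "^/livesearch") {
            set req.backend = hhvm_api_fallback;
            set req.http.X-Request-Backend = "hhvm_api_fallback";
        } else {
            set req.backend = php_api;
            set req.http.X-Request-Backend = "php_api";
        }
    }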

We can then later extend that URL matcher to include more requests.

The examples by the author of the vmods use req.http.X-Request-Backend to report to statsd which backend was used. With the fallback director defined above, that value is always hhvm_api_fallback, no matter which backend actually served the request, and we wanted to know when the fallback uses php_api instead of hhvm_api. So we added the following to vcl_fetch:
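
Something along these lines (the X-Stats-Backend header name is an assumption; beresp.backend.name is the name of the backend Varnish actually fetched from):

    sub vcl_fetch {
        # record the backend that actually served this request,
        # i.e. hhvm_api or, when the fallback kicked in, php_api
        set req.http.X-Stats-Backend = beresp.backend.name;
    }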

Now the backend that was actually used is reported (at least for misses; for hits it is still hhvm_api_fallback, but that is ok, since a hit did not use any backend at all, though I'm sure this could be changed as well with some header trickery in the vcl).

The librato backend also seems to interpret some stats differently than, for example, the default graphite backend of statsd. Therefore we used the following lines in vcl_deliver for sending the values to statsd:
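
A sketch of the idea, using the statsd and timers vmod functions as documented by their author; the metric key layout, helper header names and server/prefix values are assumptions:

    import statsd;
    import timers;

    sub vcl_init {
        statsd.server("localhost", "8125");
        statsd.prefix("api.");
    }

    sub vcl_deliver {
        if (!req.http.X-Stats-Backend) {
            # cache hit: no backend was contacted, keep the name from vcl_recv
            set req.http.X-Stats-Backend = req.http.X-Request-Backend;
        }

        # build the metric key as <backend>.<status>, e.g. "hhvm_api.200"
        set resp.http.X-Stats-Status = resp.status;
        set resp.http.X-Stats-Key =
            req.http.X-Stats-Backend + "." + resp.http.X-Stats-Status;

        # separate .count and .time leaves, so librato does not get
        # counters and timers on the same key
        statsd.incr(resp.http.X-Stats-Key + ".count");
        statsd.timing(resp.http.X-Stats-Key + ".time", timers.req_response_time());

        # do not leak the helper headers to clients
        unset resp.http.X-Stats-Key;
        unset resp.http.X-Stats-Status;
    }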

We basically added .count and .time to the key, as librato gets confused if you use the same key for counters and timers.

With this data we can now plot graphs for each backend and each request type in librato. We immediately see if there are problems (for example, too many non-200 responses or one backend being slower than expected). We can also compare hits vs. misses and collect the load of the servers with collectd.

Logging slow requests

Having an overview of how fast your setup is is one thing; knowing exactly which requests are slow is another. For the PHP backend we have New Relic for that. It keeps track of slow requests with full call traces and, e.g., SQL queries and their performance. It's very useful, also for error reporting. For HHVM nothing similar exists yet (to our knowledge at least), and we didn't want to clutter the code with too many home-grown analysis calls. We decided that for now it's enough to just know which calls were slow, so we can debug them if it happens too often. And this is possible with varnish and the timers vmod. All we added for this in vcl_deliver was:
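
Roughly like this (again a sketch; the 500 threshold assumes the timer unit is milliseconds):

    import std;
    import timers;

    sub vcl_deliver {
        # always log which backend handled the request
        std.log("Backend: " + req.http.X-Stats-Backend);

        # additionally tag requests that took longer than 500ms
        if (timers.req_response_time() > 500) {
            std.log("SlowQuery: " + req.url);
        }
    }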

This always logs the backend that was used to the varnish logs. And if the response time was slower than 500ms, we additionally log the request as SlowQuery.

Then we can use varnishncsa to get readable logs and write them to a file. We use the following command for that:
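
Roughly like this; the log path and the exact field order are assumptions, but the format placeholders are standard varnishncsa ones:

    varnishncsa -a -w /var/log/varnish/api-access.log -F \
        '%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i" %{Varnish:hitmiss}x %{VCL_Log:Backend}x %{VCL_Log:SlowQuery}x %D'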

This is the usual logfile format of web servers with some info added at the end: whether it was a hit or a miss, which backend was used (logged via the std.log command in the vcl) and how long the request took (the %D parameter). We can then filter to only log requests that have the SlowQuery tag and go to the hhvm_api backend (the others are recorded by New Relic, but we could of course log them here as well).

We then collect those logs with logstash and send them to our graylog2 server. From there we can, for example, set up alerts, or analyse the slow requests, try to reproduce the slowness and make them faster.

It's of course not as powerful as the full toolset of New Relic (or similar), but at least we now know which requests are slow. And we can also use this for other backends, should the need arise.

If you have any input on what could be improved, or if you know about other tools to collect stats for HHVM (or other currently exotic systems), let us know. We're eager to hear about them. (By the way, JavaScript or browser-based solutions are out of the question here, since the API backend just sends JSON or XML, so there is no possibility to inject JavaScript.)


Reverse proxy cache invalidation service

Fellow Liiper Lukas writes about a “Reverse proxy cache invalidation service” for Varnish and Symfony2. We use it for a big news site, where caching and “on time” delivery of articles is of utmost importance. The service is not finished yet, but if all goes well, it will be completed with some of Liip's innovation budget and consequently open sourced later.

Now go and read the whole thing, it’s a very interesting approach in my opinion.


Webtuesday talk about Varnish

Yesterday, the still rocking Webtuesday was held in the gorgeous new Zurich offices of Namics, with more than 50 people attending. The topic was “Lightning Talks”, and one of them was given by Liiper Pierre Spring about how we used Varnish in a project for a client. The slides are now available on SlideShare, in case you missed it or want to relive the experience.
