Interview with Per Buer, Founder of Varnish Software #3


It’s been nine months since our last interview with Per Buer, founder of Varnish Software. Varnish Software is one of the leading global caching platforms, powering millions of websites and thousands of well-known online brands. A lot has happened in the past nine months: the Varnish team has expanded, the feature set has exploded, and the base software has undergone a few updates. Varnish Software began life as an open source project; after several years the team developed a commercial offering. With open source projects, having a strong team backing the software is a huge advantage for everyone: customers can be at ease knowing someone is there to help with any challenges, and consistent investment in R&D continues on a non-stop basis. We would like to give a big thanks to the Varnish team for their time.

It looks like the Varnish team is growing. Can you tell us more?

Of course. Our team is close to 40 people now, and we’re fairly distributed. I’m excited to see the team grow, though we’re growing at a moderate pace, since it takes time to find good people who are a good fit for our organization. We’re a truly global company, and the number of nationalities is growing steadily, which makes every day interesting.

Previously you mentioned that Varnish had some challenges working in a CDN environment. Has that changed? If yes, what did you do? 

The problem we faced earlier was Varnish’s limited ability to cache large data sets. With the Massive Storage Engine (MSE) we’ve overcome this limitation, making Varnish Cache Plus a suitable engine to drive CDNs. The flexibility of VCL makes adding rules at the edge simple and gives the CDN the possibility of adding unique business logic.
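To illustrate the kind of edge rule VCL makes possible, here is a hypothetical snippet; the hostname, backend, and rules are invented for the example, not taken from any real deployment:

```vcl
vcl 4.0;

# Hypothetical origin for one CDN customer.
backend example_origin {
    .host = "origin.example.com";
    .port = "80";
}

sub vcl_recv {
    # Route requests for a specific customer to its own origin.
    if (req.http.host == "static.example.com") {
        set req.backend_hint = example_origin;
    }

    # Business rule at the edge: strip cookies from static assets
    # so they become cacheable.
    if (req.url ~ "\.(css|js|png|jpg)$") {
        unset req.http.Cookie;
    }
}
```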

We’re now at the point in time where we are confident that Varnish can power a large CDN. However, it’s still not a complete package where it can be a CDN all by itself. You still need some external services like geo-DNS to route clients to their closest CDN node, but thankfully these are pretty much a commodity. So with services like Dyn and Amazon’s Route 53, you can get your CDN up and running within a matter of hours. If you plan on having external users, then billing and metering capabilities are required. It’s still an open question whether Varnish Software should pursue that particular market or not.

But, as of today, we absolutely consider Varnish Cache Plus CDN-ready.

Before we dive into the cool new features you’ve rolled out over the last six months, can you share with us some of the improvements you’ve made in the basic caching platform?

Varnish Cache 4.1 is being prepared for release. The major new features are:

Varnish jails – a new security model
This adds a new privilege separation API to take better advantage of the capabilities of the different operating systems we support. As a result, Varnish can now ensure that even if an attacker were to take control of the worker process, they would not be able to extract sensitive runtime information from it (e.g. the secret used for CLI control of the master).

Backend as a vmod
It should now be possible to implement a backend (something that takes a set of headers and possibly a body, and returns a response as a set of headers and a body) as a vmod. This will enable both dynamic backends and TLS against the origin to be implemented as vmods. Finally we’ll have proper dynamic backends, which will make life easier on EC2.

Support for the PROXY protocol
Varnish Cache 4.1 has full support for the PROXY protocol. This allows tight integration between Varnish Cache and SSL proxies. Although SSL support probably won’t make it into Varnish Cache itself, Varnish Cache Plus will have full SSL support soon (launch scheduled in June). For those who want to use the open source software, it will be quite simple to integrate Varnish with an SSL proxy.
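For reference, version 1 of the PROXY protocol is a single human-readable line that the SSL proxy prepends to the TCP stream, so Varnish can learn the original client address. A minimal sketch of the wire format in Python (an illustration, not Varnish code):

```python
# PROXY protocol v1 header (text form):
#   PROXY TCP4 <client ip> <proxy ip> <client port> <proxy port>\r\n

def build_proxy_v1(client_ip, client_port, server_ip, server_port):
    """Build the PROXY v1 header line for an IPv4 connection."""
    return (f"PROXY TCP4 {client_ip} {server_ip} "
            f"{client_port} {server_port}\r\n").encode("ascii")

def parse_proxy_v1(line):
    """Parse a PROXY v1 header line; returns (client_ip, client_port)."""
    parts = line.decode("ascii").rstrip("\r\n").split(" ")
    assert parts[0] == "PROXY" and parts[1] == "TCP4"
    return parts[2], int(parts[4])

# The terminating proxy sends this line first, then the raw HTTP bytes.
header = build_proxy_v1("203.0.113.7", 51234, "192.0.2.1", 443)
```

In practice the SSL terminator writes this header once per connection, before any application data, and Varnish reads it to restore `client.ip`.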

What is Varnish High Availability?

Varnish High Availability is a system for replicating objects between caches. Replicating content between the caches gives two distinct advantages. One is that you get a performance boost because Varnish will replicate the object with a lot less effort than fetching it twice from the backend. Most Varnish installations are performance bound by their origin services and in these scenarios we’ve seen big performance increases – in some installations up to 300%. Granted, these are synthetic benchmarks.

The second advantage is resilience. If all your objects are available on all your caches, losing a cache won’t have such a detrimental effect on performance.

I’ve worked in operations for over ten years and my experience with high availability products hasn’t been stellar. A lot of the products I’ve worked with have been rather complex, and most of them have caused more downtime than I would otherwise have experienced. Complexity breeds downtime like nothing else, and hence I’ve become a sucker for simplicity.

We had this in mind when building Varnish High Availability. If it should fail, it will not bring Varnish down with it. The worst thing that can happen when using our High Availability is that replication will stop and you’ll experience a small performance impact.

What is Varnish Massive Storage Engine?

It is a storage engine for Varnish Cache Plus. The main benefit it brings is allowing Varnish to cache far more data than is possible with open source Varnish Cache. We’ve seen it handle 100TB on each server without breaking a sweat. This makes it possible to use Varnish to cache video content, both live and VOD. In addition, it makes Varnish a perfect fit for a CDN.

The reason MSE is able to handle these large data sets is mostly due to better interaction between Varnish and the underlying operating system. The way we’ve used memory maps has made the code simple, but we’ve learned that writing through memory maps isn’t always good for performance, so making sure we write directly to disk without going through the memory maps has increased our performance quite a bit.
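The difference between the two I/O styles can be sketched with a small, self-contained example. Varnish itself is written in C; this Python sketch only illustrates the contrast between writing through a memory map and writing explicitly:

```python
# Sketch (not Varnish code): two ways to persist a cache object.
# Writing through a memory map leaves write-back timing to the kernel's
# VM subsystem; an explicit pwrite() lets the application decide when
# and how much data is handed to the storage layer.
import mmap
import os
import tempfile

def write_via_mmap(path, data):
    """Simple code, but flush timing is up to the kernel's VM subsystem."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.ftruncate(fd, len(data))
    with mmap.mmap(fd, len(data)) as m:
        m[:] = data          # dirty pages, flushed whenever the VM decides
    os.close(fd)

def write_direct(path, data):
    """Explicit write: the application controls offset and size, so it
    can hand over buffers sized to match the underlying device."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.pwrite(fd, data, 0)
    os.close(fd)

tmp = tempfile.mkdtemp()
data = b"x" * 4096  # one page of object data
write_via_mmap(os.path.join(tmp, "a"), data)
write_direct(os.path.join(tmp, "b"), data)
```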

In addition, we’ve done quite a bit of research on how to deal with fragmentation. Fighting fragmentation is hard and there is limited research available on the topic of fragmentation in the cache. The main lesson we’ve learned is that it can be beneficial for a cache to discard certain objects from cache, in order to reduce fragmentation. This approach is very specific to a cache and not something a file system or a database can do. In a cache, performance is everything and sacrificing an object now and then makes perfect sense if it helps keep performance up.
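The principle can be shown with a toy allocator (purely illustrative Python, not MSE’s actual data structures): when no free extent is large enough, it sacrifices a cached object that borders free space, coalesces the freed regions, and retries.

```python
# Toy sketch of eviction-to-defragment: a cache can discard an object
# adjacent to free space to create a larger contiguous extent, which a
# file system or database could never do.

class ToyArena:
    def __init__(self, size):
        # Each segment is (offset, length, name); name is None when free.
        self.segments = [(0, size, None)]

    def _coalesce(self):
        """Merge adjacent free segments into one extent."""
        merged = []
        for seg in self.segments:
            if merged and merged[-1][2] is None and seg[2] is None:
                off, length, _ = merged.pop()
                merged.append((off, length + seg[1], None))
            else:
                merged.append(seg)
        self.segments = merged

    def allocate(self, name, length):
        """Return the offset of a new allocation, evicting if needed."""
        for i, (off, seglen, owner) in enumerate(self.segments):
            if owner is None and seglen >= length:
                tail = [(off + length, seglen - length, None)] if seglen > length else []
                self.segments[i:i + 1] = [(off, length, name)] + tail
                return off
        # No free extent big enough: evict one object that borders free
        # space, coalesce, and retry.
        for i, (off, seglen, owner) in enumerate(self.segments):
            if owner is not None and (
                (i > 0 and self.segments[i - 1][2] is None) or
                (i + 1 < len(self.segments) and self.segments[i + 1][2] is None)):
                self.segments[i] = (off, seglen, None)
                self._coalesce()
                return self.allocate(name, length)
        return None  # arena too small even when empty
```

The trade-off matches the reasoning above: losing one cached object now and then is cheap compared to the performance cost of a fragmented store.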

What is Varnish Tuner?

Tuning Linux for high-performance HTTP workloads is considered by many to be somewhat of a black art. Before launching this product, we spent a lot of time maintaining documentation on how to tune the kernel. It turns out that much of this documentation could be expressed very well as a computer program, so it made perfect sense for us to take our documentation and rework it into a program that runs on a Varnish instance and gives advice on how to tune that particular instance.
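The idea of turning tuning documentation into a program can be sketched as a table of checks. The sysctl names below are real Linux parameters, but the thresholds and advice strings are invented for the example and are not Varnish Tuner’s actual rules:

```python
# Sketch: tuning documentation encoded as (parameter, predicate, advice)
# rules, evaluated against the values found on one particular instance.

ADVICE = [
    ("net.core.somaxconn",
     lambda v: v < 1024,
     "raise net.core.somaxconn for busy listen sockets"),
    ("net.ipv4.tcp_tw_reuse",
     lambda v: v == 0,
     "consider enabling tcp_tw_reuse to recycle TIME_WAIT sockets"),
]

def tune(current):
    """Return advice for a dict of {sysctl name: current value}."""
    return [msg for name, is_bad, msg in ADVICE
            if name in current and is_bad(current[name])]

# On a real system the values would be read from /proc/sys/...
advice = tune({"net.core.somaxconn": 128, "net.ipv4.tcp_tw_reuse": 1})
```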

I’ve put up a short demo on YouTube showcasing this.

What is one of the most complex implementations the Varnish engineering team has been involved in and what problems were solved?

While working on the Varnish Massive Storage Engine, we ran into many challenges and complexities. In order to increase the write capacity to storage, we first had to build a thorough understanding of the inner workings of the host OS’ virtual memory (VM) implementation and its behavior. From this we learned that if the application aligns memory optimally for the VM, and hands off only buffers sized to match the underlying storage device, many of the slow paths of the VM subsystem can be avoided.

With this knowledge in mind, the way Varnish handles itself when fetching content had to be reworked. Varnish originally used a direct approach to memory writes through a memory map. To make use of our VM tricks, this needed to be changed into a double-buffering scheme. But since the storage engines in Varnish are pluggable, and the traditional storage engines would not benefit from the buffering scheme, the internal Varnish APIs against the storage engines had to be completely reworked in a way that would not degrade the performance of the traditional approach, while still allowing the new approach to run at its full potential. Adding to the complexity, the exception points shift with the double-buffering scheme, requiring new error-handling paths.
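The double-buffering idea can be sketched like this (illustrative Python, not the real storage API): body chunks of arbitrary size from the fetch path accumulate in a RAM buffer, and only full, block-aligned buffers are handed to the storage engine, with the unaligned tail flushed once the body is complete.

```python
# Sketch of a double-buffering fetch path: accumulate, then hand off
# only block-aligned buffers so the storage layer stays on its fast path.

BLOCK = 4096  # assumed storage-device block size for the example

class DoubleBufferWriter:
    def __init__(self, storage_write):
        self.storage_write = storage_write  # callback: (offset, bytes)
        self.buf = bytearray()
        self.offset = 0

    def append(self, chunk):
        """Called from the fetch path with arbitrarily sized chunks."""
        self.buf += chunk
        nblocks = len(self.buf) // BLOCK
        if nblocks:
            n = nblocks * BLOCK
            # Hand off only full, aligned blocks.
            self.storage_write(self.offset, bytes(self.buf[:n]))
            self.offset += n
            del self.buf[:n]

    def finish(self):
        """End of body: flush the remaining (unaligned) tail."""
        if self.buf:
            self.storage_write(self.offset, bytes(self.buf))
            self.buf.clear()
```

The extra error handling the interview mentions shows up here too: a failure can now occur either while buffering or while flushing, two distinct points instead of one.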

The size requirements of storage used with MSE added a lot of complexity to the data structures for the allocations and the free store. To be efficient we borrowed ideas from file system implementations, such as extent maps, effectively implementing a new file system internal to Varnish. It was especially important to create something that will lend itself to transaction boundaries when we add persistence.

For the defragmentation/invalidation scheme, we faced a major challenge in creating a new way of allowing a subsystem (MSE in this case) to get hold of objects safely. Varnish’s ultra-fast content lookup and delivery path is achieved through highly tuned call paths with a complex locking system, and coming at this locking system from “the other side” safely was not an easy task, especially while avoiding any additional locking and delays in the fast delivery path.

Do you have a Professional Services team?

We don’t have a professional services team. We rely on our network of integration partners to do the heavy lifting on installation projects. If we were to offer those services, we would quickly find ourselves in a traditional channel conflict scenario.

Currently, the most interesting project we’re working on is a major CDN deployment. At this point, we’re not able to name any names but it’s a well known company. Until now our CDN deployments have been private CDNs, and taking the leap into a fully fledged public CDN is very exciting for us.
