Cloudflare’s Key Ingredient for Serverless – Chrome V8 JavaScript Engine

November 19, 2018

What is V8 and what does it do?

V8, also known as Chrome V8, is an open-source, high-performance JavaScript engine developed for the Google Chrome and Chromium web browsers. Its development was led by Lars Bak for The Chromium Project, working from a farm in Bak’s native Denmark. It was named V8 as a playful reference to the type of engine you find in a muscle car. Because it is open source, V8 has become a key technology in other contexts as well, including Node.js, Electron, NativeScript, Couchbase and MongoDB. It was first released in 2008, at the same time as Chrome. It is written in C++ and can be embedded into any C++ application, where the embedder’s own C++ functions can be exposed to add new features to JavaScript, but it can also run standalone.

What is a JavaScript engine? It is a program that converts JavaScript code into machine code, the lower-level code that microprocessors can understand. Different processors use different instruction sets, and code expressed in a processor’s instruction set is called machine code. Code written on our computers is compiled (or converted) into machine code before it runs. High-level languages such as JavaScript are abstracted from machine language. By comparison, C++ is much closer to the hardware and is typically faster than high-level languages like JavaScript. It also gives the programmer more direct access to system resources, such as files and folders on the hard drive.

The Chrome V8 engine implements WebAssembly and ECMAScript and runs on macOS 10.12+, Windows 7+ and Linux systems that use x64, IA-32, ARM, or MIPS processors.

What are V8 Isolates?

A V8 Isolate represents an isolated instance of the V8 engine. Cloudflare describes Isolates as “lightweight contexts that group together variables with the code that mutates them”. A single process can run large numbers of Isolates, switching between them seamlessly. Isolates make it possible to run untrusted code from multiple customers within a single operating system process. Each Isolate has its own separate state. Isolates are designed to start up extremely quickly and to prevent one Isolate from accessing the memory of another.
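The idea of separate state inside one process can be loosely illustrated in Node.js, which embeds V8. The sketch below uses Node’s built-in vm module to run the same snippet in two V8 contexts within a single process. This is an analogy only: vm contexts are weaker than full Isolates and are not a security boundary, and the names tenantA and tenantB are made up for illustration.

```javascript
const vm = require('vm');

// Two separate global objects, each turned into its own V8 context.
const tenantA = vm.createContext({ counter: 0 });
const tenantB = vm.createContext({ counter: 100 });

// The same snippet, compiled once and run against each context.
const script = new vm.Script('counter += 1; counter;');

const a1 = script.runInContext(tenantA); // 1
const a2 = script.runInContext(tenantA); // 2
const b1 = script.runInContext(tenantB); // 101

console.log(a1, a2, b1);
console.log(tenantA.counter, tenantB.counter); // 2 101
```

Each context only ever sees its own counter, even though both runs share one operating system process and one compiled script.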

Updates over the Last Decade

V8 has been continually improved and updated over the last decade of its life. Its performance on benchmarks has improved roughly fourfold over that period.

Some of the improvements include:

“Initially, V8 supported only two instruction sets. In the last 10 years the list of supported platforms reached eight: ia32, x64, ARM, ARM64, 32- and 64-bit MIPS, 64-bit PPC, and S390. V8’s build system migrated from SCons to GYP to GN. The project moved from Denmark to Germany, and now has engineers all over the world, including in London, Mountain View, and San Francisco, with contributors outside of Google from many more places. We’ve transformed our entire JavaScript compilation pipeline from unnamed components to Full-codegen (a baseline compiler) and Crankshaft (a feedback-driven optimizing compiler) to Ignition (an interpreter) and TurboFan (a better feedback-driven optimizing compiler). V8 went from being ‘just’ a JavaScript engine to also supporting WebAssembly. The JavaScript language itself evolved from ECMAScript 3 to ES2018; the latest V8 even implements post-ES2018 features.”

The Latest Version: V8 7.1

V8 7.1 entered beta earlier this month.

The V8 7.1 beta brings memory and performance improvements, along with enhancements for JavaScript and the WebAssembly binary format.

In terms of memory, the interpreter’s bytecode handlers are now embedded into the binary, which saves around 200KB per isolate. In terms of performance, the TurboFan compiler’s escape analysis, which performs scalar replacement for objects that are local to an optimization unit, has been enhanced to also handle local function contexts for higher-order functions, when variables from the surrounding context escape to a local closure.
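As a rough illustration of the code shape this optimization targets, consider a higher-order function whose callback closes over a variable from the enclosing function. In the sketch below, x lives in the function context of mapAdd and escapes into the arrow function passed to map. The function name is illustrative, and the optimization is internal to TurboFan, so it changes how fast the code runs, not what it returns.

```javascript
// x is a context variable of mapAdd that escapes into the closure.
function mapAdd(values, x) {
  return values.map((y) => y + x);
}

const result = mapAdd([1, 2, 3], 10);
console.log(result); // [ 11, 12, 13 ]
```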

Some key WebAssembly features are now enabled, including:

postMessage support for WebAssembly modules: WebAssembly.Module objects can now be sent to web workers with postMessage. This is scoped to web workers and is not yet extended to cross-process scenarios.
An early preview of WebAssembly Threads, enabled by a feature flag: chrome://flags/#enable-webassembly-threads

The last major update relates to JavaScript. The newly available Intl.RelativeTimeFormat API enables localized formatting of relative times, such as “yesterday”, without sacrificing performance. V8 7.1 also supports the globalThis proposal, which offers a universal mechanism for accessing the global object, even in modules or strict functions, irrespective of the underlying platform.
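A short sketch of both features follows. It assumes an engine that ships them (V8 7.1 or later, for example a recent Node.js or Chrome) with English locale data available; the variable and function names are illustrative.

```javascript
// numeric: 'auto' allows phrases such as "yesterday" instead of "1 day ago".
const rtf = new Intl.RelativeTimeFormat('en', { numeric: 'auto' });

const yesterday = rtf.format(-1, 'day');
const tomorrow = rtf.format(1, 'day');
const inThreeWeeks = rtf.format(3, 'week');

console.log(yesterday);    // yesterday
console.log(tomorrow);     // tomorrow
console.log(inThreeWeeks); // in 3 weeks

function getGlobal() {
  'use strict';
  // `this` is undefined in a strict function, but globalThis still resolves.
  return globalThis;
}

const sameIntl = getGlobal().Intl === Intl;
console.log(sameIntl); // true
```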

How does Cloudflare use V8?

Cloudflare began to develop Workers when faced with a problem. They wanted their customers to be able to write code and build applications themselves, as Cloudflare was limited in the number of features and options it could build internally. The goal was to find a solution that allowed customers to write code that runs on Cloudflare’s servers deployed worldwide (then 100 locations, now over 150). It had to run extremely quickly: Cloudflare processes millions of requests per second and sits in front of over ten million sites. They previously used Lua, but as it didn’t run in a sandbox, customers weren’t able to run their own code independently. Traditional virtualization and container technologies such as Kubernetes would have been too expensive and too resource intensive. Eventually, Cloudflare settled on V8 Isolates, which are built to start very quickly. A single process can run hundreds or thousands of Isolates, switching between them seamlessly. This means that code from many different customers can run in the same operating system process. Isolates consume far less memory than other similar systems, and they don’t use a virtual machine or a container, meaning that you are running much closer to the metal than with most other forms of cloud computing.
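For a sense of what such customer code can look like, here is a minimal sketch of a Worker script, assuming the Service Worker style fetch event that Workers scripts register for. The greetingFor helper and the name query parameter are hypothetical and exist only to give the handler something to do.

```javascript
// Pure helper, so the logic can also be exercised outside the Workers runtime.
function greetingFor(urlString) {
  const url = new URL(urlString);
  const name = url.searchParams.get('name') || 'world';
  return `Hello, ${name}!`;
}

// Builds the response for one incoming request.
async function handleRequest(request) {
  return new Response(greetingFor(request.url), {
    headers: { 'content-type': 'text/plain' },
  });
}

// In the Workers runtime, a global addEventListener receives 'fetch' events.
// The guard lets the file load in environments that lack that global.
if (typeof addEventListener === 'function') {
  addEventListener('fetch', (event) => {
    event.respondWith(handleRequest(event.request));
  });
}

console.log(greetingFor('https://example.com/?name=V8')); // Hello, V8!
```

Nothing in the script starts a server or a process of its own; it only describes how to answer a request, which is what lets the platform run many such scripts side by side.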

The Difference with Traditional Serverless

In a blog post on Cloudflare’s use of V8, Zack Bloom, Director of Product for Product Strategy, says he believes “it’s possible with this model to get close to the economics of running code on bare metal, but in an entirely Serverless environment”. Bloom says this marks not just “iterative improvement but an actual paradigm shift”.

Traditional serverless platforms like Lambda work by spinning up a containerized process for your code. Rather than running the code in a lightweight environment, they autoscale containerized processes, which creates cold starts. A cold start is what occurs when a new copy of your code has to be started on a machine. In Lambda, spinning up a new containerized process can take between 500 milliseconds and 10 seconds. Requests that take up to ten seconds can lead to a bad user experience. Worse, because a Lambda can only process one request at a time, each additional concurrent request requires a new Lambda to be cold-started, so the user experience can worsen as that long wait is repeated over and over. And if a Lambda doesn’t receive a request soon enough, it is shut down and the process begins again. Whenever new code is deployed, every Lambda has to be redeployed as well and the whole process happens again.

By contrast, Workers doesn’t need to start up a new process each time. Isolates start in 5 milliseconds, which is almost imperceptible. Isolates scale and deploy just as quickly, eliminating the issue of cold starts and laggy requests.

Context Switches

All operating systems allow you to run multiple processes at once. The operating system switches between the different processes that want to run code at a given time. It does this via a ‘context switch’, moving the memory needed for one process out and the memory required by the next one in. This can take as much as 100 microseconds. Multiplied by all the Node, Go or Python processes running on an average Lambda server, this creates a heavy overhead, which means some of the CPU’s power is spent switching between customers’ code rather than just running it.

The Isolate-based system, by contrast, runs all of the code in one process and relies on its own mechanisms to maintain safe memory access. As a result, there are no expensive context switches, and the machine can spend most of its time running your code.

Memory and Cost in Multi-tenant Systems

A basic Node Lambda running no real code consumes 35MB of memory (Node was built to run on a single server, not in a multi-tenant environment with strict memory needs); sharing the runtime between Isolates, however, means only around 3MB of memory is consumed. Memory is typically the highest cost of running customer code (even more so than the CPU), so lowering it this significantly can dramatically change the economics. V8 was built to be multi-tenant: it was designed to run the code from all the tabs in your browser in isolated environments within a single process.

Lambdas are billed based on how long they run. Billing is rounded up to the nearest 100 milliseconds, which can lead customers to overpay, particularly because you also pay for the time spent waiting for an external request to complete, which in a multi-tenant system at scale can be significant. As Isolates have a far smaller memory footprint, Cloudflare bills its customers only for the time when code is actually executing, not for time spent waiting. Cloudflare claims that Workers can be 3x cheaper per CPU cycle. A Worker offering 50 milliseconds of CPU costs $0.50 per million requests. According to Cloudflare, the equivalent for Lambda would be $1.84 per million.
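The arithmetic behind these figures can be sketched in a few lines. The snippet uses only the numbers quoted above (the 100 millisecond rounding increment and the two per-million prices); the function and constant names are illustrative, and it is not a pricing calculator for either service.

```javascript
const LAMBDA_INCREMENT_MS = 100; // rounding increment quoted above

// Duration that gets billed once the run time is rounded up.
function billedLambdaMs(actualMs) {
  return Math.ceil(actualMs / LAMBDA_INCREMENT_MS) * LAMBDA_INCREMENT_MS;
}

console.log(billedLambdaMs(20));  // 100
console.log(billedLambdaMs(101)); // 200

// Prices per million requests, as quoted by Cloudflare.
const workersPerMillion = 0.5;
const lambdaPerMillion = 1.84;

const priceRatio = lambdaPerMillion / workersPerMillion;
console.log(priceRatio.toFixed(2)); // 3.68
```

A 101 millisecond run is billed as 200 milliseconds, and the quoted prices work out to a ratio of about 3.7, which is in line with the 3x claim.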

Security Issues

Running code from multiple customers within the same process requires careful attention to security. For Cloudflare, building its own isolation layer would have been far too expensive. In Bloom’s words, “The only reason this was possible at all is the open-source nature of V8, and its standing as perhaps the most well security tested piece of software on earth. We also have a few layers of security built on our end, including various protections against timing attacks, but V8 is the real wonder that makes this compute model possible”.

The Limitations of V8

One of the challenges Cloudflare has found with V8 is that an Isolate-based system is unable to run arbitrary compiled code. Process-level isolation, by contrast, means a Lambda can spin up any binary it needs. In an Isolate, you need to write your code either in JavaScript or in a language that targets WebAssembly, such as Go or Rust. Also, if you are unable to recompile your processes, you can’t run them in an Isolate. Bloom says, “This might mean Isolate-based Serverless is only for newer, more modern, applications in the immediate future. It also might mean legacy applications get only their most latency-sensitive components moved into an Isolate initially.” However, the community may find ways to solve this and transpile existing applications into WebAssembly.
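To make the WebAssembly route concrete, the sketch below loads a tiny hand-assembled WebAssembly module from JavaScript using the standard WebAssembly API. The bytes encode a single exported function, add, that sums two 32-bit integers. In practice such a module would be produced by compiling Rust, Go or another language rather than written by hand, and how a module is shipped to a given platform is outside the scope of this example.

```javascript
// A minimal WebAssembly binary exporting add(i32, i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compile and instantiate; fine for a module this small.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule);

const sum = wasmInstance.exports.add(2, 3);
console.log(sum); // 5
```

The compiled code runs inside the same sandbox as the surrounding JavaScript, which is why a language has to target WebAssembly before it can run in an Isolate.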