Every system has its weak point. Remember when Luke Skywalker bulls-eyed a small thermal exhaust port with proton torpedoes, causing a chain reaction that blew up the Death Star? Okay, it’s an extreme (and fictitious) example. However, it reminds us that we must be vigilant about protecting even the small parts of our IT infrastructure, including the cache.

From 25 May 2018, all companies doing business with Europe need to comply with the General Data Protection Regulation (GDPR). Those that don’t run the risk of being fined up to four percent of annual global turnover or €20 million, whichever is higher. Protecting people’s personal data, handling it with respect and care, is central to the new legislation. Companies are also strongly encouraged to encrypt personal data to protect it from falling into the wrong hands, something that is becoming all too commonplace.

According to Gemalto’s Breach Level Index, more than ten million records were compromised or exposed every day during the first six months of 2017 alone. The Index also shows that less than one percent of the stolen, lost or compromised data was encrypted. This disturbing fact explains why GDPR advocates encryption, which renders leaked data useless.

Even when organisations tackle security and apply encryption, many routinely fail to protect one small “thermal port”: the cache. This puts them in grave danger because striking a cache actually turns out to be far easier than attacking the Death Star.

Caching is about creating a temporary storage area (the cache) to store information so that a website or application does not have to keep re-accessing it from the company’s backend server farm.
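As a rough illustration of that idea, here is a minimal sketch in Python. The class name, the `fetch_from_backend` callback and the `backend_calls` counter are hypothetical names chosen for this example; a real web cache also handles expiry, eviction and memory limits, which are left out here.

```python
from typing import Callable, Dict


class SimpleCache:
    """Toy cache: answer from local storage, go to the backend only on a miss."""

    def __init__(self, fetch_from_backend: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_backend      # hypothetical backend callback
        self._store: Dict[str, bytes] = {}    # the temporary storage area
        self.backend_calls = 0                # counts how often the backend is hit

    def get(self, url: str) -> bytes:
        if url not in self._store:            # cache miss: fetch once and keep a copy
            self.backend_calls += 1
            self._store[url] = self._fetch(url)
        return self._store[url]               # cache hit: no backend round trip


# Stand-in for the backend server farm (assumption for the example).
cache = SimpleCache(lambda url: f"content of {url}".encode())
first = cache.get("/index.html")
second = cache.get("/index.html")
```

The second request for the same page is served from the cache, so the backend is contacted only once.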

Caches are designed to store large amounts of data in a tightly packed space with easy access. They are designed for performance and scalability. Some web caches follow defensive coding practices, use a secure design and maintain the integrity of the cache. But in general, security has, for some time, been excluded from cache design principles.

For that reason, cache leaks have become a known and common vulnerability. Take for instance the Cloudbleed and Meltdown security breaches. In the case of Cloudbleed, a bug in a content delivery network (CDN) provider’s cache caused it to leak data from memory, including private information like messages from major dating sites. In Meltdown, the cache isolation between processes was violated. This allowed a rogue process to read the cache’s entire memory without requiring authorisation.

Preventing cache leaks with encryption

Applying complete encryption to every cache object is one way to prevent these leaks, and there are technical solutions that offer that type of encryption. They assign each cache object its own unique encryption key and encrypt it using the Advanced Encryption Standard (AES). Also, each request – like a visitor who wants to open a piece of content – is assigned one key.

The key is based on the unique fingerprint of that request, ensuring that the visitor only has access to the very specific piece of content held in the cache.
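To make the idea of a fingerprint-based key more concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a description of how any particular product works: the fingerprint fields (method, host, URL), the `MASTER_SECRET` value and both function names are hypothetical, and HMAC-SHA256 is just one reasonable way to derive a 32-byte key of the size AES-256 expects. The AES encryption step itself is omitted because it needs a third-party library.

```python
import hashlib
import hmac

# Hypothetical secret for the example; it must be kept outside the cache storage
# so that a leaked cache object does not also leak the means to decrypt it.
MASTER_SECRET = b"example-secret-kept-outside-the-cache"


def request_fingerprint(method: str, host: str, url: str) -> bytes:
    """Build a canonical fingerprint from parts of the request (assumed fields)."""
    return "\n".join([method.upper(), host.lower(), url]).encode("utf-8")


def derive_object_key(fingerprint: bytes, secret: bytes = MASTER_SECRET) -> bytes:
    """Derive a 32-byte per-object key from the fingerprint with HMAC-SHA256."""
    return hmac.new(secret, fingerprint, hashlib.sha256).digest()


key_a = derive_object_key(request_fingerprint("GET", "example.com", "/profile/alice"))
key_b = derive_object_key(request_fingerprint("GET", "example.com", "/profile/bob"))
```

The same request always yields the same key, so the cache can recompute it on demand instead of storing it next to the object, while a request for a different piece of content yields a different key.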

Even if data objects held in cache were to leak (as in the Cloudbleed case), they wouldn’t be useful. Objects would appear as undecipherable garbage, each requiring its unique key to decrypt it.

If any part of a cache were to become visible (as in the Meltdown case), encryption would prevent it from being read. Each cache object is uniquely encrypted using a key derived from data not stored with the object. Basically, exploiting a leaked cache would require breaking the AES encryption for each and every object in cache. That is a very difficult task.

TLS support

In addition to the “complete encryption” just described, a cache should also support partial encryption such as Transport Layer Security (TLS). TLS is a cryptographic protocol for communications security used, for example, by websites to protect communications between servers and web browsers. It can be used to protect both the frontend and the backend.

On the frontend, the client server on the edge intercepts requests from, for example, website visitors before they reach the company server. In this scenario, TLS support enables the encryption of traffic between the edge and the cache server.

On the backend, the cache server fetches missing content from the backend server farm. TLS support enables this content to be fetched over an encrypted connection, which particularly benefits organisations that run fully encrypted data centres or have web servers that reside in different locations to their cache servers. It’s particularly important to encrypt the data exchange between caching nodes and the backend servers when caches are distributed around the world to serve different regions.
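As a sketch of what an encrypted backend fetch involves on the cache side, the Python snippet below builds a client TLS configuration with the standard `ssl` module. The function name `backend_tls_context` is hypothetical, and requiring TLS 1.2 or newer is an assumption made for the example rather than a GDPR requirement; a production cache server would normally be configured through its own settings rather than code like this.

```python
import ssl


def backend_tls_context() -> ssl.SSLContext:
    """Client-side TLS settings a cache node could use when fetching from a backend."""
    # The default context verifies the backend's certificate and hostname.
    context = ssl.create_default_context()
    # Assumption for this example: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = backend_tls_context()
```

Such a context could then be passed to a standard-library call such as `urllib.request.urlopen(url, context=context)`, so that a fetch fails outright if the backend cannot prove its identity instead of silently falling back to an unprotected connection.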

Minimising GDPR-related vulnerabilities

GDPR mandates that when security breaches occur that affect the integrity, confidentiality or security of personal data, organisations must notify their local data protection authority. However, it’s only necessary to report breaches of personal data when they could result in identity theft, fraud or other types of damage. Organisations that have implemented cache encryption won’t have to worry about this since stolen data will be useless to third parties.

For those of you doing your homework and preparing for GDPR, remember not to let the cache become your weak point. For the rest, may the force be with you (and your personal data)!

 By Arianna Aondio, Technical Evangelist at Varnish Software