Wednesday, December 2, 2015

Analysis of encrypted databases with CryptDB


As part of a bachelor thesis, we took a look at the latest version of CryptDB, compared its performance with a normal MySQL installation, and tested how well different applications can be adapted to it. In this blog post we would like to share our insights with you.
For further results and technical specifications please refer directly to the thesis 'Analysis of Encrypted Databases with CryptDB' that can be found at http://www.nds.rub.de/media/ei/arbeiten/2015/10/26/thesis.pdf.




Introduction to CryptDB

CryptDB was developed by a team around Raluca Ada Popa at the Massachusetts Institute of Technology and published in 2011. It works like a proxy and enables SQL-aware encryption.

CryptDB acting as a proxy and translating queries between the Application and DBMS.
This means that the application (web or desktop) sends its normal SQL queries to the CryptDB proxy server, which translates them depending on the command's context. Here is a short example of such a translation:

Original query:
ALTER TABLE `test_with_cryptdb`.`test_table_1`
ADD COLUMN new_col VARCHAR(5);

Query with CryptDB:
ALTER TABLE `test_with_cryptdb`.`table_LEVBAQKKCK`
ADD COLUMN QMOKLUOCLIoEq VARBINARY(48),
ADD COLUMN JRKIXUPWBMoOrder BIGINT unsigned,
ADD COLUMN cdb_saltXXLFJXOBBX BIGINT(8) unsigned
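
The shape of this translation can be mimicked with a short Python sketch. It is not CryptDB's code: the function names `random_name` and `rewrite_add_column`, the returned mapping and the fixed column types are assumptions made for illustration. Only the naming pattern (ten random uppercase letters, the `oEq` and `oOrder` suffixes, the `cdb_salt` prefix) and the column types are copied from the example above.

```python
import random
import string
from typing import Dict, Tuple


def random_name(rng: random.Random, length: int = 10) -> str:
    """Return an anonymized identifier of random uppercase letters."""
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(length))


def rewrite_add_column(
    database: str, anon_table: str, logical_column: str, rng: random.Random
) -> Tuple[str, Dict[str, Dict[str, str]]]:
    """Expand one logical column into several anonymized physical columns.

    Hypothetical helper: the column types are hard-coded from the example
    in the post and are not derived from the logical column's type.
    """
    physical = {
        "eq": random_name(rng) + "oEq",          # column used for equality checks
        "order": random_name(rng) + "oOrder",    # column used for order comparisons
        "salt": "cdb_salt" + random_name(rng),   # per-row salt column
    }
    statement = (
        f"ALTER TABLE `{database}`.`{anon_table}`\n"
        f"ADD COLUMN {physical['eq']} VARBINARY(48),\n"
        f"ADD COLUMN {physical['order']} BIGINT unsigned,\n"
        f"ADD COLUMN {physical['salt']} BIGINT(8) unsigned"
    )
    # The proxy has to remember which physical columns belong to which
    # logical column, otherwise later queries could not be translated.
    return statement, {logical_column: physical}


statement, mapping = rewrite_add_column(
    "test_with_cryptdb", "table_LEVBAQKKCK", "new_col", random.Random()
)
print(statement)
print(mapping)
```

The point of the sketch is that the application only ever sees `new_col`, while the database server only ever sees the randomly named columns.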

CryptDB uses a technique called onions and layers, in which different computational capabilities (like equality or order comparisons) are made possible by encrypting the same data with different algorithms that reveal different properties. To follow the paradigm 'reveal only as much as necessary', CryptDB re-encrypts data to a more revealing layer of encryption on the fly, and only when a query requires it.
Under ideal conditions this means the following:
  • The application is not aware of the encryption ...
  • ... and thus requires only minimal changes to the connection information
  • The server holds only encrypted data...
  • ... but is still able to perform most of the SQL operators, thus putting the computational load where it was intended to be.
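
The onion idea can be sketched as a toy model in Python. This is not CryptDB's implementation and must not be used as real cryptography: the layer names RND (randomized) and DET (deterministic) follow the CryptDB paper, but the helper names, the demo keys and the HMAC/XOR construction are assumptions chosen so that the example runs with the standard library alone.

```python
import hashlib
import hmac
import os
from typing import Tuple

DET_KEY = b"demo-det-key"  # hypothetical keys, for illustration only
RND_KEY = b"demo-rnd-key"


def det_layer(plaintext: bytes) -> bytes:
    """Stand-in for a deterministic layer: equal inputs give equal tokens.

    A real DET layer would be a decryptable cipher; an HMAC is used here
    only because it shows the equality-revealing property.
    """
    return hmac.new(DET_KEY, plaintext, hashlib.sha256).digest()


def _keystream(salt: bytes, length: int) -> bytes:
    """Derive a salt-dependent keystream (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = salt + counter.to_bytes(4, "big")
        out += hmac.new(RND_KEY, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def rnd_wrap(inner: bytes) -> Tuple[bytes, bytes]:
    """Randomized outer layer: a fresh salt hides even equality."""
    salt = os.urandom(8)
    stream = _keystream(salt, len(inner))
    return salt, bytes(a ^ b for a, b in zip(inner, stream))


def rnd_peel(salt: bytes, outer: bytes) -> bytes:
    """Strip the outer layer, exposing the deterministic layer beneath."""
    stream = _keystream(salt, len(outer))
    return bytes(a ^ b for a, b in zip(outer, stream))


# Three rows stored under the outer (RND) layer; two hold the same value.
rows = [rnd_wrap(det_layer(value)) for value in (b"alice", b"alice", b"bob")]

# At the RND layer the stored bytes of the two equal rows differ.
print("equal at RND layer:", rows[0][1] == rows[1][1])

# An equality query forces a peel down to the DET layer, where equal
# values become visible as equal tokens while the plaintext stays hidden.
peeled = [rnd_peel(salt, outer) for salt, outer in rows]
print("equal at DET layer:", peeled[0] == peeled[1])
```

The design choice mirrored here is that the more revealing layer is only exposed once a query actually needs it; until then the data stays under the randomized layer.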
We mentioned that the server holds only encrypted data, which is good when a data breach occurs. Now you might object that this only shifts the risk from the database server to the CryptDB server, and you would be completely right. The real problem is that, even though the mathematical concepts exist, as of now we have no way to perform fully homomorphic encryption in a practical amount of time. This means that currently there will always be a weak link in the chain.
With CryptDB we can keep a small proxy inside the trusted network environment and leave the actual database in a more exposed position.

This is especially interesting when we talk about Database-as-a-Service (DBaaS), which means putting the database in the cloud. The whole database then sits in a potentially untrusted environment, where unknown database administrators need full access to the data records to do their job. With CryptDB we can place the CryptDB proxy inside our own 'trusted' network, while exporting only the encrypted entries to the untrusted environment.
CryptDB was released by MIT in 2011 and was subject to further work until 2014, when development by the working group around Popa stopped. The source code has been released via git under GPLv3 and can be found at https://css.csail.mit.edu/cryptdb/

Analysis of CryptDB


We wanted to see whether CryptDB in its current state is usable for small to mid-sized database installations. We looked at:
  • Performance: We ran a few benchmarks against a CryptDB setup and a normal MySQL installation to measure the performance overhead.
  • Adapting applications: We tried to adapt a few popular open source web applications, including WordPress, Joomla and Drupal, to see if we could use them with CryptDB.

Performance

For the performance analysis we developed two general scenarios representing a small (100 rows) and a mid-sized (100,000 rows) database. We then used SysBench to benchmark both scenarios and repeated the runs several times with different thread configurations. The results showed significant slowdowns with the CryptDB setup (as seen in Fig. 1), as well as some memory management issues when operating on many rows. We also discovered some issues related to multithreading on small databases.
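
One simple way to express such a slowdown is to compare the throughput of both setups at the same thread count. The following Python sketch shows two common metrics. The function names and the sample numbers are hypothetical: the numbers are placeholders and not measurements from the thesis, and the thesis may define overhead differently.

```python
def slowdown_factor(baseline_tps: float, cryptdb_tps: float) -> float:
    """How many times slower the CryptDB setup is than plain MySQL."""
    if baseline_tps <= 0 or cryptdb_tps <= 0:
        raise ValueError("throughput values must be positive")
    return baseline_tps / cryptdb_tps


def overhead_percent(baseline_tps: float, cryptdb_tps: float) -> float:
    """Throughput lost behind the proxy, as a percentage of the baseline."""
    if baseline_tps <= 0 or cryptdb_tps <= 0:
        raise ValueError("throughput values must be positive")
    return (baseline_tps - cryptdb_tps) / baseline_tps * 100.0


# Placeholder transactions per second, keyed by thread count:
# (plain MySQL, CryptDB). These are NOT results from the thesis.
placeholder_runs = {1: (200.0, 50.0), 4: (400.0, 80.0)}

for threads, (mysql_tps, cryptdb_tps) in placeholder_runs.items():
    factor = slowdown_factor(mysql_tps, cryptdb_tps)
    lost = overhead_percent(mysql_tps, cryptdb_tps)
    print(f"{threads} thread(s): {factor:.1f}x slower, {lost:.0f}% throughput lost")
```

Comparing per thread count matters here because the multithreading issues mentioned above could otherwise be hidden in an average over all runs.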


Adapting applications

When looking at the applications, we wanted to see whether widely used web applications such as WordPress or Joomla install out of the box or whether there are any surprises. We therefore installed each application with CryptDB enabled and compared the result against an installation without CryptDB. We did indeed notice some pitfalls that have to be circumvented to guarantee a satisfying admin and user experience. Among these problems were length-restricted key assignments [1], incompatible engine types [2] and mysqli driver problems [3] in PHP. In the thesis we describe ways to fix these problems or, where a fix is not possible, how to work around them.

Summary

All in all, we see great potential in CryptDB. Unfortunately, it has seen little development in the last two years. CryptDB in its publicly available version is a research prototype and is not intended for production deployment. However, there are several major companies, including Google and SAP, that are actively working on database structures that feature a similar approach. We covered them in the related work section of the thesis, as well as some security-related papers, the latest of which stirred some controversy when it was released in September 2015.

Acknowledgements

This post was written by Michael Skiba and reviewed by Christian Mainka and Vladislav Mladenov.
Michael Skiba's Bachelor Thesis can be found at http://www.nds.rub.de/media/ei/arbeiten/2015/10/26/thesis.pdf.

Technical Details

Issue 1: Length-restricted key assignments

CREATE TABLE IF NOT EXISTS `wp_usermeta` (
`meta_key` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`umeta_id`),
KEY `meta_key` (`meta_key`(191))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci AUTO_INCREMENT=16 ;

Issue 2: Incompatible engine types

CREATE TABLE IF NOT EXISTS `fdnag_finder_tokens` (
`term` varchar(75) NOT NULL,
KEY `idx_word` (`term`)
) ENGINE=MEMORY DEFAULT CHARSET=utf8;

Issue 3: MySQLi driver problems

=================================
QUERY: SELECT DATABASE()
unexpected packet type 22
=================================
QUERY:
unexpected packet type 23
=================================
