Cipher benchmark for dm-crypt / LUKS
Do you have a netbook, laptop, desktop or a server which uses dm-crypt to encrypt data on your disks? If so, you will probably find that raw hard disk performance is better than encrypted disk performance. This is especially noticeable on slow machines (e.g. netbooks), but it also affects high-performance servers, because of the current dm-crypt design.
What cipher in the Linux kernel provides you with the best performance?
Currently, dm-crypt in the Linux kernel suffers from at least one performance flaw: it is not SMP-aware. This means that even if you have several CPUs in your machine, only one processor will be used to encrypt/decrypt data (ed. 31-May-2010: there was a patch posted today to make dm-crypt scale to multiple CPUs).
With moderately fast disks and RAID arrays in a server, you will hit a limit where one processor is not able to encrypt/decrypt data fast enough. With netbooks and slow CPUs, probably paired with fast SSDs, you will hit this limit even earlier.
Here is a list of different ciphers and the throughput they delivered when reading linearly from a given device.
The tests were made on a Celeron 2.93GHz CPU with Seagate Barracuda 7200.11 SATA 3Gb/s 1.5-TB ST31500341AS disks. Raw linear speed of these disks was about 105 MB/s.
What performance could different ciphers deliver on this machine? Note that you have to consider the security implications / encryption strength yourself when using custom encryption schemes (e.g. using -essiv instead of -plain on similar hardware will usually decrease the performance by about 10 MB/s, but your encryption should be “harder to crack”).
To create an encrypted device with the default parameters:
cryptsetup luksFormat /dev/$DEVICE
You can add options like:
cryptsetup luksFormat -c cast5-cbc-plain -s 128 /dev/$DEVICE
To open an encrypted device:
cryptsetup luksOpen /dev/$DEVICE $SOMENAME
You will have a new block device in /dev/mapper/$SOMENAME, which you can use, e.g., for a filesystem.
To close the encrypted device:
cryptsetup luksClose $SOMENAME
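The values passed to -c in this article follow the pattern cipher-chainmode-ivmode, optionally with IV options after a colon (as in aes-cbc-essiv:sha256). A minimal sketch of how such a string breaks down is shown here; split_cipher_spec is a hypothetical helper, not part of cryptsetup, and the cbc/plain fallbacks for a bare cipher name simply follow the annotations in the last results table on this page, so treat them as an assumption for your cryptsetup version.

```shell
# Sketch: split_cipher_spec is an illustrative helper, not part of cryptsetup.
# Splits a spec of the form "cipher-chainmode-ivmode[:ivopts]" into its parts.
split_cipher_spec() {
    spec=$1
    cipher=${spec%%-*}            # text before the first "-"
    rest=${spec#"$cipher"}        # remainder after the cipher name, may be empty
    rest=${rest#-}
    mode=${rest%%-*}              # chain mode, e.g. cbc, ecb, xts
    iv=${rest#"$mode"}
    iv=${iv#-}                    # IV generator, e.g. plain or essiv:sha256
    # Assumed fallbacks (cbc, plain) when only a cipher name was given.
    echo "cipher=$cipher mode=${mode:-cbc} iv=${iv:-plain}"
}
```

For example, split_cipher_spec aes-cbc-essiv:sha256 reports the cipher, the chain mode and the IV generator as three separate fields.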
The results (linear read throughput):
-c tnepres 20.1 MB/s
-c serpent 20.4 MB/s
-c seed-ecb-plain -s 256 20.5 MB/s
-c fcrypt-pcbc-plain -s 64 30.4 MB/s
-c khazad-ecb-plain -s 128 31.7 MB/s
-c xtea-ecb-plain -s 128 32.0 MB/s
-c arc4 32.1 MB/s
-c xeta-ecb-plain -s 128 32.1 MB/s
-c twofish 34.2 MB/s
-c anubis-cbc-plain -s 256 37.5 MB/s
-c anubis -s 256 37.8 MB/s
-c tea-ecb-plain -s 128 38.1 MB/s
-c anubis-ecb-plain -s 256 39.6 MB/s
-c cast6-cbc-plain -s 256 40.0 MB/s
-c cast6 40.7 MB/s
-c des-ecb-plain -s 64 42.0 MB/s
-c camellia -s 256 42.2 MB/s
-c anubis -s 128 46.4 MB/s
-c anubis-cbc-plain -s 128 47.5 MB/s
-c anubis-ecb-plain -s 128 49.4 MB/s
-c cast5-cbc-plain -s 128 50.2 MB/s
-c camellia -s 128 51.4 MB/s
-c aes -s 256 55.9 MB/s
-c aes-cbc-plain -s 256 56.4 MB/s
-c aes-cbc-benbi -s 256 56.7 MB/s
-c aes-cbc-null -s 256 57.0 MB/s
-c blowfish 57.2 MB/s
-c aes-ecb-benbi -s 256 58.8 MB/s
-c aes-ecb-null -s 256 59.5 MB/s
-c aes-ecb-plain -s 256 60.3 MB/s
-c blowfish-ecb-plain 61.4 MB/s
-c aes-xts-plain -s 256 61.6 MB/s
-c aes-lrw-plain -s 256 62.8 MB/s
-c aes-cbc-plain -s 128 66.8 MB/s
-c aes-ctr-plain -s 128 67.0 MB/s
-c aes-cbc-null -s 128 67.1 MB/s
-c aes-cbc-benbi -s 128 67.4 MB/s
-c aes -s 128 67.5 MB/s
-c aes-ecb-plain -s 128 71.0 MB/s
-c aes-ecb-benbi -s 128 71.2 MB/s
-c aes-ecb-null -s 128 71.5 MB/s
The benchmarks were made with dd (bs=64k, 3 GB read), repeated several times; caches were dropped before each test.
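The measurement described above could be reproduced roughly as follows. This is a minimal sketch, not the exact commands used for the table: the function names bench_read and mb_per_s are made up for illustration, the block count of 49152 assumes "3 GB" means 3 GiB at bs=64k, and the timing relies on GNU date. It has to run as root against a mapping that is already open.

```shell
#!/bin/sh
# Sketch only: bench_read and mb_per_s are illustrative names,
# not taken from the original test setup.

# Convert a byte count ($1) and elapsed seconds ($2) into MB/s
# (1 MB = 1,000,000 bytes).
mb_per_s() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# Read 3 GiB linearly from an opened mapping and print the throughput.
# $1 is the mapping name under /dev/mapper. Needs root.
bench_read() {
    dev="/dev/mapper/$1"
    blocks=49152                         # 49152 x 64 KiB = 3 GiB (assumed size)
    sync
    echo 3 > /proc/sys/vm/drop_caches    # drop page cache, dentries and inodes
    start=$(date +%s.%N)                 # %N requires GNU date
    dd if="$dev" of=/dev/null bs=64k count="$blocks" 2>/dev/null || return 1
    end=$(date +%s.%N)
    elapsed=$(awk -v a="$start" -v b="$end" 'BEGIN { print b - a }')
    mb_per_s $((blocks * 65536)) "$elapsed"
}
```

Calling bench_read $SOMENAME several times and comparing the numbers gives an idea of the spread between runs.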
I am missing ESSIV mode (with AES, for instance). Did you leave it out intentionally?
I was looking for the fastest mode for my machine.
-essiv on the above hardware (2.9 GHz Celeron) performed about 10 MB/s slower than -plain (you will find this note in the article as well).
My results on an Intel Atom 330 with a loopback device on RAM. The script can be downloaded from http://www.holtznet.de/luks/
Create options write read
-c aes-cbc-essiv:sha256 -s 128 29.2 MB/s 31.7 MB/s
-c aes-xts-plain -s 128 28.3 MB/s 31.7 MB/s
-c aes-cbc-essiv:sha256 -s 196 28.1 MB/s 32.3 MB/s
-c aes-xts-plain -s 196 28.6 MB/s 31.9 MB/s
-c aes-cbc-essiv:sha256 -s 256 23.2 MB/s 25.1 MB/s
-c aes-xts-plain -s 256 31.2 MB/s 33.9 MB/s
-c arc4-cbc-essiv:sha256 -s arc4 30.2 MB/s 33.6 MB/s
-c arc4-xts-plain -s arc4 31.8 MB/s 34.4 MB/s
-c des-cbc-essiv:sha256 -s 128 31.0 MB/s 34.0 MB/s
-c des-xts-plain -s 128 31.5 MB/s 33.8 MB/s
-c des-cbc-essiv:sha256 -s 256 32.1 MB/s 34.0 MB/s
-c des-xts-plain -s 256 30.4 MB/s 34.2 MB/s
-c blowfish-cbc-essiv:sha256 -s 128 22.8 MB/s 27.5 MB/s
-c blowfish-xts-plain -s 128 23.1 MB/s 27.1 MB/s
-c blowfish-cbc-essiv:sha256 -s 196 22.8 MB/s 27.6 MB/s
-c blowfish-xts-plain -s 196 23.0 MB/s 27.2 MB/s
-c blowfish-cbc-essiv:sha256 -s 256 23.4 MB/s 27.6 MB/s
-c blowfish-xts-plain -s 256 22.7 MB/s 27.4 MB/s
-c anubis-cbc-essiv:sha256 -s 128 20.6 MB/s 22.6 MB/s
-c anubis-xts-plain -s 128 20.9 MB/s 22.9 MB/s
-c anubis-cbc-essiv:sha256 -s 256 17.1 MB/s 18.5 MB/s
-c anubis-xts-plain -s 256 21.9 MB/s 23.6 MB/s
-c cast5-cbc-essiv:sha256 -s 128 22.5 MB/s 23.5 MB/s
-c cast5-xts-plain -s 128 21.8 MB/s 23.6 MB/s
-c camellia-cbc-essiv:sha256 -s 128 20.6 MB/s 19.2 MB/s
-c camellia-xts-plain -s 128 20.4 MB/s 19.2 MB/s
-c camellia-cbc-essiv:sha256 -s 196 20.7 MB/s 19.5 MB/s
-c camellia-xts-plain -s 196 20.9 MB/s 19.4 MB/s
-c camellia-cbc-essiv:sha256 -s 256 16.6 MB/s 15.2 MB/s
-c camellia-xts-plain -s 256 22.2 MB/s 20.0 MB/s
-c twofish-cbc-essiv:sha256 -s 128 22.9 MB/s 26.2 MB/s
-c twofish-xts-plain -s 128 23.2 MB/s 26.2 MB/s
-c twofish-cbc-essiv:sha256 -s 196 23.6 MB/s 26.5 MB/s
-c twofish-xts-plain -s 196 23.3 MB/s 26.4 MB/s
-c twofish-cbc-essiv:sha256 -s 256 23.5 MB/s 26.6 MB/s
-c twofish-xts-plain -s 256 24.7 MB/s 27.1 MB/s
-c salsa20-cbc-essiv:sha256 -s 128 24.5 MB/s 27.6 MB/s
-c salsa20-xts-plain -s 128 24.8 MB/s 27.5 MB/s
-c salsa20-cbc-essiv:sha256 -s 160 25.3 MB/s 27.5 MB/s
-c salsa20-xts-plain -s 160 25.0 MB/s 27.5 MB/s
-c salsa20-cbc-essiv:sha256 -s 196 24.5 MB/s 27.7 MB/s
-c salsa20-xts-plain -s 196 24.7 MB/s 27.5 MB/s
-c salsa20-cbc-essiv:sha256 -s 256 24.7 MB/s 27.3 MB/s
-c salsa20-xts-plain -s 256 24.9 MB/s 27.3 MB/s
-c serpent-cbc-essiv:sha256 -s 128 22.3 MB/s 25.2 MB/s
-c serpent-xts-plain -s 128 23.0 MB/s 25.6 MB/s
-c serpent-cbc-essiv:sha256 -s 196 23.5 MB/s 26.2 MB/s
-c serpent-xts-plain -s 196 24.3 MB/s 26.2 MB/s
-c serpent-cbc-essiv:sha256 -s 256 22.9 MB/s 24.5 MB/s
-c serpent-xts-plain -s 256 23.9 MB/s 26.1 MB/s
@Frank: “-c aes-xts-plain -s 128” can’t work though. The key needs to be at least 256 bits long. The kernel Kconfig help suggests that for XTS you need to double the key size (so aes-xts-plain: 256/384/512 bits). That’s necessary because one part of the key is used by XTS and the other by AES. It’s also strange that the bigger the key size gets, the more read/write throughput you get.
This seems like a flawed post: the most important benchmark is missing, namely the one with the default options (without -c set). So now I am sitting here wondering where the default option sits in that league table.
@Chris: The default mode for LUKS is aes-cbc-essiv:sha256 with a 128-bit key; it’s documented in the cryptsetup man page.
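The key-size arithmetic from this comment can be written down as a small check. This is only a sketch: xts_cipher_bits is a hypothetical helper, not a cryptsetup feature, and the accepted sizes are simply the 256/384/512 values quoted above for AES.

```shell
# Sketch: xts_cipher_bits is a hypothetical helper, not part of cryptsetup.
# For an aes-xts mapping, -s gives the combined key size: one half keys
# AES itself, the other half is used by XTS.
xts_cipher_bits() {
    total=$1
    case "$total" in
        256|384|512)
            echo $((total / 2))          # effective AES key: 128, 192 or 256 bits
            ;;
        *)
            echo "invalid aes-xts key size: $total" >&2
            return 1
            ;;
    esac
}
```

With this, a request like -s 128 for aes-xts is rejected up front instead of being handed to cryptsetup.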
To print a table with 3 digit precision and then say that the default (essiv) will be “about 10 MB/s” slower than something somewhere in the table makes no sense. If anything is worth measuring precisely for comparison’s sake, it’s the default. And if anything is worth listing explicitly in the table, it’s the default.
Another topic: Multi-core CPU measurements would be nice.
Jim,
it’s about “10 MB/s slower on similar hardware” and was consistent with my tests.
Multi-core CPU measurement would make no difference at all, as dm-crypt is not SMP-capable.
I don’t doubt that you measured 10 MB/s difference, but that’s 1 digit precision. You listed everything else with 3 digits, so this should be too.
Do you have an authoritative source that it is not able to use multiple cores? Because Fruhwirth even fought with the kernel devel guys to restructure the kernel so that LUKS could run LRW (hence parallel). And that was 5 years ago.
Well, you’ll find some more tests by Frank, if you’re interested in more results.
You can always make tests yourself if you want to know how it behaves on your hardware.
dm-crypt does not use more than one core; I gave a link to a discussion with crypt maintainers on dm-devel list, which is roughly one year old. Nothing changed here, AFAIK.
I tested about 1200 combinations of cipher, cipher mode, iv hash, and key length based on the available blkciphers, hashes, and modes in /proc/crypto on an OpenSuSE 11.2 system. Each test was run ten times on a one gig, memory backed loopback device. I skipped a number of irrelevant configurations (tnepres, xeta, arc4, essiv hashes for xts modes, etc.) but a few weird setups still make the list (michael_mic for an essiv hash and every possible key length for blowfish, for example.)
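Building such a list of candidates from /proc/crypto could start with something like the sketch below. list_crypto_names is an illustrative helper, not the script used for the report; it assumes the usual "name : value" layout of /proc/crypto. On newer kernels the relevant type may be reported as skcipher rather than blkcipher, so check what your system shows.

```shell
# Sketch: list_crypto_names is an illustrative helper.
# Prints the "name" of every /proc/crypto entry whose "type" equals $1
# (e.g. blkcipher, shash). $2 can point at a saved copy of the file.
list_crypto_names() {
    awk -v want="$1" '
        $1 == "name" { name = $3 }                  # e.g. "name : cbc(aes)"
        $1 == "type" && $3 == want { print name }   # e.g. "type : blkcipher"
    ' "${2:-/proc/crypto}" | sort -u
}
```

The resulting names still have to be combined with key sizes and IV modes, and every combination that cryptsetup rejects has to be skipped.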
All tests were run on a system with a single dual core Xeon 5160 processor and 4 GB of RAM. Caches were dropped before each test.
http://skroz-www.s3.amazonaws.com/report.csv
I think I see the problem with Frank’s benchmark results.
In his test script, he runs “cryptsetup … luksFormat …” without checking for an error. If a previous iteration left a valid LUKS device AND the failed luksFormat left loop0 unmodified, the subsequent luksOpen would succeed and the benchmark would run for the previous cipher-mode-iv combination. This appears to explain why there are valid results for impossible modes such as “-c aes-xts-plain -s 128” (128 is an invalid key size for aes-xts), “-c blowfish-xts-X” and “cast5-xts-X” (blowfish and cast5 have an 8-byte block size and won’t work in XTS mode at all), anything with salsa20 (salsa20 is a stream cipher), and “-c arc4-xts-plain -s arc4” (interesting key size, that.)
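A guard against that failure mode might look like the sketch below. The names format_checked and CRYPTSETUP are assumptions made for illustration and are not taken from Frank's script; the cryptsetup options used (--batch-mode, -c, -s, --key-file) are the standard ones.

```shell
# Sketch: format_checked and CRYPTSETUP are illustrative names, not taken
# from the script discussed above.
CRYPTSETUP=${CRYPTSETUP:-cryptsetup}

# Format device $1 with cipher spec $2 and key size $3, reading the
# passphrase from key file $4. Returns non-zero when luksFormat rejects
# the combination, so the caller can skip the benchmark instead of
# measuring a mapping left over from the previous iteration.
format_checked() {
    if ! "$CRYPTSETUP" --batch-mode luksFormat -c "$2" -s "$3" --key-file "$4" "$1"; then
        echo "SKIP: luksFormat failed for -c $2 -s $3" >&2
        return 1
    fi
}
```

Inside a loop over cipher specs, writing format_checked followed by "|| continue" would move on to the next combination without running luksOpen or dd for the rejected one.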
[...] the encryption I decided to utilize aes-ebc-plain with key size 128 bits for its speed (I don’t need military grade safety). Partitions [...]
Is there any news concerning benchmarks and dm-crypt since the last comments were posted on this blog about a year ago? In particular, how is the multi-CPU support coming along that you mentioned on May 31 last year?
By the way, I really appreciate this site, even if my previous comments may have sounded quite critical. Keep up the good work!
Multi-CPU-support was added in 2.6.38 (look for “dm-crypt: scale to multiple cpu”):
http://kernelnewbies.org/Linux_2_6_38
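To see whether a given machine already runs a kernel with that change, a version comparison along these lines may help. kernel_at_least is a hypothetical helper, and it only compares the leading numeric part of the version string; a distribution kernel may also carry backported patches that this check cannot see.

```shell
# Sketch: kernel_at_least is an illustrative helper, not a standard tool.
# Succeeds if version $2 (default: the running kernel) is >= version $1.
# Only the leading numeric "x.y.z" part is compared.
kernel_at_least() {
    want=$1
    have=${2:-$(uname -r)}
    have=${have%%-*}                     # "2.6.38-8-generic" -> "2.6.38"
    awk -v w="$want" -v h="$have" 'BEGIN {
        n = split(w, a, "."); split(h, b, ".")
        for (i = 1; i <= n; i++) {
            if (b[i] + 0 > a[i] + 0) exit 0   # newer in this field
            if (b[i] + 0 < a[i] + 0) exit 1   # older in this field
        }
        exit 0                                # all compared fields equal
    }'
}
```

A call such as kernel_at_least 2.6.38 && echo "multi-CPU dm-crypt expected" prints the message only when the running kernel is at least 2.6.38.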
Thanks. Now I just have to wait until the distro vendors actually ship kernel 2.6.38. Is there any chance of applying the patch to older kernels (e.g. 2.6.32)?
@Jim: I don’t think so, at least not without some really major effort.
It would be easier to compile your own kernel, or use 2.6.38 provided by your distribution (developer build etc.).
Shecky, you’re a legend. Cheers for the .csv!
It’s certainly confusing how 512-bit xts is better compared to 256-bit cbc, especially when the double key length requirement seems to be a mistake by the spec writers :/
Even though ECB is a speed demon in terms of encryption, you should strongly avoid using it. https://secure.wikimedia.org/wikipedia/en/wiki/Block_cipher_modes_of_operation#Electronic_codebook_.28ECB.29 explains why, and gives a good demonstration. And because the first parts of the disk clearly show the algorithm, key size and hash used, it shouldn’t even be an option, IMO.
Check out the first part of an aes-ecb-null disk with a 128-bit key:
Hi,
I ran a few tests on my Atom D525 (dual core 1.8 GHz) NAS box. It is running Ubuntu 11.04, Linux 2.6.38-8. There’s 4 GB of RAM, of which I used ~1 GB to create a ramdisk for the tests.
The script is at http://www.salokanto.fi/linux/test-crypt-perf.sh. It’s not pretty, but you can modify it for your needs.
Results (in MiB/s):
Options Write Read
-c aes-xts-plain -s 256 45.53 43.89
-c aes-cbc-essiv:sha256 -s 128 44.89 45.33
-c aes-xts-plain -s 384 46.55 38.61
-c aes-cbc-essiv:sha256 -s 192 33.56 39.07
-c aes-xts-plain -s 512 41.11 36.15
-c aes-cbc-essiv:sha256 -s 256 31.01 36.13
My understanding is that xts-plain must use double the key length compared to that of aes-cbc-essiv:sha256 in order to attain similar strength.
CentOS 6 (2.6.32-71.29.1.el6.x86_64) with 16GB RAM (RAM-disk 11GB, file 10GB)
– VM (the only one) on ESXi5, 2 x Xeon E5530, 48GB RAM
Script from Heikki Salokanto, modified (error checking added)
Options (mode) Write Read
default (aes-cbc-essiv:sha256 256) 82.77 102.76
-c aes (cbc-plain 256) 82.00 105.79
-c aes -s 128 (cbc-plain) 104.91 139.99
-c aes-ecb -s 128 121.56 151.59
-c aes-cbc-null -s 128 100.85 139.72
-c aes-cbc-plain -s 128 100.97 139.70
-c aes-cbc-benbi -s 128 100.71 139.66
-c aes-cbc-essiv:sha256 -s 128 98.02 134.77
-c aes-pcbc-null -s 128 95.89 105.71
-c aes-pcbc-plain -s 128 98.59 105.63
-c aes-pcbc-benbi -s 128 94.13 105.62
-c aes-pcbc-essiv:sha256 -s 128 95.51 102.86
-c aes-ctr-plain -s 128 111.94 137.34
-c aes -s 256 (cbc-plain) 83.59 105.71
-c aes-ecb -s 256 96.98 112.92
-c aes-cbc-null -s 256 84.46 105.69
-c aes-cbc-plain -s 256 84.22 105.70
-c aes-cbc-plain64 -s 256 80.44 105.73
-c aes-cbc-benbi -s 256 81.42 105.47
-c aes-cbc-essiv:sha256 -s 256 78.61 102.98
-c aes-pcbc-null -s 256 80.89 84.52
-c aes-pcbc-plain -s 256 80.37 84.62
-c aes-pcbc-plain64 -s 256 76.38 84.52
-c aes-pcbc-benbi -s 256 79.83 84.62
-c aes-pcbc-essiv:sha256 -s 256 75.48 82.71
-c aes-lrw-null -s 256 (128) 108.64 132.97
-c aes-lrw-plain -s 256 (128) 111.43 132.99
-c aes-lrw-plain64 -s 256 (128) 111.95 133.11
-c aes-lrw-benbi -s 256 (128) 111.83 133.26
-c aes-lrw-essiv:sha256 -s 256 (128) 104.83 127.86
-c aes-lrw-null -s 384 (256) 82.58 102.08
-c aes-lrw-plain -s 384 (256) 86.82 101.99
-c aes-lrw-plain64 -s 384 (256) 82.12 101.76
-c aes-lrw-benbi -s 384 (256) 86.64 102.04
-c aes-lrw-essiv:sha256 -s 384 (256) 84.30 98.58
-c aes-xts-null -s 256 (128) 112.00 137.21
-c aes-xts-plain -s 256 (128) 111.06 137.08
-c aes-xts-plain64 -s 256 (128) 109.64 136.95
-c aes-xts-benbi -s 256 (128) 113.97 137.12
-c aes-xts-essiv:sha256 -s 256 (128) 104.28 132.15
-c aes-xts-null -s 512 (256) 85.78 104.07
-c aes-xts-plain -s 512 (256) 88.78 103.80
-c aes-xts-plain64 -s 512 (256) 87.89 103.86
-c aes-xts-benbi -s 512 (256) 90.01 103.76
-c aes-xts-essiv:sha256 -s 512 (256) 85.81 101.25
-c aes-ctr-null -s 256 86.65 103.91
-c aes-ctr-plain -s 256 86.33 103.99
-c aes-ctr-plain64 -s 256 88.17 104.15
-c aes-ctr-benbi -s 256 85.65 103.93
-c aes-ctr-essiv:sha256 -s 256 87.09 101.47
-c des-ecb-plain -s 64 44.21 48.53
-c anubis -s 128 (cbc-plain) 84.57 106.66
-c anubis-ecb-plain -s 128 92.04 114.43
-c anubis-cbc-plain -s 128 79.48 104.86
-c anubis -s 256 (cbc-plain) 67.69 84.64
-c anubis-ecb-plain -s 256 75.93 90.33
-c anubis-cbc-plain -s 256 66.87 83.70
-c blowfish (cbc-plain 256) 62.40 85.50
-c blowfish-ecb-plain (256) 72.29 92.52
-c blowfish-cbc-plain (256) 61.70 85.49
-c blowfish-cbc-essiv:sha256 -s 256 63.32 84.16
-c twofish (cbc-plain 256) 88.63 116.51
-c twofish-cbc-essiv:sha256 -s 256 86.34 113.89
-c twofish-lrw-essiv:sha256 -s 256 (128) 84.76 109.01
-c twofish-xts-essiv:sha256 -s 256 (128) 88.70 111.03
-c twofish-xts-essiv:sha256 -s 512 (256) 91.41 111.04
-c camellia -s 128 (cbc-plain) 76.31 97.80
-c camellia -s 256 (cbc-plain) 61.67 75.75
-c cast5-cbc-plain -s 128 58.21 77.55
-c cast6 (cbc-plain 256) 49.44 55.25
-c cast6-cbc-plain -s 256 48.71 55.37
-c tea-ecb-plain -s 128 36.55 37.92
-c xtea-ecb-plain -s 128 38.67 42.21
-c tnepres (cbc-plain 256) 41.03 48.71
-c serpent (cbc-plain 256) 42.92 51.05
-c khazad-ecb-plain -s 128 95.62 115.29
-c xeta-ecb-plain -s 128 44.55 45.30
-c fcrypt-pcbc-plain -s 64 58.96 64.30