Confidential virtual machines (CVMs) are becoming increasingly ubiquitous. It is possible to provision an AMD SEV-SNP or an Intel TDX CVM from major cloud providers in a matter of minutes. QEMU/KVM support for these technologies has also matured significantly, making it possible to deploy an on-premise CVM setup. The main focus of these technologies is to provide data confidentiality guarantees for the VM at runtime, but it is equally important to consider the confidentiality of data at rest (e.g., when the VM gets restarted, migrates to another host, or is simply required to be stateful). Red Hat Enterprise Linux and other Linux distributions have long provided tools for data protection at rest. However, we believe that CVMs drastically change the attacker's profile, so the guarantees provided by traditional tools such as dm-crypt/LUKS should be re-evaluated.
Assumptions
This article operates on a few core assumptions. The attacker does not have access to the VM’s memory or CPU register state because such attacks are presumably mitigated by CVM technologies (e.g., AMD SEV-SNP and Intel TDX). The attacker may have read/write access to the storage for prolonged periods, both while the VM is actively running and while it is turned off. The attacker may also obtain access to the VM's storage before the VM runs for the first time.
When using a stateful vTPM, the attacker does not possess vTPM seeds and thus cannot recover private keys. The attacker, however, has the same level of access to the vTPM as the guest operating system. For example, the attacker can replace the VM’s root storage volume with a specially crafted guest OS version which allows it to perform the required vTPM operations. In particular, the attacker can use the vTPM to create private keys with specific parameters and extract the public key parts.
Possible attack scenarios
The shift to CVMs fundamentally alters the threat model for data at rest. The underlying storage remains vulnerable to a privileged, malicious host. This section explores several hypothetical attack scenarios where an attacker with read/write access to the CVM's storage attempts to breach data confidentiality, integrity, or availability, despite the use of traditional Linux disk encryption tools like dm-crypt/LUKS.
Direct breaking of the confidentiality
The most basic scenario is that the attacker tries to get direct access to the data from a storage snapshot. This attack scenario is supposed to be mitigated by standard Linux LUKS/dm-crypt disk encryption. However, the encryption parameters determine how complex a brute-force attack needs to be.
In particular, the default values for cryptsetup(8) are as follows:
Data segments:
  0: crypt
        cipher: aes-xts-plain64
        sector: 512 [bytes]
Keyslots:
  1: luks2
        Key:        512 bits
        Cipher:     aes-xts-plain64
        Cipher key: 512 bits
        PBKDF:      pbkdf2
        Hash:       sha512
        Iterations: 1000
        AF stripes: 4000
        AF hash:    sha512
Currently, a brute-force attack on a single snapshot does not seem to be computationally feasible. In a CVM scenario, however, the attacker may get multiple snapshots (ciphertexts) of the data. The attacker can observe and snapshot each write to the same sector and, in certain cases where no storage placement randomization is done, even make a connection between the ciphertext and the cleartext. The resistance of existing cipher modes, such as XTS, to such attacks requires further research.
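To put the parameters above in perspective, the following Python sketch mimics the keyslot key derivation with the listed PBKDF settings and gives a back-of-the-envelope estimate of exhaustive key search. It is a minimal illustration, not cryptsetup's implementation: the function names, the sample passphrase, and the guess rate are our assumptions, and the 256-bit figure reflects the fact that XTS splits its 512-bit key into two 256-bit AES keys.

```python
import hashlib
import os

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def derive_keyslot_key(passphrase: bytes, salt: bytes, iterations: int = 1000) -> bytes:
    """PBKDF2-HMAC-SHA512 with a 512-bit output, using the values from the listing."""
    return hashlib.pbkdf2_hmac("sha512", passphrase, salt, iterations, dklen=64)


def years_to_exhaust(key_bits: int, guesses_per_second: float) -> float:
    """Worst-case time to try every key of the given size at a fixed guess rate."""
    return (2 ** key_bits) / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    salt = os.urandom(32)  # stand-in for the random per-keyslot salt
    key = derive_keyslot_key(b"example passphrase", salt)
    print(len(key) * 8, "bit keyslot key")
    # Hypothetical attacker testing 10**18 keys per second (an assumption).
    print(f"{years_to_exhaust(256, 1e18):.2e} years for a 256-bit key space")
```

The estimate only covers blind key search on one snapshot. It says nothing about what an attacker can learn from many ciphertext versions of the same sector, which is the open question raised above.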
Stealing the encryption key
The LUKS scheme provides robust encryption only when the master key remains secret. CVM technologies provide reliable guarantees that the key cannot be extracted from the VM's memory. However, the attacker may try to impersonate the environment where the key is being created. Whether the VM’s root volume is pre-encrypted before the VM’s first execution or the VM encrypts itself on the first boot, the attacker may pre-create the encrypted volume and provide it to the VM.
Note that the public part of the target VM's SRK (the SRK public) is not a secret, and the attacker can extract it by running a specially crafted guest OS. The attack proceeds as follows:
- The attacker extracts SRK public from the guest vTPM by running a specially crafted guest OS instead of the attack target.
- The attacker takes the guest OS image (which can be publicly available), encrypts it, and seals the key to the vTPM by using the extracted SRK public. The attacker also records the generated LUKS master key.
- Optionally, the attacker modifies the guest OS image and injects a backdoor. This makes the attack possible even if the guest decides to re-encrypt the whole disk with a new, randomly generated master key.
- The attacker launches the VM and can decrypt storage at any time.
Potential mitigations:
- For dedicated pre-encrypting environments: The environment that creates the LUKS key can use a dedicated key to sign a derivation of the key and, possibly, some facts about the source image, the encrypting environment, and so on. The signed evidence can be placed either in the encrypted volume or elsewhere on the storage (LUKS header and ESP), because the signature makes it impossible to forge.
- For self-encryption: The storage encryption key must be created in a non-interactive environment (where even the owner cannot perform arbitrary actions), and the evidence of this must be preserved. In particular, the evidence must prove that the key was generated in a CVM and that the environment was non-interactive (e.g., initramfs boot state). The upstream systemd proposal "repart: preserve the evidence of the encrypting environment" (Issue #40410) describes the idea.
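The evidence idea from the mitigations above can be sketched as follows. This is our own simplified illustration and not the format proposed upstream: the field names, the label constant, and the helper functions are hypothetical. The record holds a one-way derivation of the volume key together with facts about the source image and the environment. The signing step is deliberately left out, since it needs an asymmetric key that the Python standard library does not provide; the serialized bytes are what the dedicated key would sign.

```python
import hashlib
import hmac
import json

# Hypothetical domain-separation label for the key derivation.
EVIDENCE_LABEL = b"cvm-volume-key-evidence-v1"


def key_fingerprint(volume_key: bytes) -> str:
    """One-way derivation of the key (assumes a high-entropy volume key)."""
    return hmac.new(volume_key, EVIDENCE_LABEL, hashlib.sha256).hexdigest()


def build_evidence(volume_key: bytes, image_sha256: str, environment: str) -> bytes:
    """Serialize the record deterministically so it can be signed and re-checked."""
    record = {
        "environment": environment,
        "key_fingerprint": key_fingerprint(volume_key),
        "source_image_sha256": image_sha256,
    }
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def matches_evidence(evidence: bytes, unlocked_key: bytes) -> bool:
    """Check that the key which unlocked the volume is the one the record describes."""
    record = json.loads(evidence)
    return hmac.compare_digest(record["key_fingerprint"], key_fingerprint(unlocked_key))
```

A guest that unlocks its root volume could then refuse to continue unless the signature over the record verifies and `matches_evidence()` returns true for the key it actually used. A volume pre-created by the attacker would fail the first check, because the attacker cannot produce a valid signature.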
Unlocking the volume in an unsafe environment
When the volume key is sealed to the vTPM, the attacker may try to unseal it. There are two main ways to mount the attack: a brute-force attack on the sealed object, and creating an environment where the vTPM automatically unseals the secret. With standard Linux tooling (e.g., systemd-cryptenroll(1)), the sealed object resides in the LUKS JSON header and is thus visible to anyone who has read access to the volume. The object, however, should be well protected. For example, the SRK in Microsoft Azure CVMs today is a 2048-bit RSA key, and brute-force attacks against 2048-bit RSA are not yet practically feasible. Such attacks, however, may become realistic as quantum computers develop. Creating an environment where the vTPM unseals the object automatically seems more realistic. Normally, the secret is sealed to the TPM using a PCR policy, so unlocking requires an attacker-controlled environment where the PCR values match.
The following PCR sets seem to be vulnerable when used with generalized OS images, where the attacker can provision a publicly accessible OS image with their own access credentials:
PCR4, PCR7, PCR4+7: All instances of the same OS image will have exactly the same values when the SecureBoot configuration is identical. The attack is then fairly trivial: the attacker creates a new root OS disk and launches it with the guest’s vTPM. In this environment, the vTPM will automatically unlock the sealed key because the PCR values match.
Currently, there is a feature proposal to introduce a barrier to PCR7 between the initramfs and the interactive system to complicate this attack scenario. Namely, if the secret is sealed to the initramfs-time PCR7 value, the attacker has to use a controlled initramfs environment to perform the unlocking. This should not be possible because the initramfs is non-interactive.
PCR11, PCR4+11, PCR7+11, PCR4+7+11: All instances of the same OS image will have exactly the same values when the SecureBoot configuration is identical. The difference from the PCR7-only set is that PCR11 already has barriers between boot stages (see the systemd measurements), so the attacker similarly has to create a controlled initramfs environment to perform the unsealing.
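Both observations above, identical PCR values across instances of the same image and the effect of a boot stage barrier, follow from how a PCR is extended. The Python sketch below is a simplified model under our own assumptions: it uses a single SHA-256 bank, the event payloads are made up, and the barrier string merely stands in for whatever the real barrier measures.

```python
import hashlib

PCR_SIZE = 32  # size of a PCR in the SHA-256 bank


def extend(pcr: bytes, event: bytes) -> bytes:
    """TPM-style extend: new value = SHA256(old value || SHA256(event))."""
    return hashlib.sha256(pcr + hashlib.sha256(event).digest()).digest()


def replay(events) -> bytes:
    """Start from an all-zero PCR and extend it with each event in order."""
    pcr = bytes(PCR_SIZE)
    for event in events:
        pcr = extend(pcr, event)
    return pcr


# Hypothetical measurements taken while booting a public OS image.
boot_events = [b"uki-kernel", b"uki-initrd"]

victim_initramfs = replay(boot_events)
attacker_initramfs = replay(boot_events)  # same image, same events, same value

# A barrier event extended when leaving the initramfs (illustrative string).
attacker_interactive = extend(attacker_initramfs, b"leave-initrd")

print(victim_initramfs == attacker_initramfs)
print(attacker_interactive == victim_initramfs)
```

Because extend is one-way, an attacker who only gains control after the barrier cannot bring the PCR back to its initramfs-time value. A secret sealed to that value stays locked unless the attacker controls the initramfs itself.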
Possible mitigations include:
- A UKI must be used. With a traditional kernel layout, the initramfs is not part of the PCR4/PCR7/PCR11 measurements, so it is trivial for the attacker to perform the unsealing with the PCR values observed at the initramfs boot stage.
- UKI’s initramfs must be hardened to prevent interactive access. Attackers should not be able to perform any interactive actions with PCR values observed at initramfs boot stage. In particular, UKI’s initramfs should not have an emergency console or similar features.
- Boot stage barriers are needed. For PCR11, this is already done by systemd-pcrphase.service. For PCR7, the upstream feature needs to be finalized.
- Initramfs system and configuration extensions must be vetted. UKIs based on systemd-stub enable initramfs customization with systemd system and configuration extensions. While only properly signed extensions are loaded by default, these extensions are not PE binaries and thus do not affect PCR4/PCR7 measurements. An attacker may try to use shim’s MOK feature in combination with a MOK-signed system or configuration extension to unseal the secret with initramfs-time PCR values. To prevent this attack, PCR12 (configuration extensions), PCR13 (system extensions), and PCR14 (MOK) need to be added to the set of PCRs used for secret sealing.
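The last mitigation can be illustrated with a toy model of a PCR policy. The sketch below is a deliberate simplification and not the TPM 2.0 policy computation: the digest is just a hash over the selected PCR values, and the PCR contents are placeholders derived from labels of our own choosing. It shows why a narrow PCR selection cannot notice an attacker-supplied extension image or a MOK enrollment, while a wider selection can.

```python
import hashlib


def pcr_policy_digest(pcr_values: dict, selection: list) -> bytes:
    """Hash the selected PCR values in index order (simplified policy stand-in)."""
    h = hashlib.sha256()
    for index in sorted(selection):
        h.update(index.to_bytes(1, "big") + pcr_values[index])
    return h.digest()


def fake_pcr(label: str) -> bytes:
    """Placeholder PCR content; real values come from the vTPM."""
    return hashlib.sha256(label.encode()).digest()


# PCR state of the legitimate boot.
legit = {i: fake_pcr(f"legit-pcr{i}") for i in (4, 7, 11, 12, 13, 14)}

# Attacker boot: same UKI, plus a MOK-signed extension image.
attack = dict(legit)
attack[13] = fake_pcr("attacker-extension")  # extension image measured
attack[14] = fake_pcr("attacker-mok")        # MOK state measured

NARROW = [4, 7, 11]
WIDE = [4, 7, 11, 12, 13, 14]

narrow_unseals = pcr_policy_digest(attack, NARROW) == pcr_policy_digest(legit, NARROW)
wide_unseals = pcr_policy_digest(attack, WIDE) == pcr_policy_digest(legit, WIDE)
print("narrow selection unseals for attacker:", narrow_unseals)
print("wide selection unseals for attacker:", wide_unseals)
```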
Encrypted data (non-root) disks that are mounted from the interactive system must be sealed to the guest identity, so that the secret cannot be unlocked if the disk is attached to a different guest under the attacker's control. PCR15 can provide reasonable assurance when the root volume is encrypted. The systemd-pcrfs@.service service measures a derivation of the LUKS master key into PCR15 when the root volume is successfully unlocked. Note that the machine-id, which also gets measured into PCR15, cannot be considered a reliable identifier, as the attacker can always change their own /etc/machine-id to match.
Intentional data corruption
LUKS and dm-crypt provide robust protection against direct attempts to break confidentiality using a data snapshot. The attacker, however, may try to actively write to certain disk locations, either before the guest is booted or even at runtime. By default, dm-crypt does not include any integrity or authenticity checks, so while the attacker may not be able to write the desired cleartext data, it may cause undetected corruption and crashes.
The cryptsetup(8) utility has supported authenticated disk encryption since v2.0.0. However, the feature is declared EXPERIMENTAL and is therefore not used by default in various Linux distributions. Note that, when enabled, authenticated disk encryption in cryptsetup(8) uses dm-integrity as the storage layer for authentication tags, and the tags need to be written separately on each sector update. This may have a significant impact on storage performance.
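The following Python sketch models the idea of per-sector authentication tags. It is a toy under our own assumptions and does not reproduce the on-disk format of dm-integrity: the class name, the HMAC-SHA256 tag construction, and the in-memory dictionaries are all hypothetical. It shows both sides of the trade-off described above: tampering with the ciphertext is detected on read, and every sector update costs a second write for the tag.

```python
import hashlib
import hmac


class TaggedStore:
    """Toy storage keeping each sector's ciphertext and tag in separate areas."""

    def __init__(self, key: bytes):
        self._key = key   # integrity key, known only to the guest
        self.data = {}    # sector number -> ciphertext (attacker can write here)
        self.tags = {}    # sector number -> authentication tag
        self.writes = 0   # number of storage writes issued

    def _tag(self, sector: int, ciphertext: bytes) -> bytes:
        message = sector.to_bytes(8, "little") + ciphertext
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def write(self, sector: int, ciphertext: bytes) -> None:
        self.data[sector] = ciphertext
        self.tags[sector] = self._tag(sector, ciphertext)
        self.writes += 2  # one write for the data, one for the tag

    def read(self, sector: int) -> bytes:
        ciphertext = self.data[sector]
        if not hmac.compare_digest(self.tags[sector], self._tag(sector, ciphertext)):
            raise IOError(f"integrity check failed for sector {sector}")
        return ciphertext
```

If the attacker overwrites `data[7]` behind the guest's back, the next `read(7)` raises an error instead of silently returning corrupted data.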
Replay attacks
Similarly to inducing data corruption, the attacker may try to restore specific locations of the storage to their previously recorded values by writing back legitimate ciphertext. By default, the cryptsetup(8) tool uses the aes-xts-plain64 cipher, which lets the attacker restore individual 16-byte encrypted blocks. Using authenticated disk encryption in cryptsetup(8) does not mitigate the attack, as authentication tags can also be restored. Using an integrity checking mechanism on top of LUKS devices (dm-integrity and fs-integrity) has similar drawbacks. The attack can be mounted against both the root volume and data volumes. It is also amplified by the fact that LUKS/dm-crypt do not perform any storage placement randomization: if the attacker has access to a similar guest (e.g., another guest provisioned from the same publicly available OS image), they can easily guess the placement of critical data.
Currently, there does not seem to be a good strategy or an appropriate mechanism to mitigate this attack in Linux.
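Why authentication tags do not stop a replay can be seen in a small toy model. The Python sketch below rests on our own assumptions (an HMAC-SHA256 tag over the sector number and ciphertext, random bytes standing in for ciphertext, a dictionary standing in for the disk) and is not the actual dm-integrity scheme. The attacker never learns the key, yet the restored pair still verifies, because it was produced by the guest itself.

```python
import hashlib
import hmac
import os


def sector_tag(key: bytes, sector: int, ciphertext: bytes) -> bytes:
    """Tag binding a ciphertext to its sector number."""
    message = sector.to_bytes(8, "little") + ciphertext
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, sector: int, ciphertext: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(tag, sector_tag(key, sector, ciphertext))


def replay_demo() -> bool:
    key = os.urandom(32)  # integrity key, known only to the guest
    sector = 42
    disk = {}

    # The guest writes version 1; the attacker snapshots ciphertext and tag.
    old_ct = os.urandom(512)
    disk[sector] = (old_ct, sector_tag(key, sector, old_ct))
    snapshot = disk[sector]

    # The guest overwrites the sector with version 2.
    new_ct = os.urandom(512)
    disk[sector] = (new_ct, sector_tag(key, sector, new_ct))

    # The attacker restores the recorded pair without knowing the key.
    disk[sector] = snapshot

    # The guest reads the sector back: the check passes on stale data.
    ciphertext, tag = disk[sector]
    return verify(key, sector, ciphertext, tag) and ciphertext == old_ct


print("stale sector accepted:", replay_demo())
```

Detecting the rollback would require the guest to keep some freshness state that the attacker cannot rewind, which per-sector tags stored on the same disk do not provide.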
Side-channel attacks
In addition to the replay attack, the attacker may try to gather additional information about the guest by merely observing storage access. Without storage placement randomization in LUKS/dm-crypt, the placement of certain critical pieces of data can easily be guessed. With the default cipher used by cryptsetup(8) (aes-xts-plain64), writing identical data to the same location always produces identical ciphertext, and the attacker can observe this fact as well.
Similarly to the replay attacks, there doesn’t seem to be a good strategy and appropriate mechanisms to mitigate the attack in Linux today.
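What the attacker can learn from passive observation can be sketched with plain snapshot comparison. The Python functions below are our own illustration, with hypothetical names and a 512-byte sector size taken from the listing earlier in the article. The first reports which sectors changed between two snapshots. The second reports sectors whose latest ciphertext equals an earlier recorded state, which with deterministic encryption reveals that the same data was written to that location again.

```python
def changed_sectors(before: bytes, after: bytes, sector_size: int = 512) -> list:
    """Return the numbers of sectors whose ciphertext differs between snapshots."""
    if len(before) != len(after):
        raise ValueError("snapshots must have the same size")
    return [
        offset // sector_size
        for offset in range(0, len(before), sector_size)
        if before[offset:offset + sector_size] != after[offset:offset + sector_size]
    ]


def reverted_sectors(snapshots: list, sector_size: int = 512) -> list:
    """Sectors that changed in the latest snapshot back to an earlier ciphertext."""
    *earlier, previous, latest = snapshots
    result = []
    for sector in changed_sectors(previous, latest, sector_size):
        start = sector * sector_size
        current = latest[start:start + sector_size]
        if any(old[start:start + sector_size] == current for old in earlier):
            result.append(sector)
    return result
```

Neither function needs any key material. Combined with knowledge of the on-disk layout of a public OS image, such differences can hint at which files or metadata the guest is touching.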
Final thoughts
In this article, we discussed several possible attack scenarios against CVM storage volumes that a malicious host could potentially exploit. The list of attacks and possible mitigations is certainly incomplete. We believe we have only scratched the surface. Let us know your thoughts.