-----------------------------------------------------------------------
                          PEFS Security Audit
                             Taylor Hornby
                           February 07, 2014
-----------------------------------------------------------------------

1. Introduction

   This report documents the results of a 13-hour security audit on
   Private Encrypted File System (PEFS). The audit uncovered several
   minor problems, some of which may compromise confidentiality.

1.1. What is PEFS?

   PEFS [1] is an encrypted filesystem similar to EncFS and eCryptFS.
   PEFS is unlike block-level encryption (e.g. TrueCrypt) in that the
   ciphertext mirrors the directory structure of the plaintext.

1.2. Audit Scope

   The audit was performed by reading the PEFS source code [2]. The git
   commit hash of the code that was audited is:

        fb7c4a188cfa7ba69987a1a61104c56f63db5395

   This audit focused primarily on PEFS's cryptographic design and
   implementation. This includes PEFS's use of crypto primitives, but
   not the implementations of the primitives themselves (i.e. the AES
   implementation). Some integer overflow issues were discovered, but
   looking for memory corruption vulnerabilities was not a priority of
   the audit. Bugs in the file system implementation (pefs_vnops.c) were
   also out of scope.

2. Issues

   This section covers all of the security issues discovered during the
   audit. They are each given a rating in three variables:

   Exploitability: How hard is it for an attacker to exploit?
   Security Impact: How bad is it if an attacker can exploit it?
   Confirmation: Has the vulnerability been confirmed to exist?

2.1. HMAC and Other Secrets Not Compared in Constant Time

   Exploitability: Low
   Security Impact: Low
   Confirmation: Confirmed by inspecting the code.

   On line 357 of pefs_key.c, memcmp() is used to verify an HMAC. This
   introduces a timing attack that an attacker might be able to use to
   compute the HMAC of an arbitrary message. This is the HMAC used when
   encrypting child keys for the key chains feature, not the VMAC (which
   is compared in constant time).

   There are several other non-constant-time comparisons. I am not sure
   if they are relevant to security:

     - The comparison of ptk_tweak in pefs_tkey_cmp() in pefs_vnops.c.

     - Comparison of what might be cleartext filenames in
       pefs_enccn_parsedir() in pefs_vnops.c.

     - The comparisons of pk_keyid in pefs_crypto.c, pefs_ctl.c, and
       pefs_keychain.c.

   In general, when secrets are compared for equality, it should be done
   in constant time so that the running time does not leak information
   about their actual values.

2.2. Same Key Used for HMAC and Encryption

   Exploitability: Very Low
   Security Impact: Low
   Confirmation: Confirmed by inspecting the code.

   This issue affects the key chain system, which is an optional PEFS
   feature. If the key chains feature is not used, the file encryption
   is unaffected.

   The same key is used for encryption and authentication in pefs_key.c.
   This happens in the pefs_key_cipher() procedure. The variable 'key'
   is created by HMACing a distinguisher string with another key called
   pxk_key. The resulting 'key' is used twice: Once to encrypt the data,
   and a second time to HMAC the ciphertext.

   This does not create immediate problems, but using the same key for
   multiple purposes is a bad practice. Instead, use HKDF to generate
   two separate keys for the two different purposes.

2.3. Zero IV Used with CTR Mode in pefs_key.c

   Exploitability: Medium (Depends on the use case.)
   Security Impact: High
   Confirmation: Confirmed by inspecting the code.

   This issue affects the key chain system, which is an optional PEFS
   feature. If the key chains feature is not used, the file encryption
   is unaffected.

   The pefs_key_cipher() procedure in pefs_key.c encrypts data using AES
   in CTR mode with a zero IV. This means that if multiple messages are
   encrypted with the same key, they are XORed with the same keystream.

   Here's an example of how this can be exploited to reveal the context
   of another user's files:

     The key chain graph looks like:

        A -> Z
        A -> S
        B -> S

     The 'Z' key is encrypted with a parent key that only Alice knows.
     The 'S' key is encrypted with Alice's parent key as well. Bob also
     has access to 'S'.

   In this scenario, Bob knows S, so he can XOR the known value of
   S with Alice's encryption of S to get the key stream generated by
   Alice's parent key. Bob can then XOR this with Alice's Z ciphertext
   to get Z. If Bob did not know S, he could learn Z XOR S by XORing the
   two ciphertexts together.

   This is a problem whenever a parent key encrypts two or more children
   keys. Instead of using a NULL IV, a random IV should be generated at
   encryption time and stored in the ciphertext (remember to HMAC the IV
   too!).

2.4. No Salt Used When Hashing Password

   Exploitability: Medium
   Security Impact: Medium
   Confirmation: Confirmed by inspecting the code.

   When deriving a key from a password, PEFS does not use salt. This is
   necessary, otherwise an attacker can use pre-computation attacks
   (lookup tables, rainbow tables, etc.) to significantly speed up the
   process of cracking PEFS-encrypted directories with weak passwords.

   Not using a salt was an explicit design decision to avoid having to
   store metadata. I recommend adding an (optional?) mode with metadata
   to support having a salt, so that if it fits the user's use case,
   they can benefit from the additional security.

2.5. Ambiguous Filename Encryption Padding

   Exploitability: Very Low
   Security Impact: Very Low
   Confirmation: Confirmed by inspecting the source code.

   The filename encryption routine pads the plaintext with null bytes.
   This is fine, since file names can't contain null bytes in UNIX, but
   if this code is reused for something else, it could become a problem.
   Pad the plaintext unambiguously by using a padding mode like PKCS#7.

2.6. Filename Encryption is Not Randomized When Files are Renamed

   Exploitability: Medium (Depends on how the user renames files.)
   Security Impact: Low
   Confirmation: Confirmed by inspecting the source code.

   PEFS uses a random 64-bit value, called the tweak, to differentiate
   between encrypted files. This value is actually re-used for three
   things. It's used as part of the XTS mode tweak, to "randomize" the
   filename encryption, and as the VMAC IV (after being encrypted).
   
   For the filename encryption, the tweak value is used as a substitute
   for an IV. Filenames are encrypted by first prefixing the tweak, then
   encrypting them in CBC mode with a zero IV.

   This is a problem, because the tweak does not change when files are
   renamed. With CBC mode, the IV has to be chosen randomly at the time
   of encryption, or it isn't IND-CPA secure. If the filename is
   changed, the new ciphertext will be identical to the old ciphertext
   up to the block that contains the first difference. See the third
   part of this image for an awesome visual demonstration:

     https://pbs.twimg.com/media/BekWHdVCYAAgsSm.jpg:large

   There are plans to solve this problem by encrypting filenames with
   wide-block encryption like EME2. This should be an adequate solution,
   but it needs more analysis.

   There may be other problems with the tweak design as well. More
   analysis is recommended in Section 5.3

2.7. Part of Files Encrypted with CTR Mode

   Exploitability: High
   Security Impact: Medium-High
   Confirmation: Confirmed by inspecting the code.

   If a file's size in bytes is not evenly divisible by 16, the last
   block of bytes (up to 15 bytes) is encrypted with CTR mode. If the
   last block is ever changed, and an attacker has the ciphertexts from
   before and after the change, they can XOR both ciphertexts together
   to learn the XOR of the old and new plaintexts. If the attacker knows
   (or can guess) the old plaintext, they can get the new plaintext.

   The relevant code is in xts_smallblock() in pefs_xts.c.

   The only good remediation for this (that I can think of) involves
   increasing the file size so that it is divisible by 16 bytes. I think
   the severity vulnerability justifies the increase in complexity.

2.8. gf_mul128() Does Not Execute In Constant Time

   Exploitability: Low
   Security Impact: Low
   Confirmation: Confirmed by inspecting the source code.

   The gf_mul128() procedure, part of the XTS implementation in
   pefs_xts.c, branches on a value related to its input, which could
   enable side channel attacks:

        static __inline void
        gf_mul128(uint64_t *dst, const uint64_t *src)
        {
            static const uint8_t gf_128_fdbk = 0x87;
            int carry;

            carry = shl128(dst, src);
            if (carry != 0)  // <-- time difference could leak 'carry'
                ((uint8_t *)dst)[0] ^= gf_128_fdbk;
        }

   I don't know if this leaks enough information to break the encryption
   (very probably not), but it's better to be safe and do it in constant
   time anyway. A constant time conditional XOR is given in [3].

2.9. Memory Corruption / Integer Overflows

   Exploitability: Unknown
   Security Impact: High (If exploitable)
   Confirmation: Not confirmed.

   I noticed several bits of code with integer overflow and
   signed/unsigned problems. I didn't have time to check if any of these
   are exploitable, or are even values controlled by the attacker, but
   I will list them here to err on the side of caution:

     - In pefs_name_encrypt(), the "+ 1" in the file name size check
       could overflow the integer and make the check pass even if
       the filename is too long:

         r = PEFS_NAME_NTOP_SIZE(pefs_name_padsize(size)) + 1;
         if (r > MAXNAMLEN) {
             return (-ENAMETOOLONG);
         }

     - In pefs_enccn_set(), the variable encname_len is a *signed*
       integer. It is checked to be less than MAXPATHLEN then passed to
       memcpy(). When it gets passed to memcpy(), it is cast to
       a size_t, which is unsigned. If encname_len has a negative value,
       it will pass the first check, then memcpy will interpret it as
       a large unsigned value:

             if (encname_len >= MAXPATHLEN)
            panic("pefs_enccn_set: invalid encrypted name length: %d",
                encname_len);
         // ...
         memcpy(pec->pec_buf, encname, encname_len);
         ((char *) pec->pec_buf)[encname_len] = '\0';
         pec->pec_cn.cn_namelen = encname_len;

    - Several possibilities for integer overflows in pefs_xbase64.c:

             if (datalength + 1 > targsize)
                return (-1);
         // ...
         target[tarindex]   |=  (pos - Base64) >> 4;
         if ((size_t)tarindex + 1 < targsize)
             target[tarindex+1] =
                 ((pos - Base64) & 0x0f) << 4 ;
         // ...
         target[tarindex]   |=  (pos - Base64) >> 2;
         if ((size_t)tarindex + 1 < targsize)
                     target[tarindex+1] =
                    ((pos - Base64) & 0x03) << 6;

    - Another in pefs_write_int() that could cause the
      vnode_pager_setsize() call to be skipped, which might lead to
      memory corruption down the road:

             nsize = fsize;
             MPASS(uio->uio_offset <= fsize);
             if (uio->uio_offset + uio->uio_resid > nsize) {
                PEFSDEBUG("pefs_write: extend: 0x%jx (old size: 0x%jx)\n",
                    uio->uio_offset + uio->uio_resid, nsize);
                nsize = uio->uio_offset + uio->uio_resid;
                vnode_pager_setsize(vp, nsize);
             }

3. Commendations

   PEFS does a lot of things right. It uses standard constructions like
   XTS mode, PBKDF2, and HKDF. It also diligently zeroes buffers that
   once contained sensitive information. This makes auditing easier.

   Correction 12/05/2014: XTS mode is probably not the ideal option, see
   Thomas Ptacek's blog post for good reasons why:

       http://sockpuppet.org/blog/2014/04/30/you-dont-want-xts/

4. Recommendations

   There are some things that PEFS could do better:

     - Instead of re-implementing CBC mode in pefs_name_enccbc(), it
       would be better to use a well-tested implementation from a crypto
       library.

     - Use test vectors in unit tests and runtime tests to make sure the
       crypto algorithms (XTS, HKDF, PBKDF2, AES, SHA, etc.) are
       correct.

     - Make pefs_hkdf_expand() take a uint8_t instead of an int for the
       'idx' parameter, so that there is a compiler warning when the
       function is used incorrectly. Or check that 'idx' is between
       0 and 255.

5. Future Work

5.1. Memory Corruption Vulnerabilities

   This audit did not focus on "classic" memory corruptions
   vulnerabilities. Because of the integer overflow issues documented in
   Issue 2.9, I think PEFS could benefit from an audit that specifically
   focuses on these concerns.

5.2. The Tweak's Triple Burden

   The 64-bit random tweak is re-used for three different purposes:

        1. It acts like an IV for the filename encryption.
        2. It is part of the XTS mode tweak.
        3. (After being Encrypted) The VMAC IV.

   There was not enough time in the audit to determine whether any of
   these three uses interact in negative ways that would, for example,
   lead to at least one of filename encryption, file contents
   encryption, or the VMAC being broken. More analysis is needed.

   The VMAC's IV appears to be taken from its input, which probably
   causes problems. This issue was not investigated in this audit since
   the VMAC is only used as a non-cryptographic checksum, and should not
   be relied on for security.

6. Conclusion

   This audit found several issues, the most significant of which are
   issues 2.3, 2.6, and 2.7. The possible existence of memory corruption
   bugs (Issue 2.9) is worrisome as well, since this code runs in kernel
   space.

   PEFS would benefit from more auditing time.

7. References

   [1] https://wiki.freebsd.org/PEFS

   [2] https://github.com/glk/pefs

   [3] http://www.iacr.org/archive/ches2009/57470001/57470001.pdf