Halite/Doctrine vs MySQL AES_ENCRYPT: security/per

Published 2019-08-25 10:00

Question:

I have a web application (Symfony 4) that needs to be HIPAA compliant, which (among other things) means I need to encrypt data. Originally I was just going to encrypt data in PHP via Halite and save it in the database. However, there are some fields (last name, first name, phone number) I can't encrypt because they will be used in a search field, and therefore I need (?) MySQL to be able to use WHERE clauses on them.

For this reason I was going to use AES_ENCRYPT and have the MySQL connection go over SSH through a local port-forwarded tunnel, so that the connection would be secure and no one would be able to get the passphrase.

I keep seeing articles, though, saying that AES_ENCRYPT is a bad idea and that the data should just be secured in PHP. If I do that, I would need to pull ALL the records down, decrypt them, then have PHP search them, which is surely not as fast as MySQL could do it (?). This table could have thousands of entries.

Are there any suggestions for this? Am I overthinking it? What risks would there be in doing it via MySQL if the connections went through SSH?

There are so many suggestions on the internet that it's hard to know what's right =/ Thanks so much in advance!

Answer 1:

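If you're happy to use exact-match lookups only, you can implement a lookup on an encrypted field at essentially no performance cost: the search becomes an ordinary equality comparison on a column that can be indexed. You should definitely implement the encryption logic in your PHP code.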

If you currently have your table, say:

(id, first_name, last_name, email)

We first add a lookup column for each field you want to encrypt and still search on, so our table becomes:

(id, first_name, first_name_lu, last_name, last_name_lu, email)

When we update or insert a row, we do two things:

  • Using the first symmetric key, we encrypt each required field. The ciphertext goes in the original column.
  • Using the second symmetric key, we compute an HMAC of each required field's plaintext. The result goes in the corresponding *_lu column.
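
As a rough sketch of the second bullet, here is how the *_lu value could be computed. Python is used only for brevity (in PHP, `hash_hmac('sha256', ...)` plays the same role). The key, the function names, and the normalization step (trimming and case-folding, so that " smith " and "Smith" give the same lookup value) are assumptions for illustration and not part of the scheme as described above.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical lookup key for illustration only. A real key would come from
# secure configuration and must differ from the encryption key.
LOOKUP_KEY = bytes(range(32))


def normalize(value: str) -> str:
    """Assumed normalization: Unicode NFKC, trim whitespace, case-fold."""
    return unicodedata.normalize("NFKC", value).strip().casefold()


def lookup_value(value: str, key: bytes = LOOKUP_KEY) -> str:
    """Return the hex HMAC-SHA256 that would be stored in the *_lu column."""
    message = normalize(value).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()
```

Because the HMAC is deterministic for a given key, the same name always produces the same *_lu value, while someone without the key cannot recompute it from a guessed name.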

When we want to perform a lookup, we:

  • Using the second symmetric key, compute the HMAC of the search query and then look up the *_lu column based on the result.
  • If we find a match, then the encrypted value in the original column decrypts to the value we searched for.
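
The lookup steps above might look something like the following minimal sketch. It uses an in-memory SQLite database in place of MySQL, a hypothetical `person` table and key, and opaque placeholder bytes where the real ciphertext (produced in PHP, for example with Halite) would be stored; none of these names come from the original setup.

```python
import hashlib
import hmac
import sqlite3

# Hypothetical second key, for illustration only.
LOOKUP_KEY = b"\x01" * 32


def lookup_value(value: str) -> str:
    """Hex HMAC-SHA256 of the plaintext, as stored in the *_lu column."""
    return hmac.new(LOOKUP_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, last_name BLOB, last_name_lu TEXT)"
)
# Indexing the lookup column keeps the search an ordinary indexed equality match.
conn.execute("CREATE INDEX idx_person_last_name_lu ON person (last_name_lu)")

# last_name would hold real ciphertext; placeholder bytes stand in for it here.
rows = [
    (b"<ciphertext-1>", lookup_value("Smith")),
    (b"<ciphertext-2>", lookup_value("Jones")),
]
conn.executemany("INSERT INTO person (last_name, last_name_lu) VALUES (?, ?)", rows)


def find_by_last_name(name: str):
    """HMAC the search term, then match it against the *_lu column."""
    cur = conn.execute(
        "SELECT id, last_name FROM person WHERE last_name_lu = ?",
        (lookup_value(name),),
    )
    return cur.fetchall()
```

Calling `find_by_last_name("Smith")` returns the matching row's id and its still-encrypted `last_name`, which the application would then decrypt with the first key. The database never sees the plaintext name or either key.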

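You might wonder why the HMAC is necessary at all: why can't we just re-encrypt the search term and compare ciphertexts? We could do that, but it only works if the encryption is deterministic, which in practice means using ECB mode (or reusing a fixed IV), and that is a large security vulnerability. GCM or CBC with a fresh random nonce or IV should be used instead, and then encrypting the same value twice produces different ciphertexts, so they can't be compared. This is what makes the HMAC necessary.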