Friday, July 27, 2018

Testing Encryption - 3 years of Dan Boneh's Online Cryptography Course

Three years ago in July, I completed Dan Boneh's online cryptography course with distinction through Coursera's Cryptography 1.  Since then, I've had the opportunity to use and test cryptographic systems at work and for hobbies.  Here are a few lessons learned when testing encryption.

I have found my fair share of bugs in the crypto we chose to use at work.  I've gotten into a routine when testing encryption used for message authentication:
  • Test the same plaintext multiple times.  Does it need to be different each time?  How much of the MAC is different each time?  It might help to explore the data your hashing function spits out as it can tell you how your hash function does what it does.
  • Replay it.  How can a user abuse identical MAC'd data if they replay it at a later date?  For a different user?  Can you add items to the plaintext that will allow you to validate not only the data but the source or timeframe as well?
  • Ensure your MACs are detecting changes. Is the MAC rejected if you change the data at various places within the message?
  • Rotate the key. Do you need a MAC to survive a key change?  Usually you can just regenerate the data and re-MAC it, so figure out if you really need to use MACs over long lifetimes.  They're easy to compute.
  • Generate a bunch at once.  Is performance an issue with the service?  Most hashes are built for speed, but is yours?
With each of these tests, I'm looking mostly for hints of weakness.  I'm expecting pseudo-random noise, but how does my brain distinguish that from almost random noise?
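
Several of the checks in that list can be automated. Here is a minimal sketch in Python, assuming HMAC-SHA256 from the standard library stands in for whatever MAC you actually use. The helper names (encode_fields, make_tag, verify, flip_bit), the user and timestamp fields, and the 300-second window are all made up for illustration and are not from any system I've tested.

```python
import hashlib
import hmac
import secrets


def encode_fields(*fields: bytes) -> bytes:
    # Length-prefix each field so ("ab", "c") and ("a", "bc") encode differently.
    return b"".join(len(f).to_bytes(4, "big") + f for f in fields)


def make_tag(key: bytes, user: str, issued_at: int, payload: bytes) -> bytes:
    # Bind the source (user) and timeframe (issued_at) into the MAC'd data.
    message = encode_fields(user.encode(), str(issued_at).encode(), payload)
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, user: str, issued_at: int, payload: bytes,
           tag: bytes, now: int, max_age: int = 300) -> bool:
    expected = make_tag(key, user, issued_at, payload)
    # Constant-time comparison, then a freshness check to limit replays.
    if not hmac.compare_digest(expected, tag):
        return False
    return 0 <= now - issued_at <= max_age


def flip_bit(data: bytes, index: int) -> bytes:
    # Flip the lowest bit of one byte to simulate tampering.
    return data[:index] + bytes([data[index] ^ 0x01]) + data[index + 1:]


key = secrets.token_bytes(32)
payload = b"amount=100;to=acct-42"   # illustrative data
issued_at = 1_532_700_000            # illustrative Unix timestamp
tag = make_tag(key, "alice", issued_at, payload)

# Same plaintext, same key: HMAC is deterministic, so the tags match.
same_input_same_tag = tag == make_tag(key, "alice", issued_at, payload)

# A fresh, untouched message should verify.
accepted_fresh = verify(key, "alice", issued_at, payload, tag, now=issued_at + 10)

# Change the data at every position; each variant should be rejected.
tamper_rejected = all(
    not verify(key, "alice", issued_at, flip_bit(payload, i), tag, now=issued_at + 10)
    for i in range(len(payload))
)

# Replay for a different user, replay at a later date, and verify after a key rotation.
replay_other_user = verify(key, "bob", issued_at, payload, tag, now=issued_at + 10)
replay_late = verify(key, "alice", issued_at, payload, tag, now=issued_at + 3600)
after_rotation = verify(secrets.token_bytes(32), "alice", issued_at, payload, tag,
                        now=issued_at + 10)

print(same_input_same_tag, accepted_fresh, tamper_rejected)
print(replay_other_user, replay_late, after_rotation)
```

In this sketch the tag is identical every time for the same input, so replay protection has to come from what you put in the plaintext (here the user and the issue time), not from the MAC itself. If your scheme is supposed to produce a different value each time, the first check is the one that should fail.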

There are many times when you need to generate a random value that is also unique, but don't have the space to use a GUID.  To evaluate if a solution will be "unique enough", check out the Birthday problem Wikipedia page, and this table of probabilities in particular.  Find out how many possible values exist (9 numeric digits = 10^9 ~= 2^30).  On the table, compare that value as the hash space size against the number of times you'll be setting this value.  This will tell you if the algorithm you want to use is sufficient.  If you are making long-term IDs that can only be created once, you obviously want the probability of collision to be extremely low.  If you can recover from a collision by creating a new transaction fairly readily, you might not need as much assurance.  I've used this to help drive a decision to increase unique token size from 13 to 40 characters, guide switching from SQL auto-numbers to random digits to hide transaction volumes, and ensure internal transaction IDs are unique enough to guide troubleshooting and reporting.
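
If you'd rather compute the number than read it off the table, the standard birthday approximation p ≈ 1 - e^(-n(n-1)/(2N)) is only a few lines of Python. This is a rough sketch that assumes values are drawn uniformly at random; the function names are mine, and the 10,000 draws are just an example volume.

```python
import math


def collision_probability(space_size: int, draws: int) -> float:
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)).
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-draws * (draws - 1) / (2 * space_size))


def draws_for_probability(space_size: int, p: float) -> int:
    # Approximate number of draws before a collision has probability p:
    # n ~= sqrt(2N * ln(1 / (1 - p))).
    return math.ceil(math.sqrt(2 * space_size * math.log(1 / (1 - p))))


space = 10 ** 9  # 9 numeric digits, about 2^30 possible values

print(collision_probability(space, 10_000))  # chance of any collision in 10,000 IDs
print(draws_for_probability(space, 0.5))     # IDs needed for a coin-flip chance
```

With 9 digits, 10,000 random IDs already carry roughly a 5% chance of at least one collision, which is fine if you can retry and far too high for long-term IDs.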

Time and again, the past three years have taught me that cryptography must be easy for it to be used widely.  I've stayed with Signal for text messaging because it just works.  I can invite friends and not be embarrassed at its user interface.  It doesn't tick all the boxes (anonymity is an issue, since it is a centralized solution), but it has enough features to be useful and few shortcomings.  This is the key to widespread adoption of encryption for securing communications.  Since Snowden revealed the extent of the NSA's data collection capability, sites everywhere have switched on HTTPS through Let's Encrypt.  Learning more about each implementation of SSH and TLS in the course was both informative and daunting.  I was anxious to get HTTPS enabled without rehosting the site on my own.  In early 2018, Blogger added the ability to do just that through Let's Encrypt.  It requires zero configuration once I toggle it on.  I can't sing its praises enough.  The content of this blog isn't exactly revolutionary, but this little move toward a private and authentic web helps us all.

Dan Boneh's Cryptography course continues to inform my testing.  The core lesson still applies: "Never roll your own cryptography."  And the second is how fragile these constructs are.  Randomness is only random enough given the time constraints.  Secure is only secure enough for this defined application.  Every proof in the course is only as good as our understanding of the math, and every implementation is vulnerable at the hardware, software, and user layers.  In spite of this, it continues to work because we test it and prove it hasn't broken yet.  I'm looking forward to another three years of picking it apart.