A tale of lost entropy - Techiavellian (2024)

Recently, whilelooking ata JavaScript functionintended to generate a cryptographically-secure random IV to be used in AES-GCM, I noticed something interesting which I immediately suspected was not unique to this project. Sure enough, Matt, my awesome colleague, sent me a link to a how-to article describing the process of generating random values in Node.js that included the exact same quirk.

Here is their example (with minor edits so as not to call out the author of that how-to post too explicitly):

Do you notice anything fishy?

Perhaps you notice that len/2 and think that’s a little strange. This quirk is described thusly (I can’t help but quote here):

“In hexadecimal format a single byte generates two characters. So there is no point to generate the same number of bytes as number of output characters. We can only generate half of the bytes.”

What happens when you call crypto.randBytes(X)? You get X random bytes in a Buffer. You can easily represent these in hex string format with .toString(‘hex’), as in the example above. So for a call to crypto.randBytes(8).toString(‘hex’)you might get “71d75f6f286f2e96”.

Cool! That looks like 16characters though, and we only wanted 8… Here you might say to yourself, “Aha! I’ll just generate 8 bytes like in the example!”

And if you DID think that, I now have to bore you with some math.

Those familiar with hexadecimal will remember that we have 16 distinct hex characters: 0-9 and A-F. A byte is composed of 2 hex characters, or 8 binary bits. For instance, 0x0F in binary is 00001111 (15; binary padded for emphasis). 0xFF is 11111111 (255), and so on.

When you ask the crypto module to give you 8 random bytes, think of it like getting 8 numbers between 0 and 255. You can think of this as your “character set” for random number generation. These numbers can be represented in hexadecimal form like we saw earlier, or they can be represented in something like UTF-8. Here is what crypto.randBytes(8).toString(‘utf-8’) might look like:o������1

When you generate 8 random bytes, you have256^8 (1.84467441e19) possibilities.What if we convert the bytes to hex? Well, we know hex has 16 possible characters, and since we represent each random byte as 2 hex characters, 8 bytes turns into a 16-character hex string. We can see that there are the same number of possibilities in both cases: 16^16 =1.84467441e19 = 256^8.

Perhaps you can see where I’m going with this now. If we shave off those last 8 characters of the hex string, we’re now looking at 16^8 (=4294967296) possibilities, or the square root of 1.84467441e19. We’ve reduced the total possible entropy considerably.

To illustrate this more clearly, if we were representing the random bytes in binary, and we decided we only wanted 8 binary characters, that would give us only 256possibilities (2^8), or one actual byte of entropy.

You’re probably thinking:

It’s true, how much entropy you needdepends 100% on the context, and this whole discussion may be overkill for some situations, but if you take a look atNIST SP800-38D (Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode (GCM) and GMAC), it says the following:

The IVs in GCM must fulfill the following “uniqueness” requirement: The probability that the authenticated encryption function ever will be invoked with the same IV and the same key on two (or more) distinct sets of input data shall be no greater than 2^-32. Compliance with this requirement is crucial to the security of GCM. Across all instances of the authenticated encryption function with a given key, if even one IV is ever repeated, then the implementation may be vulnerable to …forgery attacks … In practice, this requirement is almost as important as the secrecy of the key.

Funnily enough, this is exactly how likely an 8-character hex string is to be repeated if you assume that the possibilities are handed out 100% randomly, since 2^32 = 16^8! While you might think that your IV composed of 8 random bytes, or 64 bits, would give you somebreathing room on this “uniqueness” requirement, your 8-character hex string does not (and you should be using 96-bit IVsanyway, but whatever). This paper, referenced in the NIST document above,explains severalattacks on GCM, and is a little more pointed about that “uniqueness” requirement:

… if for some reason a small number of IVs are repeated, it is usually expected that beyond this basic attack which reveal the corresponding messages, nothing too bad should happen. With an ordinary counter mode, this is indeed the case. However, with both versions of GCM, allowing a small number of repeated IVs leads to a serious leak where the authentication key is revealed.

To sum it up, the only thing the adversary cannot do is recover the main key K. However, the main goal of cryptography is to protect the security of data not keys.

Probably best to avoid tempting fate here.

Sadly, I’ve since found several examples of this misguided byte-reduction on Github, and it’s pretty much impossible to miss the how-to article referenced at the beginning of this postwhenGoogling on this topic. Again, context is key, but this is a cautionary tale againstmaking too many assumptions in a cryptographically sensitive context.

If you’re convinced that you should let those extra bytes be free, but you don’twant to pass around some arbitraryUTF-8 string in e.g. JSON, my suggestion would be to encode it with base64 to save a littlespace, or to save the full hex string.

If you enjoyed this post,reach out to me on Twitter (@ccneill)and let me know!

  • Category:Entropy, Programming, Security,

Leave a Reply

A tale of lost entropy - Techiavellian (2024)

References

Top Articles
Latest Posts
Article information

Author: Virgilio Hermann JD

Last Updated:

Views: 5801

Rating: 4 / 5 (61 voted)

Reviews: 92% of readers found this page helpful

Author information

Name: Virgilio Hermann JD

Birthday: 1997-12-21

Address: 6946 Schoen Cove, Sipesshire, MO 55944

Phone: +3763365785260

Job: Accounting Engineer

Hobby: Web surfing, Rafting, Dowsing, Stand-up comedy, Ghost hunting, Swimming, Amateur radio

Introduction: My name is Virgilio Hermann JD, I am a fine, gifted, beautiful, encouraging, kind, talented, zealous person who loves writing and wants to share my knowledge and understanding with you.