Integrity

Asymmetric cryptography, key distribution, hybrid cryptosystems, MACs, digital signatures, and certificates

Paul Krzyzanowski

February 10, 2024

Public key cryptography

Public key algorithms, also known as asymmetric ciphers, use one key for encryption and a separate, but related, key for decryption. One of these keys is kept private (known only to the creator) and is known as the private key. The corresponding key is generally made visible to others and is known as the public key.

Content encrypted with the private key can only be decrypted with the corresponding public key. This will be the basis for digital signatures. Anything that is encrypted with a public key can be decrypted only with the corresponding private key. This is the basis for authentication and covert communication.

Although public and private keys are related, because public keys are expected to be shared, it is crucial for security that there is no way to compute a private key from a public key. Public key cryptography is based on trapdoor functions. These are one-way functions: there is no known way to compute the inverse unless you have extra data: the other key.

One example of a trapdoor function is the difficulty of factoring. Suppose you are given the number 10085134437017 and are told that it is the product of two prime numbers. How would you find them? We don’t know of algorithms that would yield an answer, and you would have to perform an exhaustive search. However, if you are told that one of the factors is 3906467 then it becomes trivial to perform a division to find the other factor. For cryptographic security, these numbers would be hundreds of decimal digits long.
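This asymmetry is easy to see in code. The sketch below takes the numbers above at face value: with the hint, recovering the other factor is a single division, while finding a factor from scratch requires a long search that is hopeless at real key sizes.

```python
# Toy demonstration of the factoring trapdoor, using the numbers above.
n = 10085134437017          # stated to be a product of two primes
p = 3906467                 # the "extra information": one of the factors

# With the hint, recovering the other factor is one division.
q = n // p
assert n % p == 0 and p * q == n
print(q)                    # the other prime factor

# Without the hint, we are reduced to searching for a divisor. This works
# here only because n is tiny; for hundreds of digits it is infeasible.
def trial_division(n):
    d = 3
    while d * d <= n:
        if n % d == 0:
            return d
        d += 2
    return n

assert trial_division(n) == min(p, q)
```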

RSA public key cryptography

The RSA algorithm was the first public key encryption algorithm. Its security is based on the difficulty of finding the factors of the product of two large prime numbers. Unlike symmetric ciphers, RSA encryption is a matter of performing arithmetic on large numbers. Like symmetric ciphers, it works on a block of data at a time and treats that block as a number. Plaintext is converted to ciphertext by the formula:

c = m^e mod n

Where m is a block of plaintext, e is the encryption key, and n is an agreed-upon modulus that is the product of two primes. To decrypt the ciphertext, you need the decryption key, d:

m = c^d mod n

Given the ciphertext c, e, and n, there is no efficient way to compute the inverse to obtain m. Should an attacker find a way to factor n into its two prime factors, however, the attacker would be able to reconstruct the encryption and decryption keys, e and d.

The math behind RSA - an example

To demystify RSA, here’s the math behind it. We’ll do an example with very small numbers. The results will be completely insecure cryptographically but will more concisely illustrate how the math works.

(1) Select two distinct prime numbers \(p\) and \(q\). Typical numbers are over 300 decimal digits long for a 2048-bit key. Each number is chosen by starting at a random point and searching for the first prime number. The algorithm will typically test to see if the number is divisible by the first few hundred primes and then apply several iterations of the Rabin-Miller Primality Test to get a high confidence that the number really is prime.

  • Let’s choose \(p = 3\) and \(q = 11\).

(2) We multiply the two primes together to get a value \(n = pq\), which will be used as the modulus for both the public and private keys.

  • So, \(n = 3 \times 11 = 33\).

(3) Calculate a value called the totient function \(\phi(n)\), which is \((p-1)(q-1)\)

  • Here, \(\phi(33) = (3-1) \times (11-1) = 2 \times 10 = 20\).

(4) Choose an integer value for the public exponent, \(e\), such that \(1 < e < \phi(n)\), and \(e\) and \(\phi(n)\) are relatively prime, meaning they share no factors other than 1 (i.e., the greatest common divisor of e and phi is one).

  • We choose \(e = 7\).

(5) The last step is to find the secret exponent. This requires applying the extended Euclidean algorithm to find a value \(d\) such that \(de \equiv 1 \mod \phi(n)\). In other words, we need \(d\) to be the modular multiplicative inverse of \(e\) modulo \(\phi(n)\).

  • We find a \(d\) such that \(7d \equiv 1 \mod 20\).
  • The calculated value of \(d\) is 3.

(6) Now we have a public-private key pair. We can discard \(\phi(n)\), \(p\), and \(q\). Each key is the exponent and the shared modulus.

  • Public Key: \((e=7, n=33)\)
  • Private Key: \((d=3, n=33)\)

Because our modulus is tiny (33 takes up barely more than 5 bits), we can only show examples of encrypting and decrypting tiny values. Here are a few examples:

Encrypt 18 with the public key: c = m^e mod n = 18^7 mod 33 = 6

Decrypt 6 with the private key: m = c^d mod n = 6^3 mod 33 = 18

If we encrypt something with the private key, we need to decrypt it with the public key:

Encrypt 29 with the private key: c = m^d mod n = 29^3 mod 33 = 2

Decrypt 2 with the public key: m = c^e mod n = 2^7 mod 33 = 29
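The whole example can be verified in a few lines of Python; `pow(e, -1, phi)` (Python 3.8+) computes the modular inverse that the extended Euclidean algorithm provides:

```python
# Reproduce the toy RSA example (p=3, q=11).
p, q = 3, 11
n = p * q                 # modulus: 33
phi = (p - 1) * (q - 1)   # totient: 20
e = 7                     # public exponent, coprime with phi
d = pow(e, -1, phi)       # modular multiplicative inverse of e mod phi
assert d == 3

# Encrypt with the public key, decrypt with the private key:
assert pow(18, e, n) == 6
assert pow(6, d, n) == 18

# Encrypt with the private key, decrypt with the public key
# (the basis for digital signatures):
assert pow(29, d, n) == 2
assert pow(2, e, n) == 29
```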

Elliptic curve cryptography (ECC)

About ten years after RSA was created, mathematicians discovered another trapdoor function that could be used for public key cryptography, and roughly 20 years after that, around 2004, elliptic curve cryptography came into widespread use. Elliptic curve cryptography (ECC) is an alternative to RSA. It is based on finding points along a prescribed elliptic curve. Elliptic curves are a family of equations that take the form:

y^2 = x^3 + ax + b (mod p)

Contrary to their name, elliptic curves have nothing to do with ellipses or conic sections and look like bumpy lines. With elliptic curves, multiplying a point on a given elliptic curve by a number will produce another point on the curve.

All parties need to agree on a prime number that will serve as the modulus for the arithmetic, a specific curve equation, and a base point on the curve (called the generator). For the cryptography to be secure, the curve and generator must be carefully chosen. The U.S. NIST recommends fifteen specific elliptic curves of varying security levels (see FIPS 186–5).

The algorithm then selects a random private key, k, which can be any integer within a specific range. It then computes the public key by performing a point multiplication of the key and the generator: P = k * G.

However, given the resulting point P and the generator G, there is no known efficient way to determine what number k was used.

The security in ECC rests not on our inability to factor numbers, as in RSA, but on our inability to compute discrete logarithms, in this case over the group of points on an elliptic curve.

The RSA algorithm is still widely used but ECC has significant advantages:

  • ECC can use far shorter keys for the same degree of security. Security comparable to 256-bit AES encryption requires a 512-bit ECC key but a 15,360-bit RSA key.

  • ECC requires less CPU consumption and uses less memory than RSA. It is significantly faster for encryption and decryption than RSA.

  • Generating ECC keys is much faster than generating RSA keys. Computing the public key still requires performing a point multiplication, so key generation is still slower than for symmetric algorithms, where you need only a single key, which is a random number.

On the downside, ECC is 27 years younger than RSA, so it took time for people to become confident that it had been well-vetted and that a proper set of cryptographically hard curves had been selected. It also took time for well-tested libraries to be implemented and for adoption to spread across enough platforms to enable interoperability.

As a standard, ECC was also tainted because the NSA inserted a weakness into an ECC-based random number generator (Dual_EC_DRBG) that effectively created a backdoor for decrypting content. This has been fixed, of course. There was also mistrust of the government and of NIST’s choice of the 15 approved elliptic curves and generators. After years of study, ECC is considered the preferred choice over RSA for virtually all applications.

Quantum computing

Quantum computers are markedly different from conventional computers. Conventional computers store and process information that is represented in bits, with each bit having a distinct value of 0 or 1. Quantum computers use the principles of quantum mechanics, which include superposition and entanglement. Instead of working with bits, quantum computers operate on qubits, which can hold values of “0” and “1” simultaneously via superposition. The superpositions of qubits can be entangled with other objects so that their final outcomes will be mathematically related. A single operation can be carried out on 2^n values simultaneously, where n is the number of qubits in the computer.

Some problems can be solved exponentially faster with quantum computers than with conventional computers. In 1994, Peter Shor, a mathematician at MIT, studied the kinds of mathematical problems that could be attacked with a hypothetical quantum computer. He assumed that the quantum state could be maintained as long as needed (current quantum computers are still too noisy to do this) and analyzed what could be possible. Unfortunately for the cryptography community, such a quantum computer would be able to factor huge numbers exponentially faster than current computers. Shor’s algorithm shows that a suitable quantum computer will be able to find the prime factors of large integers and compute discrete logarithms far more efficiently than is currently possible. Hence, Shor’s algorithm will be able to crack public-key-based systems such as RSA, Elliptic Curve Cryptography, and Diffie-Hellman key exchange.

So far, quantum computers are very much in their infancy, and it is not clear when – or if – large-scale quantum computers that are capable of solving useful problems will be built. The goal is to achieve quantum supremacy: creating a sufficiently powerful quantum computer that can solve problems that current computers cannot solve in any appreciable amount of time. IBM, Google, Amazon, Microsoft, Intel, D-Wave Systems, and others are all working on trying to create useful quantum computers. It is unlikely that they will be built in the next several years, but we do expect that they will be built eventually. Creating one is also a type of arms race between China and the U.S., as possessing one may enable the decryption of current and past content that has been acquired from sniffing network traffic. In 2019, Google claimed to have achieved quantum supremacy, but IBM countered the claim by stating that the problem that was solved could be tackled in 2.5 days on a supercomputer. In 2020, a team at the University of Science and Technology of China (USTC) reached quantum supremacy with a 76-photon photonic quantum computer. The group followed this work with two additional quantum computers, including a 56-qubit version.

In 2016, the NSA called for a migration to “post-quantum cryptographic algorithms”. The goal is to find useful trapdoor functions that do not rely on multiplying large primes, computing exponents, or any other mechanisms that can be attacked by quantum computation. Four quantum-resistant algorithms were selected in NIST’s competition, draft standards were published in 2023, and the final post-quantum cryptographic standards are expected in 2024. Three of the chosen algorithms are based on lattices and one is based on hash functions. If you are interested in these, you can read NIST’s report.

Symmetric cryptosystems, such as AES, are not particularly vulnerable to quantum computing since they rely on flipping and permuting bits rather than applying mathematical functions to the data. The best potential attacks come via Grover’s algorithm, which yields only a quadratic speedup (\(\sqrt{n}\) vs. \(n\) operations) rather than an exponential speedup in key searches. This halves the effective length of a key: for instance, a 128-bit key will have the strength of a 64-bit key on a conventional computer. It is easy enough to use a sufficiently long key (256-bit AES keys are currently recommended) so that quantum computing poses no serious threat to symmetric algorithms.

Secure communication

Symmetric cryptography

Communicating securely with symmetric cryptography is easy. All communicating parties must share the same secret key. Plaintext is encrypted with the secret key to create ciphertext and then transmitted or stored. It can be decrypted by anyone who has the secret key.

Asymmetric cryptography

Communicating securely with asymmetric cryptography is a bit different. Anything encrypted with one key can be decrypted only by the other related key. For Alice to encrypt a message for Bob, she encrypts it with Bob’s public key. Only Bob has the corresponding key that can decrypt the message: Bob’s private key.

Key distribution and hybrid cryptography

The biggest problem in secure communication for thousands of years has been key distribution. How do you provide all trusted parties with keys so that they can encrypt their communications? In earlier times, this could only be solved by sharing keys, codebooks, or cipher devices in person. In later eras, keys would be shared in person or via some trusted channel, such as a telephone call (assuming that provided sufficient security and wiretapping wasn’t a concern). With computer networking, the problem became harder: how can you communicate securely with somebody you’ve never met before and perhaps may not even know?

For Alice and Bob to communicate, they need to share a secret key that no adversaries can get. However, Alice cannot send the key to Bob since it would be visible to adversaries. She cannot encrypt it because Alice and Bob do not share a key yet.

Asymmetric (public key) cryptography alleviates the problem of sending a key over a non-secure communication channel. However, public key cryptography is never used to encrypt large amounts of data for several reasons:

  1. Asymmetric algorithms, especially RSA, are considerably slower than symmetric cryptography. AES, for example, is approximately 1,500 times faster for decryption than RSA and 40 times faster for encryption. AES is also much faster than ECC. In one case you’re flipping and shifting bits. In the other case you’re taking exponents of enormously huge numbers.

  2. Key generation is also considerably slower with RSA or ECC than it is with symmetric algorithms, where there is just a single key and that key is just a random number rather than a set of carefully chosen numbers with specific properties. ECC key generation is much faster than RSA since the private key can be an arbitrary random number within a specific range, but it still requires computing the public key.

  3. Because public key cryptography is based on arithmetic properties, certain arithmetic relationships between plaintext may also be present between the resulting ciphertext, which can give an adversary insight into the content of messages.

  4. For a message to be sent securely, it has to be encrypted with the recipient’s public key. Only the recipient will be able to decrypt it with the corresponding private key. If attackers expect some predictable content, they can encrypt all the content they expect and then compare it with messages from the sender.

Because of these factors, asymmetric algorithms, such as RSA and ECC, are not used to encrypt content. Instead, we use a technique called hybrid cryptography, where a public key algorithm is used to encrypt a randomly generated key that will encrypt the message with a symmetric algorithm. This randomly generated key is called a session key since it is generally used for one communication session and then discarded.

For Alice to send a key to Bob:

  1. Alice generates a random session key.
  2. She encrypts it with Bob’s public key & sends it to Bob.
  3. Bob decrypts the message using his private key and now has the session key.

Bob is the only one who has Bob’s private key and is thus the only one who can decrypt that message and extract the session key.
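As a rough sketch of these steps, the following reuses the toy RSA key pair from the earlier example to wrap a session key, with a SHA-256-derived keystream standing in for a real symmetric cipher such as AES. Both are stand-ins chosen for illustration; real systems use full-size RSA or ECC keys and a real cipher.

```python
import hashlib
import secrets

# Bob's toy RSA key pair from the earlier example.
e, d, n = 7, 3, 33

def stream_encrypt(key: bytes, data: bytes) -> bytes:
    # XOR the data with a keystream derived from the session key.
    # (A stand-in for a real symmetric cipher such as AES.)
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

# 1. Alice generates a random session key (tiny, to fit the toy modulus).
session_key = secrets.randbelow(n - 2) + 2

# 2. She encrypts it with Bob's public key...
wrapped = pow(session_key, e, n)

# 3. ...and encrypts the message with the session key.
ciphertext = stream_encrypt(session_key.to_bytes(1, "big"), b"meet at noon")

# Bob unwraps the session key with his private key and decrypts.
recovered = pow(wrapped, d, n)
assert recovered == session_key
assert stream_encrypt(recovered.to_bytes(1, "big"), ciphertext) == b"meet at noon"
```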

Diffie-Hellman key exchange

The Diffie-Hellman key exchange algorithm allows two parties to establish a common key without disclosing any information that would allow any other party to compute the same key. Each party generates a private key and a public key. Despite their name, these are not encryption keys; they are just numbers. Diffie-Hellman is a form of a public key algorithm but does not implement encryption. It enables Alice to compute a common key using her private key and Bob’s public key. Similarly, Bob can compute the same common key by using his private key and Alice’s public key.

Diffie-Hellman uses the one-way function a^b mod c. Its one-wayness is due to our inability to compute the inverse: a discrete logarithm. Anyone may see Alice and Bob’s public keys but will be unable to compute their common key. Although Diffie-Hellman is not a public key encryption algorithm, it behaves like one in the sense that it allows us to exchange keys without having to use a trusted third party.

Here’s how it works:

Suppose Alice and Bob want to communicate:

  1. All arithmetic is performed in a field of integers modulo some large prime number. Alice and Bob agree on a large prime number p and some number α < p.
  2. Alice generates a private key a, which is simply a random number, and a public key, which is derived from the private key: A = α^a mod p
  3. Bob generates a private key b, which is also a random number, and a public key, which is derived from his private key: B = α^b mod p
  4. They each send their public key to each other.
  5. To compute a common key, one simply takes the remote user’s public key to the power of one’s private key mod p:
    • Alice computes the common key as C = B^a mod p (Bob’s public key to the power of Alice’s private key)
    • Bob computes the common key as C = A^b mod p (Alice’s public key to the power of Bob’s private key)

The magic of the algorithm is that there is no feasible way for an intruder who only sees the public keys, A and B, to be able to compute the common key C.
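The exchange above can be sketched in a few lines of Python. The 89-bit Mersenne prime used here keeps the numbers readable but is far too small for real use:

```python
import secrets

# Diffie-Hellman key exchange with a deliberately small prime.
p = 2 ** 89 - 1        # a Mersenne prime; real systems use 2048+ bit primes
alpha = 3              # agreed-upon base

a = secrets.randbelow(p - 2) + 1   # Alice's private key: a random number
b = secrets.randbelow(p - 2) + 1   # Bob's private key: another random number

A = pow(alpha, a, p)   # Alice's public key
B = pow(alpha, b, p)   # Bob's public key

# Each side raises the other's public key to its own private key mod p:
assert pow(B, a, p) == pow(A, b, p)   # both arrive at the same common key
```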

Forward secrecy

If an attacker steals, for example, Bob’s private key, he will be able to go through old messages or communication sessions and decrypt old session keys (the start of every interaction with Bob contained a session key encrypted with his public key). Forward secrecy, also called perfect forward secrecy, is the use of keys and key exchange protocols where the compromise of a key does not compromise past session keys. There is no secret that one can steal that will allow the attacker to decrypt multiple past messages. Note that this is of value for communication sessions but not stored encrypted documents (such as email). You don’t want an attacker to gain any information from a communication session even if a user’s key is compromised. However, the user needs to be able to decrypt her own documents, so they need to rely on a long-term key.

Achieving forward secrecy requires single-use (ephemeral) public keys. Next time Alice and Bob want to communicate, they will generate a new set of keys and compute a new common key. At no time do we rely on long-term keys, such as Alice’s secret key or an RSA private key. Encrypting a session key with a long-term key, such as Bob’s public key, will not achieve forward secrecy. If an attacker ever finds Bob’s private key, she will be able to extract the session key.

Diffie-Hellman enables forward secrecy. Alice and Bob can each generate a key pair and send their public key to each other. They can then compute a common key that nobody else will know and use that to communicate.

Diffie-Hellman is particularly good for achieving forward secrecy because it is extremely efficient to create new key pairs on the fly. RSA or ECC keys can be used as well, but their key generation is less efficient (tremendously less efficient in the case of RSA). Because of this, RSA and ECC keys tend to be used mainly as long-term keys (e.g., for authentication) and Diffie-Hellman is the dominant technique for creating ephemeral keys that will be used to create single-use (also ephemeral) symmetric session keys.

A newer and more widely used variation of Diffie-Hellman is Elliptic Curve Diffie-Hellman (ECDH). It uses the same elliptic curves as Elliptic Curve Cryptography (ECC). The mechanism is the same as the classic Diffie-Hellman algorithm; only the math is different. ECDH uses elliptic curve point multiplication instead of modular exponentiation, although both systems base their security on our inability to compute discrete logarithms. With ECDH:

  1. Alice and Bob agree on an elliptic curve and a starting point G (the generator, mentioned earlier).
  2. Alice creates a random private key a and a public key A = a * G.
  3. Bob creates a random private key b and a public key B = b * G.
  4. Alice sends her public key to Bob and Bob sends his public key to Alice.
  5. Alice computes a common key C = a * B.
  6. Bob computes a common key C = b * A.

The value C becomes the same shared secret because it relies on the property that (a * G) * b = (b * G) * a, so a * B = b * A = a * b * G. Hence, they have both computed a * b * G but an attacker who only sees A and B will not be able to do so. Elliptic Curve Diffie-Hellman has been shown to be more secure and considerably more efficient computationally than traditional Diffie-Hellman.
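The ECDH steps can be sketched over a tiny textbook curve. The parameters here (y^2 = x^3 + 2x + 2 mod 17 with generator (5, 1)) are illustrative only and offer no security:

```python
# Toy ECDH over the curve y^2 = x^3 + 2x + 2 (mod 17).
P_MOD, A_COEF = 17, 2
G = (5, 1)   # agreed-upon generator point on the curve

def point_add(P, Q):
    # Standard affine point addition; None represents the point at infinity.
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                       # P + (-P) = point at infinity
    if P == Q:
        s = (3 * x1 * x1 + A_COEF) * pow(2 * y1, -1, P_MOD)   # tangent slope
    else:
        s = (y2 - y1) * pow(x2 - x1, -1, P_MOD)               # chord slope
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)

def scalar_mult(k, P):
    # Double-and-add point multiplication: computes k * P.
    R = None
    while k:
        if k & 1:
            R = point_add(R, P)
        P = point_add(P, P)
        k >>= 1
    return R

a, b = 3, 9                  # Alice's and Bob's random private keys
A_pub = scalar_mult(a, G)    # Alice's public key: a * G
B_pub = scalar_mult(b, G)    # Bob's public key: b * G

# a * (b * G) == b * (a * G) == a * b * G: the shared secret.
assert scalar_mult(a, B_pub) == scalar_mult(b, A_pub)
```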

Message Integrity

One-way functions

A one-way function is one that can be computed relatively easily in one direction, but there is no known way of computing the inverse function. One-way functions are crucial in a number of cryptographic algorithms, including digital signatures, Diffie-Hellman key exchange, and both RSA and elliptic curve public key cryptography. For Diffie-Hellman and public key cryptography, they ensure that someone cannot generate the corresponding private key when presented with a public key. Key exchange and asymmetric cryptography algorithms rely on a special form of one-way function called a trapdoor function. This is a function whose inverse is computable if you are provided with extra information, such as a private key that corresponds to the public key that was used to generate the data.

Hash functions

A particularly useful form of a one-way function is the cryptographic hash function. This is a one-way function whose output is always a fixed number of bits for any input. Normal hash functions are commonly used in programming to construct hash tables, which provide O(1) lookups of keys in a symbol table. Cryptographic hash functions produce far longer results than those used for hash tables and have more stringent requirements. Common lengths for cryptographic hashes are 224, 256, 384, or 512 bits.

Good cryptographic hash functions (e.g., SHA-2, SHA-3) have several properties:

  1. Like all hash functions, take arbitrary-length input and produce fixed-length output

  2. Also like all hash functions, they are deterministic; they produce the same result each time when given identical input.

  3. They exhibit pre-image resistance, or hiding. Given a hash H, it should not be feasible to find a message M where H=hash(M).

  4. The output of a hash function should not give any information about any of the input. For example, changing a byte in the message should not cause any predictable change in the hash value.

  5. They are collision resistant. While hash collisions can exist, it is not feasible to find any two different messages that hash to the same value. Similarly, it is not feasible to modify the plaintext without changing its resultant hash.

  6. They should be relatively efficient to compute. We would like to use hash functions as message integrity checks and generate them for each message without incurring significant overhead.
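Determinism and the unpredictability of output changes are easy to observe with Python's hashlib; the two messages here are arbitrary:

```python
import hashlib

# Hashing is deterministic, yet changing the input changes the
# digest unpredictably (the avalanche effect).
m1 = b"Pay Alice $100"
m2 = b"Pay Alice $500"

h1 = hashlib.sha256(m1).hexdigest()
h2 = hashlib.sha256(m2).hexdigest()

assert h1 == hashlib.sha256(m1).hexdigest()   # same input, same result
assert h1 != h2                               # any change alters the hash

# Roughly half of the 256 output bits differ between the two digests:
diff = bin(int(h1, 16) ^ int(h2, 16)).count("1")
print(diff, "of 256 bits differ")
```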

Note that collisions can, in theory, exist because the number of possible hashes is smaller than the number of possible messages; see the pigeonhole principle. However, we’re talking about big numbers and incredibly infinitesimal odds. With a 256-bit hash, the chance that another message has the same hash as your message is 1 in 2^256, or roughly 1 in 1.2 × 10^77. In other words, it is all but certain that this will never happen!

Because of these properties, we have extremely high assurance that a message that is modified in any manner would no longer hash to the same value as the original.

The cryptographic hash function is the basis for message authentication codes and digital signatures.

The holy grail for an attacker is to be able to construct a meaningful message that hashes to the same value as another message. That would allow the attacker to substitute a new message for some original one (for example, redirecting a money transfer). Searching for a collision with a known message (a second pre-image) is much harder than searching for any two messages that produce the same hash. The birthday paradox tells us that the search for a collision between any two messages (rather than a match for one specific message) requires approximately the square root of the work of searching for a collision with a specific message. This means that the strength of a hash function against a brute-force collision attack is approximately half the number of bits of the hash: a 256-bit hash function has a strength of approximately 128 bits.
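The birthday effect can be demonstrated by truncating a hash to 24 bits: a brute-force search for one specific 24-bit value would take about 2^24 attempts, but a collision between any two messages turns up after only about 2^12:

```python
import hashlib

def truncated_hash(msg: bytes) -> bytes:
    # Keep only the first 3 bytes (24 bits) of a SHA-256 digest.
    return hashlib.sha256(msg).digest()[:3]

# Hash distinct messages until any two of them collide.
seen = {}
i = 0
while True:
    msg = b"message-%d" % i
    h = truncated_hash(msg)
    if h in seen:
        break            # found two different messages with the same hash
    seen[h] = msg
    i += 1

print("collision after", i + 1, "hashes:", seen[h], "and", msg)
assert truncated_hash(seen[h]) == truncated_hash(msg)
assert seen[h] != msg
```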

Popular hash functions include SHA-1 (160 bits), SHA-2 (commonly 256 and 512 bits), and SHA-3 (256 and 512 bits). Google demonstrated the ability to find a collision in SHA-1 in February 2017 but had to run over 9 quintillion SHA-1 computations to perform this attack. At that point, SHA-1 was already being phased out since weaknesses had been discovered. Although there is no advantage to using it over SHA-2, for almost any practical purpose it is still a secure checksum and is still (as of 2024) used in Git and in BitTorrent.

Message Authentication Codes (MACs)

A cryptographic hash helps us ensure message integrity: it serves as a checksum that allows us to determine if a message has been modified. If the message is modified, it no longer hashes to the same value as before. However, if an attacker modifies a message, she may be able to modify the hash value as well. To prevent this, we need a hash that relies on a key for validation. This is a message authentication code, or MAC. A MAC is also called a keyed hash. MACs can be hash-based or block cipher-based:

Hash-based MAC (HMAC)

A hash-based MAC is a specific method, defined in RFC 2104, for converting regular hash functions into MACs by using a cryptographic hash function, such as SHA-256 (the 256-bit version of a SHA-2 hash), to hash the message and the key. Anyone who does not know the key will not be able to recreate the hash. It is computed as:

HMAC(m, k) = H((k ⊕ opad) || H((k ⊕ ipad) || m))

where
H = cryptographic hash function
opad = outer padding 0x5c5c5c5c … (01011100…)
ipad = inner padding 0x36363636… (00110110…)
k = secret key
m = message
⊕ = XOR
|| = concatenation

Note the extra hash. A naive keyed hash would simply be hash(k || m). The HMAC standard devised this extra hashing and padding to strengthen the construction against weaknesses in the underlying hash functions (such as length-extension attacks). Logically, you can still think of an HMAC as simply the hash of the key and message.
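The construction can be checked against Python's hmac module. This sketch assumes the key is no longer than SHA-256's 64-byte block size, in which case RFC 2104 says to zero-pad it:

```python
import hashlib
import hmac

key = b"secret key"
msg = b"message to authenticate"

BLOCK = 64                                     # SHA-256 block size in bytes
k = key.ljust(BLOCK, b"\x00")                  # zero-pad key to block size
ipad = bytes(b ^ 0x36 for b in k)              # k XOR inner padding
opad = bytes(b ^ 0x5c for b in k)              # k XOR outer padding

# H((k XOR opad) || H((k XOR ipad) || m))
inner = hashlib.sha256(ipad + msg).digest()
mac = hashlib.sha256(opad + inner).hexdigest()

# The hand-built value matches the standard library's HMAC.
assert mac == hmac.new(key, msg, hashlib.sha256).hexdigest()
```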

Block cipher-based MAC (CBC-MAC)

Recall that cipher block chaining assures us that every encrypted block is a function of all previous blocks. CBC-MAC uses a zero initialization vector and runs through a cipher block chained encryption, discarding all output blocks except for the last one, which becomes the MAC. Any changes to the message will be propagated to that final block and the same encryption cannot be performed by someone without the key. Note that a CBC-MAC still produces a fixed-length result (the block size of the cipher) and has all the required properties of a hash function thanks to the confusion and diffusion provided by the cipher and the cipher block chaining.

Digital signatures

Message authentication codes rely on a shared key. Anybody who possesses the key can modify and re-sign a message. There is no assurance that the action was done by the author of the message. Digital signatures have stronger properties than MACs:

  1. Only you can sign a message but anybody should be able to validate it.
  2. You cannot copy the signature from one message and have it be valid on another message.
  3. An adversary cannot forge a signature, even after inspecting an arbitrary number of signed messages.

Digital signatures require three operations:

  1. Key generation: {private_key, verification_key } := gen_keys(keysize)
  2. Signing: signature := sign(message, private_key)
  3. Validation: isvalid := verify(message, signature, verification_key)

Since we trust hashes to be collision-free, it makes sense to apply the signature to the hash of a message instead of the message itself. This ensures that the signature will be small and of fixed size, easy to embed in hash pointers and other structures, and that it creates minimal transmission or storage overhead for verification. Signing the hash also leaves the message itself unencrypted and unaltered, and it avoids the weaknesses of using public key algorithms on large amounts of content.
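A minimal sketch of sign-and-verify, using the toy RSA key pair from the earlier example (e=7, d=3, n=33). Because the modulus is only 6 bits, a small integer stands in for the message hash; real signatures apply this to a padded full-size hash:

```python
e, d, n = 7, 3, 33   # toy RSA key pair from the earlier example

def sign(m: int) -> int:
    # "Encrypt" the (stand-in) hash with the signer's private key.
    return pow(m, d, n)

def verify(m: int, sig: int) -> bool:
    # "Decrypt" the signature with the public key and compare.
    return pow(sig, e, n) == m

sig = sign(18)
assert sig == 24
assert verify(18, sig)           # genuine message validates
assert not verify(29, sig)       # altered message does not
```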

There are several commonly used digital signature algorithms:

DSA, the Digital Signature Algorithm
A NIST standard that generates key pairs whose security rests on the difficulty of computing discrete logarithms.
ECDSA, the Elliptic Curve Digital Signature Algorithm
A variant of DSA that uses elliptic curve cryptography.
EdDSA, the Edwards-curve Digital Signature Algorithm
Another variant, which uses twisted Edwards curves, another family of elliptic curves.
Public key cryptographic algorithms
RSA or Elliptic Curve Cryptography applied to message hashes.

All these algorithms generate public and private key pairs. The first three are not general-purpose encryption algorithms but are designed solely for digital signatures. There are others as well, but these are the most widely used.

We saw how public key cryptography can be used to encrypt messages: Alice encrypts a message using Bob’s public key to ensure that only Bob could decrypt it with his private key. We can use the public key backward: Alice can encrypt a message using her private key. Anyone can decrypt the message using her public key but, in doing so, would know that the message was encrypted by Alice.

A digital signature can be constructed by simply encrypting the hash of a message with the creator’s (signer’s) private key. Alternatively, digital signature algorithms have been created that apply a similar principle: hashing combined with trapdoor functions so that you would use a dedicated set of public/private keys to create and verify the signature. Anyone who has the message signer’s public key can decrypt the hash and thus validate the hash against the message. Other parties cannot recreate the signature.

Note that, with a MAC, the recipient or anyone in possession of the shared key can create the same MAC. With a digital signature, the signature can only be created by the owner of the private key. Unlike MACs, digital signatures provide non-repudiation – proof of identity. Alice cannot claim that she did not create a signature because nobody but Alice has her private key. Also unlike MACs, anyone can validate a signature since public keys are generally freely distributed. As with MACs, digital signatures also provide proof of integrity: assurance that the original message has not been modified.

Covert and authenticated messaging

We ignored the encryption of the message in the preceding discussion; our interest was assuring integrity. However, there are times when we may want to keep the message secret and validate that it has not been modified. Doing this involves sending a signature of the message along with an encrypted version of the message.

Alice can send a signed and encrypted message to Bob by using hybrid cryptography. She will:

  1. Create a signature of the message. This is a hash of the message encrypted with her private key.

  2. Create a session key for encrypting the message. This is a throw-away key that will not be needed beyond the communication session.

  3. Encrypt the message using the session key. She will use a fast symmetric algorithm to encrypt this message.

  4. Package up the session key for Bob: she encrypts it with Bob’s public key. Since only Bob has the corresponding private key, only Bob will be able to decrypt the session key.

  5. She sends Bob the encrypted message, encrypted session key, and signature.
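The five steps above can be sketched as follows. This is a toy: the RSA moduli are tiny, and a SHA-256-derived XOR keystream stands in for a fast symmetric cipher; a real implementation would use RSA-OAEP or Diffie-Hellman for key transport, AES-GCM for bulk encryption, and a vetted signature scheme.

```python
import hashlib, secrets

def rsa_key(p, q):
    # Toy RSA key generation -- tiny primes for illustration only.
    n, e = p * q, 65537
    return n, e, pow(e, -1, (p - 1) * (q - 1))

nA, eA, dA = rsa_key(1000003, 1000033)   # Alice's signing key pair
nB, eB, dB = rsa_key(1000037, 1000039)   # Bob's encryption key pair

def h(msg, n):
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

def stream_xor(key, data):
    # Toy stream cipher: XOR with a SHA-256-derived keystream.
    out, ctr = bytearray(), 0
    while len(out) < len(data):
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return bytes(b ^ k for b, k in zip(data, out))

msg = b"meet at noon"
sig = pow(h(msg, nA), dA, nA)                     # 1. sign with Alice's private key
session_key = secrets.token_bytes(4)              # 2. throw-away session key
ciphertext = stream_xor(session_key, msg)         # 3. symmetric encryption
wrapped = pow(int.from_bytes(session_key, "big"), eB, nB)  # 4. wrap key for Bob
# 5. Alice sends (ciphertext, wrapped, sig).

# Bob unwraps the session key with his private key, decrypts, and verifies.
key = pow(wrapped, dB, nB).to_bytes(4, "big")
plaintext = stream_xor(key, ciphertext)
assert plaintext == msg
assert pow(sig, eA, nA) == h(plaintext, nA)       # signature validates
```

Note the key separation: Alice's key pair is used only for signing, and Bob's only for key transport, mirroring the separate keys TLS uses for authentication and encryption.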

We will later see how, in communication frameworks such as Transport Layer Security (TLS), we will use separate keys for authenticating both sides, encrypting the content, and authenticating the content. Moreover, if a secure communication channel has been established, there is no compelling reason to use digital signatures since MACs can be computed more efficiently and a temporary shared key can be easily established for the session and used for the MAC.

Identities

A signature verification key (e.g., a public key) can be treated as an identity. You possess the corresponding private key and therefore only you can create valid signatures that can be verified with the public key. This identity is anonymous; it is just a bunch of bits. There is nothing that identifies you as the holder of the key. You can simply assert your identity by being the sole person who can generate valid signatures and decrypt content encrypted with the corresponding public key.

Since you can generate an arbitrary number of key pairs, you can create a new identity at any time and create as many different identities as you want. When you no longer need an identity, you can discard your private key for that corresponding public key. However, as long as you have that private key, you can validate that you are the originator of any messages signed with the key and you are the only one who can decrypt content encrypted with that public key.

Identity binding: digital certificates

While public keys provide a mechanism for asserting integrity via digital signatures, they are themselves anonymous. Alice may have Bob’s public key but has no assurance that the key really belongs to Bob and was not planted by an adversary. Some form of identity binding of the public key must be implemented so that Alice knows she really has Bob’s public key instead of someone else’s.

X.509 digital certificates provide a way to do this. A certificate is a data structure that contains user information (called a distinguished name) and the user’s public key. This data structure also contains a signature of the certification authority. The signature is created by taking a hash of the rest of the data in the structure and encrypting it with the private key of the certification authority. The certification authority (CA) is responsible for setting policies on how they validate the identity of the person who presents the public key for encapsulation in a certificate.

To validate a certificate, you would hash all the certificate data except for the signature. Then you would decrypt the signature using the public key of the issuer. If the two values match, then you know that the certificate data has not been modified since it was signed. The challenge is how to get the public key of the issuer. Public keys are stored in certificates, so the issuer would have a certificate containing its public key. This certificate can be signed by yet another issuer. This process is called certificate chaining. For example, Alice can have a certificate issued by the Rutgers CS Department. The Rutgers CS Department’s certificate may be issued by Rutgers University. Rutgers University’s certificate could be issued by the State of New Jersey Certification Authority, and so on. At the very top level, we will have a certificate that is not signed by any higher-level certification authority. A certification authority that is not underneath any other CA is called a root CA.

In practice, this type of chaining is rarely used. More commonly, there are hundreds of autonomous certification authorities acting as root CAs that issue certificates to companies, users, and services. The certificates for many of the trusted root CAs are preloaded into operating systems or, in some cases, browsers. See here for Microsoft’s trusted root certificate participants and here for Apple’s trusted root certificates.
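Chain validation can be sketched with a simplified certificate structure and toy RSA keys (tiny primes, for illustration only; real certificates carry many more fields). Each certificate's signature is checked with its issuer's public key, walking up to a self-signed root that is trusted out of band.

```python
import hashlib

def rsa_key(p, q):
    # Toy RSA key -- tiny primes for illustration only.
    n, e = p * q, 65537
    return {"n": n, "e": e, "d": pow(e, -1, (p - 1) * (q - 1))}

root_key  = rsa_key(1000003, 1000033)   # root CA (trusted out of band)
inter_key = rsa_key(1000037, 1000039)   # intermediate CA
alice_key = rsa_key(1000081, 1000099)   # end entity

def cert_hash(cert, n):
    # Hash all certificate data except the signature.
    blob = f'{cert["subject"]}|{cert["pub_n"]}|{cert["issuer"]}'.encode()
    return int.from_bytes(hashlib.sha256(blob).digest(), "big") % n

def issue(subject, subject_key, issuer, issuer_key):
    # The issuer signs the hash of the certificate data with its private key.
    cert = {"subject": subject, "pub_n": subject_key["n"], "issuer": issuer}
    cert["sig"] = pow(cert_hash(cert, issuer_key["n"]),
                      issuer_key["d"], issuer_key["n"])
    return cert

def check(cert, issuer_key):
    # Decrypt the signature with the issuer's public key; compare hashes.
    return pow(cert["sig"], issuer_key["e"], issuer_key["n"]) == \
           cert_hash(cert, issuer_key["n"])

root  = issue("Root CA", root_key, "Root CA", root_key)        # self-signed
inter = issue("Rutgers CS", inter_key, "Root CA", root_key)
alice = issue("Alice", alice_key, "Rutgers CS", inter_key)

# Walk the chain from Alice's certificate up to the trusted root.
assert check(alice, inter_key) and check(inter, root_key) and check(root, root_key)
```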

Every certificate has an expiration time (often a year or more in the future). This provides some assurance that even if there is a concerted attack to find a corresponding private key to the public key in the certificate, such a key will not be found until long after the certificate expires. It is also a commitment from the CA that they will be responsible for tracking the validity of the certificate for that interval of time.

A certificate may become invalid because the private key was leaked, the certificate owner may no longer be considered trustworthy (for example, an employee leaves a company), or the certificate was improperly issued.

In this case, a certificate can be revoked. Each CA publishes a certificate revocation list, or CRL, containing lists of certificates that they have previously issued that should no longer be considered valid. To prevent spoofing the CRL, the list is, of course, signed by the CA. Each certificate contains information on where to obtain revocation information.

For example, in July 2024, DigiCert, a well-known CA, announced that it would revoke over 83,000 certificates because of errors in how it validated the owner of the domain within the certificate.

The challenge with CRLs is timing: not everyone may check the certificate revocation list in a timely manner and some systems may accept a certificate not knowing that it was revoked. Some systems, particularly embedded systems, may not even be configured to handle CRLs.
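A minimal sketch of the CRL mechanism, again with toy RSA numbers far too small for real use: the CA signs its list of revoked serial numbers, and a verifier checks that signature before consulting the list.

```python
import hashlib

# Toy CA key pair -- tiny primes for illustration only.
p, q = 1000003, 1000033
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def crl_hash(serials):
    blob = ",".join(str(s) for s in sorted(serials)).encode()
    return int.from_bytes(hashlib.sha256(blob).digest(), "big") % n

def publish_crl(serials):
    # The CA signs the revocation list so it cannot be spoofed.
    return {"revoked": sorted(serials), "sig": pow(crl_hash(serials), d, n)}

def is_revoked(serial, crl):
    # Verify the CA's signature before trusting the list's contents.
    assert pow(crl["sig"], e, n) == crl_hash(crl["revoked"]), "bad CRL signature"
    return serial in crl["revoked"]

crl = publish_crl([1001, 2002, 3003])
assert is_revoked(2002, crl)
assert not is_revoked(4004, crl)
```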

Code Signing

Code signing is a security practice that applies digital signatures to software, enabling the verification of the software’s authenticity and integrity. It ensures that the software has not been altered or compromised since it was signed.

The primary goals of code signing are to:

  1. Ensure Integrity: Code signing confirms that the software has not been altered or corrupted since it was signed. This is crucial in protecting users from downloading malware-infected or unauthorized modifications of software.

  2. Verify Authenticity: It allows users and systems to verify the identity of the software publisher. This is important for building trust, especially in environments where software is downloaded from the internet.

  3. Non-Repudiation: By signing code, developers and organizations can be uniquely linked to their software, making it difficult for them to deny publishing it. This accountability is vital for both security and legal reasons.

Mechanisms of Code Signing

Code signing uses the following mechanisms:

  1. Key Pair Generation: The software publisher generates a public/private key pair. The private key is kept secure and used for signing the software, while the public key is distributed in the form of a digital certificate for verification.

  2. Hashing: The software to be signed is first hashed to create a small, fixed-length message that is unique to the software.

  3. Signing: The hash is then encrypted with the publisher’s private key, creating the digital signature. This process also often involves the addition of a timestamp to verify when the software was signed.

  4. Verification: When a user downloads the software, their system uses the publisher’s public key to decrypt the signature, re-computes the software’s hash, and compares it to the decrypted hash. If they match, it verifies that the software hasn’t changed since it was signed and confirms the publisher’s identity.

  5. Certificate Authorities (CAs): To establish trust, code signing uses public keys packaged in certificates that are signed by approved CAs. These trusted entities issue digital certificates that validate the publisher’s identity and link them to their public key. When verifying signatures, the presence of a certificate from a recognized CA enables the recipient to validate the origin and authenticity of the public key before using the key to validate the signature on the software.
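The sign/verify cycle above can be sketched end to end. This toy version uses tiny RSA numbers and simply hashes the binary together with a timestamp; real tools such as SignTool or codesign use standardized signature and timestamping formats.

```python
import hashlib, time

# Toy publisher key pair -- tiny primes for illustration only.
p, q = 1000003, 1000033
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def digest(software: bytes, timestamp: int) -> int:
    # Hash the binary together with the signing time.
    blob = software + timestamp.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(blob).digest(), "big") % n

def sign_code(software: bytes) -> dict:
    # Encrypt the hash with the publisher's private key; record when.
    ts = int(time.time())
    return {"timestamp": ts, "sig": pow(digest(software, ts), d, n)}

def verify_code(software: bytes, signature: dict) -> bool:
    # Recompute the hash and compare it with the decrypted signature.
    return pow(signature["sig"], e, n) == digest(software,
                                                 signature["timestamp"])

binary = b"\x7fELF...pretend this is a program"
s = sign_code(binary)
assert verify_code(binary, s)             # unmodified binary verifies
assert not verify_code(binary + b"!", s)  # any tampering is detected
```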

Implementation Examples

Platforms like Microsoft Windows, Apple macOS, Android, and iOS all implement code signing. This is a very high-level overview (you don’t need to know the distinctions since the general mechanisms for signing and verifying are essentially the same).

Microsoft Windows

Microsoft Windows uses a framework named Authenticode to sign and verify operating system drivers and applications.

  • Certificate acquisition: Obtain a code signing certificate from a trusted Certificate Authority (CA) or use a self-signed certificate. Trusted certificates must be used for operating system drivers.
  • Signing software: Use tools like Microsoft SignTool to sign software binaries (.exe, .dll, etc.) with your private key.
  • Verification: Windows checks the digital signature to ensure software integrity during installation.

Apple macOS

Apple code signing requires developers to sign apps with certificates issued through the Apple Developer Program for their operating system platforms, with an additional notarization process for macOS apps distributed outside the Mac App Store.

  • Certificate acquisition: Obtain a Developer ID certificate by enrolling in the Apple Developer Program. Apple validates the developer’s real-world identity before issuing a digital certificate. The certificate allows developers to sign apps and submit them to the App Store for distribution.
  • Signing software: Developers sign applications using Xcode or codesign to attach a digital signature.
  • Verification: Apple reviews and verifies submitted apps prior to distribution in their App Stores. macOS’s Gatekeeper framework verifies the signature to ensure the app comes from an identified developer and has not been altered.

Android

  • Certificates: Android apps do not require a code signing certificate from a public CA. In fact, Android requires a signing certificate with a validity period of at least 25 years, which is something that public certification authorities do not offer. Instead, apps use a self-signed certificate with at least a 25-year validity period. Developers generate a unique key using keytool and sign the Android application package (APK) with it.
  • Signing software: Sign the APK using jarsigner or Android Studio. The developer can opt into Play App Signing. In this case, developers upload their app signing private key to Google and Google performs the app signing before making the app available.
  • Verification: Android verifies the app’s signature for integrity during installation.

Linux, of course, also supports code signing. Because of the variety of distributions, different mechanisms exist. Debian distributions, including Ubuntu, use GPG to sign software packages. The APT package manager automatically verifies these signatures when installing packages. Red Hat-based distributions use the RPM package manager, which also supports package signing. Similar to APT, YUM and DNF automatically verify the signatures of RPM packages before installation. Tools for signing include gpg, signify, minisign, and openssl.

Runtime Integrity checking

In addition to validating the signature of an entire application, some operating systems support the addition of hashes for individual memory pages of the code. This allows performing runtime integrity checks when a page of code is loaded into memory via the operating system’s demand paging mechanism. As far as I can tell, simple hashes instead of signatures are used for this to ensure high performance. Both Windows and macOS support per-page hashes.
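Per-page integrity checking can be sketched as follows: hashes are computed over fixed-size pages when the binary is prepared, and each page is checked as it is paged in. The 4 KiB page size and the stand-in “binary” below are illustrative.

```python
import hashlib

PAGE_SIZE = 4096  # a typical page size

def page_hashes(code: bytes):
    # Compute a hash for each fixed-size page at signing time.
    return [hashlib.sha256(code[i:i + PAGE_SIZE]).digest()
            for i in range(0, len(code), PAGE_SIZE)]

def check_page(code: bytes, page_no: int, expected) -> bool:
    # At page-in time, verify only the page being loaded.
    page = code[page_no * PAGE_SIZE:(page_no + 1) * PAGE_SIZE]
    return hashlib.sha256(page).digest() == expected[page_no]

binary = bytes(range(256)) * 64       # 16 KiB stand-in "binary" (4 pages)
hashes = page_hashes(binary)
assert check_page(binary, 2, hashes)  # a clean page loads

patched = bytearray(binary)
patched[2 * PAGE_SIZE] ^= 1           # flip one bit in page 2
assert not check_page(bytes(patched), 2, hashes)  # tampered page is rejected
```

Because only the page being loaded is hashed, the check adds little overhead to the demand-paging path, which is why simple hashes are preferred over per-page signatures here.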

This approach is critical for environments requiring high security, ensuring that even minor alterations are detected and prevented from execution.

Last modified July 31, 2024.