Distributed Systems Security
Authentication
Paul Krzyzanowski
April 20, 2021
Goal: Create protocols for authenticating users, establishing secure communication sessions, authorizing services, and passing identities.
Authentication exists to establish and verify the identity of a user (or a service, process, or server). Once authentication is complete, a process can decide whether to allow access to the service or its resources and what type of access is permitted. This is called authorization.
The three factors of authentication are: something you have (such as a key or a card), something you know (such as a password or PIN), and something you are (biometrics).
Each of these factors has pitfalls and can be stolen. Someone can take your access card or see your password. Biometrics are more insidious: they are not precise and if someone can recreate your stolen biometric then you can never use it again.
Combining these factors into a multi-factor authentication scheme can increase security against the chance that any one of the factors is compromised. Multi-factor authentication must use two or more of these factors. Using two passwords, for example, is not sufficient.
Password Authentication Protocol (PAP)
The best-known authentication method is the use of reusable passwords. This is known as the password authentication protocol, or PAP. The system asks you to identify yourself (your login name) and then enter a password. If the password matches that which is associated with the login name on the system then you’re authenticated.
One problem with the protocol is that if someone gets hold of the password file on the system, then they have all the passwords. The common way to thwart this is to store hashes of passwords instead of the passwords themselves. This takes advantage of the one-way property of the hash. To authenticate a user, check if hash(password) = stored_hashed_password. If an intruder gets hold of the password file, they’re still stuck since they won’t be able to reconstruct the original password from the hash. They’ll have to resort to an exhaustive search or a dictionary attack to search for a password that hashes to the value in the file. An exhaustive search may take a prohibitively long time.
A dictionary attack is an optimization of the search that tests common passwords, including dictionary words and common letter-number substitutions. An intruder does not need to perform a search for each password to find a matching hash. Instead, the results of an exhaustive or dictionary search can be stored and searched quickly to find a corresponding hash in a password file. These are called precomputed hashes. To guard against this, a password is concatenated with a string of extra random characters, called salt, before it is hashed. These characters make the password substantially longer and a table of precomputed hashes so large that it is not practical to use. The salt is not a secret: it is stored in plaintext in the password file in order to validate a user’s password. Its only function is to make using precomputed hashes impractical and ensure that even identical passwords do not generate the same hashed results.
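The salted check described above can be sketched as follows. This is a minimal illustration, not a production design: the function names, the 16-byte salt length, and the choice of SHA-256 are assumptions made for the example, and deployed systems generally use a deliberately slow password-hashing function rather than a single fast hash.

```python
import hashlib
import hmac
import secrets

def make_password_record(password: str) -> tuple[bytes, bytes]:
    """Return the (salt, hash) pair that would be stored in the password file."""
    salt = secrets.token_bytes(16)  # random and not secret; stored in plaintext
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def check_password(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Recompute hash(salt + password) and compare it with the stored hash."""
    candidate = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return hmac.compare_digest(candidate, stored_hash)

# Two users who pick the same password still get different stored hashes
# because each record has its own salt.
salt_a, hash_a = make_password_record("correct horse")
salt_b, hash_b = make_password_record("correct horse")
print(hash_a != hash_b)
print(check_password("correct horse", salt_a, hash_a))
```

Because each record carries its own salt, a precomputed table would have to be rebuilt for every possible salt value, which is what makes it impractical.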
The other problem with reusable passwords is that if a network is insecure, an eavesdropper may sniff the password from the network. A potential intruder may also simply observe the user typing a password. To thwart this, we can turn to one-time passwords. If someone sees you type your credentials or read them from the network stream, it won’t matter because that information will be useless for future logins.
CHAP Authentication
The Challenge-Handshake Authentication Protocol (CHAP) is an authentication protocol that allows a server to authenticate a user without sending a password over the network.
Both the client and server share a secret (such as a password). A server creates a random bunch of bits (called a nonce) and sends it to the client (user) that wants to authenticate. This is the challenge.
The client identifies itself and sends a response that is the hash of the shared secret combined with the challenge. The server has the same data and can generate its own hash of the same challenge and secret. If the hash matches the one received from the client, the server is convinced that the client knows the shared secret and is therefore legitimate.
An intruder who sees this hash cannot extract the original data. An intruder who sees the challenge cannot create a suitable hashed response without knowing the secret. Note that this technique requires passwords to be accessible at the server and the security rests on the password file remaining secure.
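The challenge-response exchange can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names are hypothetical, SHA-256 is chosen for the example, and the exact way the real CHAP protocol combines its fields is not reproduced here.

```python
import hashlib
import hmac
import secrets

def make_challenge() -> bytes:
    """Server side: generate a fresh random nonce for each login attempt."""
    return secrets.token_bytes(16)

def chap_response(shared_secret: bytes, challenge: bytes) -> bytes:
    """Client side: hash the shared secret combined with the challenge."""
    return hashlib.sha256(shared_secret + challenge).digest()

def server_verify(shared_secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: compute the same hash and compare it with the response."""
    expected = chap_response(shared_secret, challenge)
    return hmac.compare_digest(expected, response)

secret = b"example shared secret"          # known to both client and server
challenge = make_challenge()               # sent server -> client
response = chap_response(secret, challenge)  # sent client -> server
print(server_verify(secret, challenge, response))
```

A response captured by an eavesdropper is of no use for a later login because the server issues a different challenge each time.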
Time-based: TOTP
With the Time-based One Time Password (TOTP) protocol, both sides share a secret key. To authenticate, a user runs the TOTP function to create a one-time password. The TOTP function generates the password as a hash of the shared secret key and the time of day. The service, which also knows the secret key and the time, can generate the same hash and thus validate the value presented by the user.
TOTP is often used as a second factor (proof that you have some device with the secret configured in it) in addition to a password. The protocol is widely supported by companies such as Amazon, Dropbox, Wordpress, Microsoft, and Google.
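A compact sketch of the computation is shown below. It follows the layout commonly used by authenticator apps (HMAC-SHA1 over a 30-second time-step counter, then truncation to a few decimal digits, as specified in RFC 6238). These parameters are not given in the text above and are assumptions of the example.

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, at_time=None, step: int = 30, digits: int = 6) -> str:
    """Return a one-time password derived from the shared secret and the time."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time) // step  # number of time steps since the Unix epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low 4 bits pick an offset
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)

def verify_totp(secret: bytes, candidate: str, at_time=None) -> bool:
    """Service side: recompute the value for the same time and compare."""
    return hmac.compare_digest(totp(secret, at_time), candidate)

print(totp(b"12345678901234567890"))
```

With the RFC 6238 test secret `b"12345678901234567890"`, `at_time=59`, and `digits=8`, the function returns `"94287082"`, which matches the published test vector. A real service usually also accepts the adjacent time step to tolerate clock drift.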
Public key authentication
Public key authentication relies on the use of nonces. A nonce is simply a randomly-generated bunch of bits and is sent to the other party as a challenge for them to prove that they are capable of encrypting something with a specific key that they possess. The use of a nonce is central to public key authentication.
If Alice wants to authenticate Bob, she needs to have Bob prove that he possesses his private key (private keys are never shared). To do this, Alice generates a nonce (a random bunch of bits) and sends it to Bob, asking him to encrypt it with his private key. If she can decrypt Bob’s response using Bob’s public key and sees the same nonce, she will be convinced that she is talking to Bob because nobody else will have Bob’s private key. Mutual authentication requires that each party authenticate itself to the other: Bob will also have to generate a nonce and ask Alice to encrypt it with her private key.
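The nonce exchange can be illustrated with textbook RSA arithmetic. This is a toy sketch only: the key is built from two publicly known Mersenne primes, there is no padding, and the names `bob_respond` and `alice_check` are hypothetical. It shows the shape of the protocol, not a usable implementation.

```python
import secrets

# Toy RSA key pair for Bob. The primes are well-known Mersenne primes,
# so this key offers no security; it only makes the arithmetic runnable.
P = 2**127 - 1
Q = 2**521 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, known only to Bob

def bob_respond(nonce: int) -> int:
    """Bob transforms the nonce with his private key."""
    return pow(nonce, D, N)

def alice_check(nonce: int, response: int) -> bool:
    """Alice applies Bob's public key and expects to recover her nonce."""
    return pow(response, E, N) == nonce

nonce = secrets.randbits(128)        # Alice's fresh challenge
response = bob_respond(nonce)        # sent Bob -> Alice
print(alice_check(nonce, response))
```

For mutual authentication, Bob would run the same check against Alice with a nonce of his own and Alice's public key.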
Identity binding: digital certificates
While public keys provide a mechanism for asserting integrity via digital signatures, they are themselves anonymous. We’ve discussed protocols where Alice uses Bob’s public key but never explained how she can be confident that the key really belongs to Bob and was not presented by an adversary. Some form of identity binding of the public key must be implemented for you to know that you really have my public key instead of someone else’s.
Digital certificates provide a way to do this. A certificate is a data structure that contains user information (called a distinguished name) and the user’s public key. To ensure that nobody changes any of this data, this data structure also contains a signature of the certification authority. The signature is created by taking a hash of the rest of the data in the structure and encrypting it with the private key of the certification authority. The certification authority (CA) is an organization that is responsible for setting policies of how they validate the identity of the person who presents the public key for encapsulation in a certificate.
To validate a certificate, you hash all the certificate data except for the signature. Then you would decrypt the signature using the public key of the issuer. If the two values match, then you know that the certificate data has not been modified since it has been signed. The challenge now is how to get the public key of the issuer. Public keys are stored in certificates, so the issuer would also have a certificate containing its public key. The certificates for many of the CAs are preloaded into operating systems or, in some cases, browsers.
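The sign-then-validate steps can be sketched with the same kind of toy RSA arithmetic. Everything here is illustrative: the CA key is built from publicly known Mersenne primes and is not secure, the certificate body is a made-up byte string rather than a real X.509 structure, and the function names are hypothetical.

```python
import hashlib

# Toy CA key pair (insecure: the primes are publicly known Mersenne primes).
CA_P, CA_Q = 2**127 - 1, 2**521 - 1
CA_N = CA_P * CA_Q                                   # CA public modulus
CA_E = 65537                                         # CA public exponent
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))        # CA private exponent

def digest_as_int(data: bytes) -> int:
    """Hash the certificate data and interpret the digest as an integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def ca_sign(cert_data: bytes) -> int:
    """CA side: encrypt the hash of the data with the CA's private key."""
    return pow(digest_as_int(cert_data), CA_D, CA_N)

def validate(cert_data: bytes, signature: int) -> bool:
    """Verifier side: decrypt the signature with the CA's public key and
    compare the result with a freshly computed hash of the data."""
    return pow(signature, CA_E, CA_N) == digest_as_int(cert_data)

cert_data = b"distinguished name: Bob; public key: (Bob's key bytes)"
signature = ca_sign(cert_data)
print(validate(cert_data, signature))
print(validate(cert_data.replace(b"Bob", b"Eve"), signature))
```

Any change to the name or the key changes the hash, so the comparison fails unless the CA signs the modified data again.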
Transport Layer Security (Secure Sockets Layer)
Transport Layer Security (TLS), an evolution of Secure Sockets Layer (SSL), is a protocol that provides authentication, integrity, and encrypted communication while preserving the abstraction of a sockets interface to applications. An application sets up a TLS session to a service. After that, it simply sends and receives data over a socket just like it would with the normal sockets-based API that operating systems provide. The programmer does not have to think about network security.
Any TCP-based application that may not have addressed network security can be security-enhanced by simply using TLS. For example, the standard email protocols, SMTP, POP, and IMAP, all have TLS-secured interfaces. Web browsers use HTTP, the Hypertext Transfer Protocol, and also support HTTPS, which is the exact same protocol but uses a TLS connection.
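As an illustration of how little the application code changes, here is a sketch that uses Python's standard `ssl` module to wrap an ordinary TCP socket. The helper names and the host passed in by the caller are assumptions of the example.

```python
import socket
import ssl

def make_client_context() -> ssl.SSLContext:
    """Default client settings: verify the server certificate and host name
    against the CA certificates trusted by the system."""
    return ssl.create_default_context()

def fetch_status_line(host: str, port: int = 443) -> str:
    """Send a minimal HTTP request over TLS and return the first reply line."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        # After wrapping, reads and writes look like normal socket calls.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            reply = tls_sock.recv(4096)
    return reply.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

Calling `fetch_status_line` with the name of an HTTPS server performs the handshake, certificate validation, and encryption inside `wrap_socket`. The request and reply are plain HTTP, which is the sense in which HTTPS is the same protocol carried over TLS.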
TLS provides these key components for secure communication:
- Data encryption
- Symmetric cryptography is used to encrypt data.
- Data integrity
- Ensure that we can detect if data in transit has been modified. TLS includes a MAC with transmitted data.
- Authentication
- TLS provides mechanisms to authenticate the endpoints prior to sending data. Authentication is optional and can be unidirectional (the client may just authenticate the server), bidirectional (each side authenticates the other), or none (in which case we just exchange keys but do not validate identities).
- Key exchange
- After authentication, TLS performs a key exchange so that both sides can obtain random shared session keys. TLS creates separate keys for each direction of communication (encryption keys for client-to-server and server-to-client data streams) and separate keys for data integrity (MAC keys for client-to-server and server-to-client streams).
- Interoperability & evolution
- TLS was designed to support many different key exchange, encryption, integrity, & authentication protocols. The start of each session enables the protocol to negotiate what protocols to use for the session.
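The idea of one key exchange producing separate keys for each direction and purpose can be sketched as follows. This is not the actual TLS key schedule: the toy Diffie-Hellman group (a small Mersenne prime with base 3), the label strings, and the HMAC-based derivation are all assumptions chosen to keep the example short and runnable.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1   # toy prime modulus; far too small for real use
G = 3            # assumed base for the example

def dh_keypair() -> tuple[int, int]:
    """Generate an ephemeral private value and the matching public value."""
    private = secrets.randbelow(P - 3) + 2   # in the range [2, P - 2]
    return private, pow(G, private, P)

def dh_shared(private: int, peer_public: int) -> bytes:
    """Combine our private value with the peer's public value."""
    return pow(peer_public, private, P).to_bytes(16, "big")

LABELS = ("client-to-server encryption", "server-to-client encryption",
          "client-to-server MAC", "server-to-client MAC")

def derive_session_keys(shared: bytes) -> dict[str, bytes]:
    """Derive one independent key per direction and purpose."""
    return {label: hmac.new(shared, label.encode(), hashlib.sha256).digest()
            for label in LABELS}

client_priv, client_pub = dh_keypair()
server_priv, server_pub = dh_keypair()
client_keys = derive_session_keys(dh_shared(client_priv, server_pub))
server_keys = derive_session_keys(dh_shared(server_priv, client_pub))
print(client_keys == server_keys)
```

Only the public values cross the network, yet both sides end up with the same four keys, and the private values can be discarded when the session ends.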
These features are implemented in two sub-protocols within TLS:
- (1) Authentication and key exchange
- Authentication uses public key cryptography with X.509 certificates to authenticate a system. Both the client and server can present their X.509 digital certificates. TLS validates the signature of the certificate. The party presenting a certificate proves that it holds the corresponding private key by signing a hash of a set of messages with that key. With web sites, most commonly only the server presents a certificate, so the client authenticates the server but the server does not authenticate the client. After a secure session is set up, the service will often use some other protocol to authenticate a user, such as the password authentication protocol.
- Key exchange supports several options. Ephemeral Diffie-Hellman key exchange is the most common since it supports the efficient generation of shared keys and there is no long-term key storage. TLS can accommodate other key exchange techniques as well, including public key-based key exchange and pre-shared static keys.
- (2) Communication
- Data encryption uses symmetric cryptography and supports a variety of algorithms, including AES, ARIA, and ChaCha20. AES is the Advanced Encryption Standard. ARIA is a South Korean standard encryption algorithm that is similar to AES. ChaCha20 is an encryption algorithm that is generally more efficient than AES on low-end processors.
- Data integrity is provided by a message authentication code (MAC) that is attached to each block of data. TLS allows the choice of several, including HMAC-MD5, HMAC-SHA1, HMAC-SHA256/384, and Poly1305.
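The integrity check on each block of data can be sketched with HMAC-SHA256, one of the options listed above. The record layout (data followed by the tag) and the function names are simplifications made for this example and do not reproduce the TLS record format.

```python
import hashlib
import hmac

MAC_LEN = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256

def protect(mac_key: bytes, data: bytes) -> bytes:
    """Sender: append a MAC computed over the data with the MAC key."""
    tag = hmac.new(mac_key, data, hashlib.sha256).digest()
    return data + tag

def verify(mac_key: bytes, record: bytes) -> bytes:
    """Receiver: recompute the MAC and reject the record on a mismatch."""
    data, tag = record[:-MAC_LEN], record[-MAC_LEN:]
    expected = hmac.new(mac_key, data, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("record failed integrity check")
    return data

mac_key = b"\x01" * 32                       # stands in for a session MAC key
record = protect(mac_key, b"GET / HTTP/1.1")
print(verify(mac_key, record))
```

An attacker who alters the data in transit cannot produce a matching tag without the MAC key, so the receiver detects the modification.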