Disclaimer: This study guide attempts to touch upon the most important topics that may be covered on the exam but does not claim to cover everything one needs to know for the exam. Also, don't take the one-hour time window in the title literally.
Last update: Tue Sep 30 22:42:42 2025
Foundations of Computer Security
Key terms
Computer security protects systems and data from unauthorized access, alteration, or destruction. The CIA Triad summarizes its core goals: confidentiality, integrity, and availability.
Confidentiality: Only authorized users can access information. Related terms:
- Privacy: control over personal information.
- Anonymity: hiding identity.
- Secrecy: concealing the existence of information.
- Exfiltration: unauthorized transfer of data out of a system.
Integrity: Ensures accuracy and trustworthiness of data and systems. Integrity includes data integrity (no unauthorized changes), origin integrity (verifying source), recipient integrity (ensuring correct destination), and system integrity (software/hardware functioning as intended). Authenticity is tied to integrity: verifying origin along with correctness.
Availability: Ensures systems and data are usable when needed.
Threats include DoS (Denial of Service) and DDoS (Distributed Denial of Service) attacks, hardware failures, and failed backups.
System Goals
Security systems aim at prevention, detection, and recovery. Prevention stops attacks, detection identifies them, and recovery restores systems after incidents. Forensics investigates what happened. Defense in Depth is a layered strategy that applies multiple overlapping defenses — technical, procedural, and sometimes physical. If one control fails, others still provide protection.
Policies, Mechanisms, and Assurance
- Policies: Define what is allowed.
- Mechanisms: Enforce policies.
  - Technical mechanisms: include operating system controls and cryptography.
  - Procedural mechanisms: include audits, ID checks, and separation of duties.
- Assurance: Confidence that policies and mechanisms are correctly implemented.
- A principal is an entity that can be uniquely identified and authenticated.
- A subject is the active process acting on behalf of a principal.
- All security depends on assumptions about users, environments, and trusted components.
Security Engineering and Risk Analysis
Security engineering balances cost, usability, and protection.
Risk analysis evaluates asset value, likelihood of attack, and potential costs. Tradeoffs matter: too much protection may reduce usability or become too costly.
Trusted Components and Boundaries
- The Trusted Computing Base (TCB) is all the hardware, firmware, and software essential to enforcing security.
- A trust boundary is where data passes between trusted and untrusted entities.
- Supply chain security is critical: a trusted vendor can become an attack vector.
Human Factors
People remain the weakest link. Weak passwords, poor training, and misaligned incentives undermine protection.
- Security theater: Measures that look protective but add little real security.
- Weakest link: Security is only as strong as the most vulnerable component.
Threats, Vulnerabilities, and Attacks
A vulnerability is a weakness in software, hardware, or configuration. Examples include buffer overflows, default passwords, and weak encryption. Hardware-level flaws include Spectre, Meltdown, and Rowhammer.
An exploit is a tool or technique that leverages a vulnerability. An attack is the execution of an exploit with malicious intent.
An attack vector is the pathway used to deliver an exploit, such as email, websites, USB drives, or open network ports. The attack surface is the total set of possible entry points.
Not all vulnerabilities are technical. Social engineering manipulates people into granting access or revealing information. Phishing, spear phishing, pretexting, and baiting are common techniques. (This will be explored in more detail later in the course.)
A threat is the possibility of an attack, and a threat actor is the adversary who may carry it out. One useful classification, described by Ross Anderson, distinguishes threats as disclosure (unauthorized access), deception (false data), disruption (interruptions), and usurpation (unauthorized control). Related concepts include snooping, modification, masquerading, repudiation, denial of receipt, and delay.
The threat matrix distinguishes between opportunistic vs. targeted attacks and unskilled vs. skilled attackers (from script kiddies to advanced persistent threats).
The Internet amplifies risk by enabling action at a distance, anonymity, asymmetric force, automation at scale, global reach, and lack of distinction between malicious and normal traffic.
A botnet is a network of compromised machines controlled via a command and control server, used for spam, phishing, cryptocurrency mining, credential stuffing, or DDoS attacks.
Adversaries and Cyber Warfare
Behind every attack is an adversary. Adversaries differ in their goals, risk tolerance, resources, and expertise. Types include:
- Hackers:
  - White hats: defensive security researchers who find and report flaws.
  - Black hats: malicious attackers seeking profit or disruption.
  - Gray hats: operate in a legal or ethical gray zone, sometimes disclosing flaws responsibly and sometimes not.
- Criminal groups: organized gangs running fraud, ransomware, or “malware-as-a-service.”
- Malicious insiders: employees or contractors who abuse legitimate access.
- Hacktivists: politically or socially motivated attackers.
- Spies: industrial or state-backed espionage actors.
- Press or politicians: may try to obtain sensitive information to gain an advantage, but are usually risk-averse due to reputational or legal consequences.
- Police and law enforcement: have legal authority (e.g., warrants, evidence seizure), but can also overstep through illegal searches or surveillance.
- Nation-states: governments capable of long-term, well-funded operations.
Economic incentives sustain underground markets where exploits, botnets, and stolen data are sold. Zero-day vulnerabilities can fetch high prices in closed broker markets. In contrast, bug bounty programs reward researchers for legal disclosure.
Advanced Persistent Threats (APTs) are advanced in their methods, persistent in maintaining access, and threatening in their ability to bypass defenses. They are typically state-backed and operate over months or years with stealth and patience.
Cyber warfare involves state-sponsored attacks on critical infrastructure and military systems.
Countermeasures include government and industry cooperation, international botnet takedowns, and intelligence sharing.
The implication is that cybersecurity affects all levels: national security, corporate security, and personal security. Cyber warfare blurs the line between peace and conflict. Attribution is difficult, and critical infrastructure, businesses, and individuals alike are potential targets.
Tracking Vulnerabilities and Risks
Why track vulnerabilities?
Early vulnerability reporting was inconsistent. The CVE system (1999) introduced standardized identifiers. CVSS added a way to score severity. Together, they form the backbone of how vulnerabilities are shared and prioritized.
- CVE (Common Vulnerabilities and Exposures): A unique identifier assigned to publicly disclosed vulnerabilities. Example: CVE-2021-44228 (Log4Shell).
- CVSS (Common Vulnerability Scoring System): A 0–10 scale for rating the severity of vulnerabilities, based on exploitability and impact. Scores are grouped into categories from Low to Critical.
- Attribution challenges: Attackers obscure their origins, reuse tools, share infrastructure, and sometimes plant false flags. This makes it difficult to know with certainty who is behind an attack.
- APT (Advanced Persistent Threat): Well-funded, skilled groups (often state-backed) that carry out prolonged, targeted campaigns. Advanced = may use custom malware, zero-days, or sophisticated tradecraft; Persistent = long-term stealthy presence; Threat = ability to bypass defenses.
Symmetric Cryptography
Foundations of Symmetric Cryptography
Why cryptography? Protect against passive adversaries (eavesdropping) and active adversaries (injection, modification, replay).
Goals:
- Confidentiality: Ensure information is visible only to authorized parties. Symmetric encryption provides confidentiality, while integrity and authenticity require additional mechanisms such as MACs or AEAD.
- Integrity: Guarantee information is accurate and unmodified.
- Authenticity: Assurance that data or a message comes from the claimed source.
- Non-repudiation: Preventing a sender from denying authorship of a message later (a property of public-key systems).
Core terms. Plaintext (original readable data), ciphertext (the scrambled, unreadable output), cipher (the algorithm), key (the secret parameter that selects one transformation), encryption (converts plaintext into ciphertext), decryption (converts ciphertext back into plaintext), symmetric encryption (same secret key shared by sender and receiver). A cryptosystem is the complete framework that defines how encryption and decryption are carried out.
Kerckhoffs's Principle. A system must remain secure even if everything about it is public except the key. Prefer standardized, openly analyzed algorithms; avoid secrecy of design.
Schneier's Law. Anyone can design a cipher they cannot break; confidence comes only after broad, sustained public scrutiny.
Classical Ciphers (Substitution and Transposition)
Classical ciphers use hand (pencil-and-paper) methods that illustrate substitution (change symbols) and transposition (reorder symbols).
Caesar cipher. Shift letters by a fixed amount; trivial to brute force and defeat via frequency counts.
Monoalphabetic substitution. Any fixed mapping between characters; the keyspace is large (\(26!\)) but still breakable because the statistical structure of a language shows up in ciphertext. In English, E, T, A, O, I, N are common; Z, Q, X, J are rare; frequent digraphs TH, HE, IN and trigrams THE, AND stand out.
Frequency analysis. A cryptanalytic technique: compare the frequencies of single letters, digraphs, and trigraphs in ciphertext with the known statistics of a language to recover the substitution.
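To make the mechanics concrete, here is a minimal Python sketch (an illustration added to these notes, not part of the original course material): Caesar encryption plus a crude frequency-based break. The `ENGLISH_FREQ` ordering is an assumption about English letter statistics.

```python
# A minimal sketch: Caesar encryption and a brute-force break that scores
# each of the 26 candidate shifts by how English-like its letter counts are.
from collections import Counter

ENGLISH_FREQ = "ETAOINSHRDLCUMWFGYPBVKJXQZ"  # most to least common (assumed)

def caesar(text: str, shift: int) -> str:
    """Shift each letter by `shift` positions (letters only; others pass through)."""
    return "".join(
        chr((ord(c) - ord('A') + shift) % 26 + ord('A')) if c.isalpha() else c
        for c in text.upper()
    )

def break_caesar(ciphertext: str) -> int:
    """Try all 26 shifts; pick the one whose commonest letters look most English."""
    def score(pt: str) -> int:
        counts = Counter(c for c in pt if c.isalpha())
        top = [letter for letter, _ in counts.most_common(6)]
        return sum(1 for letter in top if letter in ENGLISH_FREQ[:6])
    return max(range(26), key=lambda s: score(caesar(ciphertext, -s)))

ct = caesar("ATTACK AT DAWN THE ENEMY IS NEAR", 3)
print(ct)                # DWWDFN DW GDZQ ...
print(break_caesar(ct))  # likely recovers 3 (short texts can fool the scorer)
```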
Polyalphabetic substitution cipher.
Uses multiple substitution alphabets instead of just one. The cipher switches between different mappings during encryption so that the same plaintext letter can encrypt to different ciphertext letters, breaking simple frequency analysis.
- Alberti's cipher disk (1460s). Two concentric alphabets; rotate the inner disk to switch substitution alphabets periodically mid-message. This is polyalphabetic encryption that blurs single-letter statistics, making frequency analysis more challenging.
- Vigenère cipher (16th c.). Repeat a keyword over the plaintext; each key letter selects a specific Caesar shift alphabet (via a table lookup of plaintext row vs. key column). The same plaintext letter can encrypt differently depending on its alignment with the key.
- Breaking Vigenère. Even though Vigenère is a polyalphabetic cipher, the substitution alphabet repeats as a function of the key length: find repeated ciphertext fragments; measure gaps; factor to guess key length; split text into that many streams and solve each stream as a Caesar cipher by frequency. (A short sketch of the encryption follows this list.)
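A minimal sketch of Vigenère encryption (illustration only; it assumes the text contains letters only, since the key index advances with every character):

```python
# A minimal sketch: Vigenère encryption. Each key letter selects a Caesar
# shift; the key repeats over the plaintext. Decryption negates the shift.
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    out = []
    for i, c in enumerate(text.upper()):
        k = ord(key[i % len(key)].upper()) - ord('A')
        if decrypt:
            k = -k
        out.append(chr((ord(c) - ord('A') + k) % 26 + ord('A')))
    return "".join(out)

ct = vigenere("ATTACKATDAWN", "LEMON")
print(ct)                                   # LXFOPVEFRNHR (textbook pair)
print(vigenere(ct, "LEMON", decrypt=True))  # ATTACKATDAWN
```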
Transposition ciphers. Preserve letters, change order.
- Scytale: wrap a strip around a rod; write along the rod; read off unwrapped.
- Columnar transposition: write plaintext in rows under a keyword; read columns in keyword order. Padding (often X) fills incomplete rows and can leak hints.
Some later ciphers, such as Playfair and ADFGVX, combine substitution and transposition.
Lessons from classical cryptosystems: Combining substitution and transposition helps, but hand systems leave structure to exploit. These methods motivated mechanized cryptography (rotor machines) and later, theory-driven designs that deliver confusion and diffusion deliberately.
Mechanized Cryptography (Rotor Machines)
Machines automated polyalphabetic substitution and enhanced its security, but the complexity alone did not guarantee security.
Rotor machine. A stack of wired wheels applies a changing substitution on each keystroke; the rightmost rotor steps like an odometer, so the mapping evolves continuously. The Enigma, used by the Germans during World War II, is the most famous of the rotor machines.
Enigma workflow. Type a letter, current rotor positions map it through three rotors, a reflector sends it back through the rotors on a different path, and a lamp shows the ciphertext. The same setup decrypts. The use of a reflector implies that no letter ever encrypts to itself, which weakens the security of the system (i.e., if you see an 'A' in the ciphertext, you know the plaintext at that position can't be 'A').
Keyspace (why it's huge). Choose and order 3 rotors from 5 (60 ways), choose 26 starting positions for each rotor (\(26^3=17{,}576\)), and set a plugboard that swaps letter pairs (about \(10^{14}\) possibilities). The combined space exceeds \(10^{23}\).
Strength vs. practice. The rotors and a repetition period of 17,576 characters suppressed simple frequency patterns, but design quirks and operating procedures leaked structure. Analysts used cribs (predictable text like headers), the "no self-encryption" property, operator mistakes (repeating message keys, key reuse), traffic habits, captured codebooks, and electromechanical search (predecessors to computers: Polish bombas, British bombes designed by Turing and Welchman).
Mechanization increased complexity and throughput but not the proof of security. Operational discipline and small structural properties mattered.
Shannon, Perfect Secrecy, and Randomness
One-time pad (OTP). Encrypt by XORing plaintext with a truly random, one-time key of equal length: \(C = P \oplus K \quad \text{and} \quad P = C \oplus K\). This gives perfect secrecy if the key is random, at least as long as the message, independent of the plaintext, and never reused.
Why OTPs are rarely used. Generating, distributing, and storing keys as long as the message is impractical.
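A minimal sketch of the OTP equations in code (illustration only; it assumes the key really is random, message-length, and never reused):

```python
# One-time pad via XOR: C = P xor K and P = C xor K.
import os

def otp_xor(data: bytes, key: bytes) -> bytes:
    assert len(key) >= len(data)          # key must cover the whole message
    return bytes(d ^ k for d, k in zip(data, key))

msg = b"MEET AT NOON"
key = os.urandom(len(msg))                # one-time, message-length key
ct  = otp_xor(msg, key)
print(otp_xor(ct, key))                   # b'MEET AT NOON' -- XOR is its own inverse
```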
Shannon’s contribution. Two design properties for strong ciphers:
- Confusion: Nonlinear substitution hides the relationship between key and ciphertext.
- Diffusion: Structured mixing spreads each input bit across many output bits over multiple rounds.
Perfect vs. computational security. Perfect secrecy means the ciphertext reveals nothing about plaintext (like the OTP). In practice, we aim for computational security: no feasible attack with realistic resources.
Random vs. pseudorandom.
-
Random values come from physical processes (e.g., thermal noise, hardware randomness). They are unpredictable but can be hard to generate in large amounts.
-
Pseudorandom values are generated by an algorithm from a small random seed. They only appear random, but if the generator is weak, attackers may predict outputs. For cryptogrfaphy, we use a CSPRNG (cryptographically secure pseudorandom number generator). This is a pseudorandom generator designed so its outputs are computationally indistinguishable from true random and unpredictable without knowing the seed. Operating systems provide CSPRNGs and cryptographic libraries use them for keystreams and initialization vectors.
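A minimal sketch of drawing key material from the OS CSPRNG in Python (the specific sizes are illustrative choices):

```python
# Use the OS CSPRNG for anything secret. random.random() is a statistical
# PRNG with a predictable state and must never be used for keys.
import secrets, os

key_128 = secrets.token_bytes(16)    # 128-bit symmetric key
iv      = os.urandom(16)             # fresh per-message IV
token   = secrets.token_urlsafe(32)  # URL-safe random credential
print(key_128.hex(), iv.hex(), token)
```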
Shannon entropy (concept). Entropy measures unpredictability. High entropy keys resist guessing; good encryption removes observable structure so ciphertext looks random and does not compress well.
Modern Symmetric Cryptography
There are two families of ciphers in modern symmetric cryptography: block ciphers (encrypt fixed-size blocks) and stream ciphers (keystream ⊕ plaintext). Within block ciphers, there are two main designs: Feistel networks and substitution-permutation networks (SPNs).
Block ciphers
A block cipher is a keyed permutation on fixed-size blocks of data. Different keys give different permutations. Block ciphers are the workhorses of symmetric cryptography.
Most block ciphers are iterative: they apply the same round transformation repeatedly. Each round uses a round key (subkey) derived from the master key.
Rounds typically combine nonlinear substitution (S-boxes) with permutation or mixing steps (P-boxes or matrix operations).
Design goals:
- Confusion: Obscure the relationship between the key and ciphertext.
- Diffusion: Spread the influence of each plaintext bit across many ciphertext bits.
- Avalanche effect: A desired consequence of confusion and diffusion: flipping a single input bit should change about half the output bits, destroying patterns quickly.
Substitution-permutation networks (SPNs)
- Pattern: Substitute bytes via small lookup tables (S-boxes), mix/permute to spread changes, then XOR a round key: \(\text{state}\leftarrow \text{state}\oplus K_i\).
- Decryption: Apply inverse substitution and inverse mixing with round keys in reverse order.
- Example: AES.
Feistel networks
- Pattern: Split block \((L_i,R_i)\). One round: \[L_{i+1}=R_i,\qquad R_{i+1}=L_i \oplus F(R_i,K_i).\]
- Property: Always invertible even if \(F\) is not; decryption reuses \(F\) with round keys reversed.
- Example: DES. (A toy sketch of the round structure follows this list.)
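A toy sketch of a Feistel network on 32-bit halves (illustration only; the round function F here is an arbitrary mixer with no security value, chosen just to show that F need not be invertible):

```python
# Toy Feistel network: same code encrypts and decrypts; only the round-key
# order changes. The final swap makes the construction symmetric.
def F(r: int, k: int) -> int:
    return ((r * 0x9E3779B1) ^ k) & 0xFFFFFFFF   # arbitrary, non-invertible mixing

def feistel(left, right, keys, decrypt=False):
    if decrypt:
        keys = keys[::-1]                        # reuse F, reverse round keys
    for k in keys:
        left, right = right, left ^ F(right, k)  # L' = R, R' = L xor F(R, K)
    return right, left                           # final half-swap

L, R = feistel(0x01234567, 0x89ABCDEF, [1, 2, 3, 4])
print(feistel(L, R, [1, 2, 3, 4], decrypt=True))  # (0x01234567, 0x89ABCDEF)
```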
DES and 3DES (historic examples)
DES (1977): The first U.S. encryption standard. A 16-round Feistel cipher with a 64-bit block and a 56-bit key. Hardware-friendly and influential, but the relatively short key length eventually made it vulnerable to brute force attacks.
- Main issue: The 56-bit key length is brute-forceable.
- Triple DES (3DES) (EDE: encrypt-decrypt-encrypt):
  - EDE1 (one 56-bit key): \(E_K(D_K(E_K(\cdot)))\): backward compatibility; no strength gain.
  - EDE2 (two 56-bit keys, 112 bits total): \(E_{K_1}(D_{K_2}(E_{K_1}(\cdot)))\).
  - EDE3 (three 56-bit keys, 168 bits total): \(E_{K_1}(D_{K_2}(E_{K_3}(\cdot)))\).
- Status: Legacy algorithm; small block size, slow in software.
AES (modern standard)
- Block/key sizes: AES is the current global standard. It uses a 128-bit block and keys of 128, 192, or 256 bits, with 10, 12, or 14 rounds accordingly.
- Round steps: SubBytes (byte lookup table), ShiftRows (rearrange), MixColumns (mix bytes in a column), AddRoundKey (XOR).
- Why it's a good default: Strong safety margin; fast with crypto hardware extensions on many processors; good constant-time software exists.
Modes of operation
Block cipher modes define how to use a block cipher on long messages. They specify how blocks are linked and how per-message values (IVs or nonces) are used to keep encryptions distinct.
- ECB (Electronic Codebook): Each block encrypted independently; identical plaintext blocks ⇒ identical ciphertext blocks; leaks structure and allows block cut-and-paste. Do not use for data.
- CBC (Cipher Block Chaining): XOR plaintext with the previous ciphertext block, then encrypt. The first block uses an IV (initialization vector), which must be fresh and unpredictable. Needs padding. Confidentiality only; beware implementation bugs that leak information through different error responses.
- CTR (Counter): Encrypt a per-message counter sequence to produce a keystream; XOR with plaintext. A unique per-message value is required for each encryption that uses the same key (often called a nonce or initialization vector). No padding; parallelizable; random access friendly.
- GCM (Galois/Counter Mode): CTR for encryption plus a tag to detect tampering (AEAD). Requires a unique per-message value (typically 96 bits).
AEAD (Authenticated Encryption with Associated Data) adds integrity checks to the encryption process. AEAD systems encrypt and return a tag so the receiver can detect and reject modified ciphertexts. We will cover integrity later; read AEAD as "encryption with a built-in tamper check." Think of it as generating ciphertext and a message authentication code (MAC) in one pass over the data.
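A minimal sketch using the pyca/cryptography library (an assumed third-party dependency, not something these notes prescribe) showing AES-GCM as an AEAD: one call encrypts and tags, and decryption fails loudly on any modification.

```python
# AES-GCM: encrypt-and-tag in one pass; decryption verifies the tag first.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.exceptions import InvalidTag

key   = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)                     # 96-bit value, unique per message
aead  = AESGCM(key)

ct = aead.encrypt(nonce, b"wire $100 to bob", b"header")   # ciphertext + tag
print(aead.decrypt(nonce, ct, b"header"))                  # original plaintext

tampered = ct[:-1] + bytes([ct[-1] ^ 1])                   # flip one bit
try:
    aead.decrypt(nonce, tampered, b"header")
except InvalidTag:
    print("tampering detected")
```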
Common pitfalls with cipher modes:
- ECB on data leaks patterns. Avoid using it.
- Per-message input errors:
  - CBC: the initialization vector (IV) must be fresh and unpredictable (random).
  - CTR/GCM/ChaCha20: the per-message value (nonce or initialization vector) must be unique for each message encrypted with the same key (never reused; uniqueness is the rule). The sketch below shows why reuse is fatal.
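A minimal sketch of the nonce-reuse failure for any XOR keystream mode (the fixed `keystream` bytes are a stand-in for real cipher output):

```python
# Two messages encrypted under the same key and nonce share a keystream, so
# XORing the ciphertexts cancels it and exposes the XOR of the plaintexts.
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

keystream = bytes(range(32))          # stand-in for E_k(nonce, counter) output
p1 = b"attack at dawn on monday"
p2 = b"retreat quietly at noon!"

c1, c2 = xor(p1, keystream), xor(p2, keystream)
print(xor(c1, c2) == xor(p1, p2))     # True: the keystream cancels entirely
```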
Stream ciphers
Stream ciphers model the idea of the one-time pad: generate a keystream and XOR it with plaintext. Instead of a pad as long as the message, they expand a short key and a nonce into a long pseudorandom sequence.
- Model: Generate a keystream, then XOR: \(C=P\oplus \text{keystream}\) and \(P=C\oplus \text{keystream}\).
- Rule: Under one key, never reuse the per-message input that selects the keystream.
ChaCha20 is the most popular stream cipher today. To avoid keystream reuse under the same key, ChaCha20's keystream generator takes both a secret key and a nonce (a fresh value used once per message).
- What it does: Runs 20 rounds of add/rotate/XOR on a 512-bit internal state keyed by a 256-bit key and a per-message nonce to output a pseudorandom keystream; then XOR the keystream with the plaintext.
- Why used: Fast on general CPUs and embedded systems, constant-time without tables, robust side-channel profile.
- AEAD form: ChaCha20-Poly1305 (encryption + 16-byte integrity tag).
Most breakage with modern ciphers is via misuse (wrong mode, reused per-message input), not broken ciphers.
Quick reference
Name | Type | Block/stream | Typical key sizes | Notes |
---|---|---|---|---|
AES-128/192/256 | SPN block cipher | 128-bit block | 128/192/256 | Default choice; wide hardware support |
ChaCha20-Poly1305 | Stream + tag (AEAD) | Stream | 256 (cipher) | Fast on CPUs without AES-NI; embedded-friendly |
DES | Feistel block | 64-bit block | 56 (effective) | Legacy; brute-forceable |
3DES (EDE2/EDE3) | Feistel block | 64-bit block | 112/168 (nominal) | Legacy |
Equations to recognize (not memorize)
- Feistel round: \(L_{i+1}=R_i,\; R_{i+1}=L_i \oplus F(R_i,K_i)\).
- Stream XOR: \(C=P\oplus \text{keystream}\) and \(P=C\oplus \text{keystream}\).
Principles of Good Cryptosystems
Foundations
- Kerckhoffs's principle: Assume the attacker knows everything except the key. Publish algorithms; keep keys secret.
- Shannon: Confusion (hide key influence) + diffusion (spread local changes). Achieved by simple rounds that substitute, mix, and add key material.
Security properties
- Random-looking ciphertext: No patterns, no bias, does not compress. If identical plaintext blocks create repeated ciphertext, the mode/usage is wrong.
- No shortcuts to decryption without the key: The best generic attack is brute force over keys, not algebra on structure.
- Adequate key space; no weak keys: Each extra bit doubles the search. \(2^{128}\) is beyond plausible brute force. Avoid key classes with special structure.
- Resistance across attack models: Aim for confidentiality under CPA (chosen plaintext attack); prefer AEAD so systems remain safe even in CCA (chosen ciphertext attack) settings, where attackers can submit modified ciphertexts.
- Side-channel robustness: Defend against timing/cache, power/electromagnetic emissions, and fault attacks with constant-time code, hardware instructions, masking/blinding, and no secret-indexed table lookups.
- Composability: Security should survive real use (chunking, retries, many senders). Use AEAD, keep per-message inputs correct, and separate keys by purpose.
Practical requirements
- Efficiency: Match the hardware. AES on architectures that accelerate it; ChaCha20 where AES support is weak. Efficient designs make constant-time code feasible.
- Minimal expansion: Small, predictable overhead (e.g., a per-message input + a tag in GCM, not overhead that is a multiple of the plaintext size). Avoid custom paddings and ad-hoc headers.
- Simplicity & universality: Few controls, clear defaults, mature libraries. Use high-level AEAD APIs; do not invent modes.
- Public algorithms, open analysis: Prefer widely vetted standards. "Proprietary crypto" is a red flag.
Keys (operational reminders)
- Generation: Use the operating system's random or cryptographically secure pseudorandom number generator (CSPRNG) seeded with a true random source. No ad-hoc randomness.
- Handling: Protect in storage and use; rotate before per-key data limits; destroy on retirement.
- Per-message input (random, non-secret parameter): Treat this as a first-class parameter. CBC needs a fresh, unpredictable IV. CTR/GCM/ChaCha20 need a value unique for a given key. This value is referred to as a nonce or initialization vector.
Quick rules
- Prefer AES-GCM with hardware support; otherwise ChaCha20-Poly1305.
- Never use ECB for data.
- Padding is needed for ECB/CBC when message length ≠ block size; it is not needed for CTR/GCM/ChaCha20.
- Write constant-time code (no secret-dependent branches/lookups).
- Keep algorithms public; keep keys secret.
Common pitfalls
- Reusing the per-message input with the same key (CTR/GCM/ChaCha20).
- Predictable IVs in CBC.
- Short keys, weak randomness, keys visible in logs/dumps.
- Secret-indexed tables causing cache leaks.
Introduction to Cryptanalysis
What is cryptanalysis? The art and science of breaking cryptographic systems. Used by both attackers (malicious goals) and security researchers (testing and strengthening algorithms). Goal: recover plaintext from ciphertext without authorized access to the key.
Core assumptions: Analysts typically know the encryption algorithm (following Kerckhoffs's principle) and have some idea of content type (German text, machine code, file headers, etc.).
Attack models (what the analyst can access)
Brute force attack
Exhaustive key search: trying every possible key until meaningful plaintext emerges. Sets the security baseline: \(n\)-bit key requires at most \(2^n\) attempts (average \(2^{n-1}\)). Becomes infeasible with sufficient key length (128+ bits).
Ciphertext-only attack (COA)
The analyst has only encrypted messages with no corresponding plaintext. Extremely difficult against modern ciphers. Historically used frequency analysis against simple substitution. Modern secure ciphers resist COA by producing statistically random-looking ciphertext.
Known plaintext attack (KPA)
The analyst has matching plaintext-ciphertext pairs but cannot choose the plaintext. Can study correlations between inputs and outputs to find patterns. Example: Breaking Enigma using stolen plaintext-ciphertext pairs and predictable message formats.
Chosen plaintext attack (CPA)
The analyst can select specific plaintexts and observe the resulting ciphertext. Reflects scenarios where attackers influence message content. Enables systematic testing of cipher behavior and probing for structural weaknesses. Modern cipher design considers CPA resistance essential.
Chosen ciphertext attack (CCA)
The analyst can submit ciphertext for decryption and observe plaintext or error responses. Example: Implementation bugs in CBC mode where systems respond differently to valid vs. invalid padding, eventually revealing complete plaintext. Highlights why confidentiality alone is insufficient—need authenticated encryption (AEAD).
Cryptanalytic techniques
Differential cryptanalysis
Examines how differences in plaintext inputs propagate through a cipher to produce differences in ciphertext outputs. Seeks non-random behavior revealing information about internal structure or key bits. Typically requires chosen plaintext access. Modern ciphers are specifically designed to resist differential attacks.
Linear cryptanalysis
Attempts to find linear approximations (XOR equations) connecting plaintext bits, ciphertext bits, and key bits with probability significantly different from 50%. Works effectively with known plaintext. Complements differential techniques. Both approaches rarely recover complete keys but can reduce brute force search space.
Side-channel analysis
Exploits physical information leaked during cryptographic operations: timing, power consumption, electromagnetic emissions, cache access patterns, fault injection. Attacks the implementation rather than the mathematical algorithm.
Examples:
- Cache timing: AES table lookups leak key information via memory access patterns.
- Power analysis: Different operations consume different amounts of electricity, revealing processed data.
- Fault injection: Introducing errors during computation to extract secrets from faulty outputs.
Defenses: Constant-time implementations, masking/blinding, hardware countermeasures, avoiding secret-indexed table lookups.
Key points
- Good modern ciphers resist all standard attack models (COA, KPA, CPA, CCA) and major cryptanalytic techniques (differential, linear).
- Implementation matters: side-channel attacks often succeed where mathematical attacks fail.
- Cryptanalysis serves defense: understanding attack techniques helps build stronger systems.
- Open scrutiny is essential: algorithms gain trust through surviving public analysis, not through secrecy.
Public Key Cryptography and Integrity
Key Distribution Problem
The key distribution problem is the challenge of sharing a secret key securely over an insecure channel. Public key cryptography was developed to solve this.
One-Way Functions
A one-way function is easy to compute in one direction but computationally infeasible to reverse.
- Examples: computing \(g^x \bmod p\) is easy, but finding \(x\) (the discrete logarithm) is hard. The middle-square method, an early pseudorandom number generator, also illustrates the idea of one-way computation.
Trapdoor Function
A trapdoor function is a one-way function that becomes easy to invert if you know a secret value (the trapdoor). Public key cryptography relies on building trapdoor functions.
Public Key Cryptography
Public key cryptography uses a pair of keys:
- The public key is shared and used for encryption or signature verification.
- The private key is kept secret and used for decryption or creating signatures.
RSA
RSA is based on the difficulty of factoring large numbers that are the product of two primes.
- Encryption and decryption use modular exponentiation: \(m^e \bmod n\).
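A toy sketch with the classic textbook primes (numbers far too small to be secure; real keys use primes of 1024+ bits each):

```python
# RSA key setup and raw modular exponentiation, m^e mod n.
p, q = 61, 53
n = p * q                      # 3233, the public modulus
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # 2753, private exponent (e*d = 1 mod phi)

m = 65                         # message encoded as a number < n
c = pow(m, e, n)               # encrypt: m^e mod n
print(pow(c, d, n))            # decrypt: c^d mod n -> 65
```

Note this is "raw" RSA, exactly what the Security bullet below warns about: real systems add randomized padding before exponentiating.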
Elliptic Curve Cryptography (ECC)
ECC is based on the difficulty of solving discrete logarithms on elliptic curves.
- ECC achieves the same security as RSA with shorter key sizes.
Why not use public key cryptography for everything?
Public key cryptography is used for key exchange and signatures, not bulk encryption because of:
- Performance: far slower than symmetric ciphers.
- Ciphertext expansion: ciphertexts are larger than plaintexts.
- Security: raw RSA/ECC is unsafe: mathematical relations from the plaintext are preserved.
Cryptographic Hash Functions
A cryptographic hash function maps input of arbitrary length to a fixed-size output. Its key properties are:
- Fixed length: The output size does not depend on input size.
- Preimage resistance: It is hard to find a message that hashes to a given value.
- Second preimage resistance: It is hard to find another message with the same hash.
- Collision resistance: It is hard to find any two messages that hash to the same value.
- Avalanche effect: Small input changes produce large, unpredictable changes in output.
Common hash functions: SHA-1 (obsolete), MD5 (obsolete), SHA-2 (SHA-256, SHA-512), bcrypt (designed to be slow).
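A minimal sketch of the fixed-length and avalanche properties using the standard library:

```python
# SHA-256 output is always 256 bits, and a one-character input change flips
# roughly half of the output bits (the avalanche effect).
import hashlib

h1 = hashlib.sha256(b"pay alice $100").digest()
h2 = hashlib.sha256(b"pay alice $101").digest()

diff_bits = sum(bin(a ^ b).count("1") for a, b in zip(h1, h2))
print(len(h1) * 8, diff_bits)   # 256 total bits; ~128 typically differ
```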
Hash Collisions
A hash collision occurs when two different inputs produce the same hash output. Because hash functions compress large inputs into a fixed-size output, collisions are guaranteed in theory. The question is how likely they are in practice.
- Pigeonhole principle: Since there are more possible inputs than outputs, some inputs must map to the same output. This is unavoidable for any fixed-length hash.
- Birthday paradox: The probability of finding a collision grows faster than intuition suggests. Instead of needing about N trials to find a collision in an N-element space, it takes only about \(\sqrt{N}\).
Cryptographic hashes use large output sizes, making collisions astronomically unlikely. With SHA-256 (256-bit output), a collision is expected after about \(2^{128}\) attempts. \(2^{128}\) is about \(3.4 \times 10^{38}\), an unimaginably large value. To put this in perspective: the odds of winning the Powerball jackpot once are about 1 in \(3 \times 10^{8}\). To match the improbability of finding a SHA-256 collision, you’d have to win the Powerball about five times in a row.
This shows why collisions are considered practically impossible for modern cryptographic hashes, unless the function itself is broken (as with MD5 and SHA-1, where researchers found methods to create collisions).
Message Authentication Code (MAC)
A Message Authentication Code (MAC) is a keyed hash: a function that takes both a message and a secret key as input. Only someone with the secret key can produce or verify the MAC.
This ensures integrity (the message has not been modified) and origin assurance (the message was generated by someone who knows the shared key). Unlike digital signatures, a MAC does not provide proof to outsiders of who created the message, since both sender and receiver share the same key.
Example: Alice and Bob share a secret key. Alice sends Bob the message PAY $100 and its MAC. Bob recomputes the MAC on the message using the shared key. If his result matches Alice’s, he knows the message was not altered and must have come from someone with the key.
Constructions of MACs:
- HMAC: uses a cryptographic hash with the key mixed into the input.
- CBC-MAC: uses a block cipher in Cipher Block Chaining mode to produce a keyed hash.
- AEAD (Authenticated Encryption with Associated Data): combines encryption and integrity in one step, generating ciphertext and an authentication tag (similar to a MAC). GCM and Poly1305 are common AEAD integrity components.
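A minimal sketch of the Alice-and-Bob example above using the standard library's HMAC:

```python
# Compute and verify an HMAC tag with the shared key. compare_digest does a
# constant-time comparison to avoid timing side channels.
import hmac, hashlib

key = b"shared-secret-key"
msg = b"PAY $100"

tag = hmac.new(key, msg, hashlib.sha256).digest()       # Alice attaches this

expected = hmac.new(key, msg, hashlib.sha256).digest()  # Bob recomputes
print(hmac.compare_digest(tag, expected))               # True -> intact and keyed
```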
Digital Signature
A digital signature is the public-key equivalent of a MAC, but it uses asymmetric keys instead of a shared secret.
Concept:
- Signing: the sender creates a hash of the message and encrypts that hash with their private key.
- Verification: the receiver decrypts the signature with the sender’s public key and compares it with a newly computed hash of the message. If they match, the message has not been altered and must have come from the claimed sender.
- Unlike MACs, digital signatures provide authentication and non-repudiation: only the holder of the private key could have generated the signature, and the sender cannot later deny it.
Example: Alice sends Bob the message TRANSFER $500 along with its signature. Bob verifies the signature using Alice’s public key. If it checks out, Bob knows the message is intact and was created by Alice.
Digital signature algorithms: In practice, specialized algorithms exist just for signing. Examples include DSA, ECDSA, and EdDSA. These algorithms are applied to the hash of the message rather than the message itself for efficiency.
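A minimal sketch using the pyca/cryptography library (an assumed dependency) with EdDSA over Ed25519, one of the algorithms named above:

```python
# Ed25519 signing: the private key signs; anyone with the public key can
# verify; verification raises InvalidSignature on any modification.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

private_key = Ed25519PrivateKey.generate()
public_key  = private_key.public_key()

message   = b"TRANSFER $500"
signature = private_key.sign(message)

try:
    public_key.verify(signature, message)   # no exception -> valid
    print("signature valid")
except InvalidSignature:
    print("signature invalid")
```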
MACs vs. Signatures
- MACs: symmetric, efficient, no non-repudiation.
- Signatures: asymmetric, slower, provide authentication and non-repudiation.
X.509 Certificates
An X.509 certificate (often just called a digital certificate) binds a public key to the identity of an individual, organization, or website.
A certificate contains information such as the subject’s name, the subject’s public key, the issuer’s identity, and a validity period.
A Certificate Authority (CA) digitally signs the certificate using its own private key. Anyone who trusts the CA can verify the certificate by checking that signature with the CA’s public key. Certificates provide assurance that a public key truly belongs to the entity it claims.
Example: When you visit https://example.com, your browser receives the site’s X.509 certificate. It verifies that the certificate was signed by a trusted CA. If valid, your browser knows the server’s public key really belongs to that website.
Certificates are essential in TLS, where they allow clients to verify the identity of servers before exchanging keys.
Diffie-Hellman Key Exchange (DHKE)
DHKE allows two parties to establish a shared secret over an insecure channel.
Each party uses a private key and a corresponding public value to derive the shared secret. The algorithm is based on the one-way function \(a^x \bmod p\).
Elliptic Curve Diffie-Hellman (ECDH) uses elliptic curves but behaves the same way.
While general-purpose public key algorithms such as RSA and ECC can perform this function, DHKE is usually preferred because generating its public and private values is much faster and the algorithm efficiently produces a common key.
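A toy sketch of the DH arithmetic with tiny numbers (real deployments use 2048-bit-plus moduli or elliptic curves):

```python
# Both sides compute the same shared secret without ever transmitting it.
import secrets

p, g = 23, 5                      # toy public parameters

a = secrets.randbelow(p - 2) + 1  # Alice's private exponent
b = secrets.randbelow(p - 2) + 1  # Bob's private exponent

A = pow(g, a, p)                  # Alice sends g^a mod p
B = pow(g, b, p)                  # Bob sends g^b mod p

print(pow(B, a, p) == pow(A, b, p))   # True: both hold g^(ab) mod p
```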
Hybrid Cryptosystem
A hybrid cryptosystem combines public key cryptography with symmetric key cryptography. Public key methods are slow, so they are typically used only to exchange a session key. Once the session key is established, both sides switch to symmetric encryption (like AES) for the actual communication, which is much faster.
A session key is a temporary symmetric key that secures communication for the duration of one session. When the session ends, the session key is discarded. If an attacker records the traffic, they cannot decrypt it later without the specific session key that was used at the time.
Forward secrecy means that even if a long-term private key is stolen in the future, past communications remain secure. This works because each session uses its own independent session key. The long-term key is only used to authenticate or establish the exchange, not to encrypt bulk data.
An ephemeral key is a temporary public/private key pair generated for a single session (or even a single exchange). Ephemeral keys are often used in protocols like Ephemeral Diffie-Hellman (DHE). The shared secret derived from the ephemeral key exchange is used to create the session key.
Ephemeral keys and session keys are closely related: the ephemeral key exchange produces the session key. In some contexts, the term “ephemeral key” is used interchangeably with “session key,” but more precisely:
- Ephemeral key: the temporary public/private key pair used during the key exchange.
- Session key: the symmetric key that results from the exchange and is used to encrypt the actual communication.
Example: In TLS with Ephemeral Diffie-Hellman, the server and client each generate ephemeral keys for the exchange. From those, they derive a shared session key. The ephemeral keys are discarded after the session ends, ensuring forward secrecy.
Quantum Attacks
Quantum attacks use quantum algorithms that can break many current public key systems.
- RSA is vulnerable because Shor’s algorithm can efficiently factor large integers.
- Diffie-Hellman (DHKE) and Elliptic Curve Cryptography (ECC) are vulnerable because Shor’s algorithm can efficiently solve discrete logarithms, which are the basis of these systems. This includes protocols like ECDH and signature schemes like DSA, ECDSA, and EdDSA.
- In short, any public key system based on integer factorization or discrete logarithms is broken by a large quantum computer.
- Symmetric cryptography (AES, ChaCha20) and hash-based functions (SHA-2, SHA-3, HMAC) are not broken, but Grover’s algorithm can reduce their effective strength by half. Doubling key or hash sizes restores security.
New post-quantum algorithms have been recently developed (and continue to be developed) that are resistant to these quantum attacks.
Transport Layer Security (TLS)
TLS is the protocol that secures communication on the web. It combines several cryptographic tools:
- Diffie-Hellman key exchange establishes a shared secret that becomes the basis for session keys.
- X.509 certificates authenticate the server (and sometimes the client) by binding identities to public keys.
- Public key cryptography verifies the certificates and authenticates the Diffie-Hellman exchange (the server signs its ephemeral D-H value with its private key, and the client checks it).
- HMAC is used in the handshake to authenticate transcript messages and, through HKDF, to derive keys from the shared secret.
- HKDF (HMAC-based Key Derivation Function) expands the Diffie-Hellman shared secret into multiple secure session keys (a different key for traffic in each direction).
- Symmetric cryptography with AEAD (such as AES-GCM or ChaCha20-Poly1305) protects bulk data, providing both confidentiality and integrity for application traffic.
Authentication
Core Concepts
Identification, Authentication, Authorization
Identification is claiming an identity (such as giving a username). Authentication is proving that claim (such as showing a password or key). Authorization comes afterward and determines what actions the authenticated party may perform.
Pre-shared keys and Session keys
A pre-shared key is a long-term secret shared in advance between two parties and often used with a trusted server. A session key is temporary and unique to one session, which reduces exposure if the key is compromised.
Mutual authentication
Protocols often require both sides to prove they know the shared secret, not just one. This ensures that neither side is tricked by an impostor.
Trusted third party (Trent)
Some protocols rely on a trusted server that shares long-term keys with users and helps them establish session keys with each other.
Nonces, Timestamps, and Session identifiers
These are ways to prove freshness:
- Nonces are random values used once.
- Timestamps prove that a message is recent.
- Session identifiers tie all messages to a specific run of a protocol.
Replay attack
An attacker may try to reuse an old ticket or session key. Without freshness checks, the receiver cannot tell the difference and may accept stale credentials.
Symmetric Key Protocols
These use a trusted third party (Trent) to distribute or relay session keys. Alice asks Trent for a key to talk to Bob, Trent provides one (encrypted with Alice's secret key) along with a ticket for Bob. The ticket contains the same session key encrypted with Bob's key. It's meaningless to Alice but she can send it to Bob, who can prove that he was able to decrypt it.
Needham–Schroeder
Adds nonces to confirm that a session key is fresh. Alice asks Trent for a key to talk to Bob and includes a random number (a nonce) in the request. Trent provides the session key along with a ticket for Bob. Alice knows the response is fresh when she decrypts the message because it contains the same nonce. Weakness: if an old session key is exposed, an attacker can replay the corresponding ticket and impersonate Alice.
Denning–Sacco
Fixes the replay weakness (an attacker reusing an old, compromised session key) by putting timestamps in tickets so Bob can reject replays. Bob checks that the timestamp is recent before accepting. This blocks replays but requires synchronized clocks.
Otway–Rees
Uses a session identifier carried in every message. Nonces from Alice and Bob, plus the identifier, prove that the session is unique and fresh. No synchronized clocks are required.
Kerberos
Kerberos is a practical system that puts the Denning–Sacco idea into operation: it uses tickets combined with timestamps to prevent replay attacks. Instead of sending passwords across the network, the user authenticates once with a password-derived key to an Authentication Server (AS). The AS returns a ticket and a session key for talking to the Ticket Granting Server (TGS).
With the TGS ticket, the user can request service tickets for any server on the network. Each service ticket contains a session key for the client and the server to use. To prove freshness, the client also sends a timestamp T encrypted with the session key. The server checks that the timestamp is recent and replies with T+1, proving it could decrypt the message.
This structure allows single sign-on: after authenticating once, the user can access many services without re-entering a password. Kerberos thus combines the principles of Denning–Sacco with a scalable ticketing infrastructure suitable for enterprise networks.
Properties:
- Tickets contain session keys.
- Passwords never travel on the network.
- Timestamps prevent replay.
- Widely used in enterprise single sign-on.
Public Key Authentication
Public key cryptography removes the need for Trent. If Alice knows Bob’s public key, she can create a session key and encrypt it for Bob. Certificates bind public keys to identities, solving the problem of trust. In TLS, which we studied earlier, the server’s certificate authenticates its key. An ephemeral exchange (e.g., Diffie-Hellman) creates fresh session keys that ensure forward secrecy: past sessions remain secure even if a long-term private key is later stolen.
Password-Based Protocols
PAP (Password Authentication Protocol)
The client sends the password directly to the server. This is insecure because the password can be intercepted and reused.
CHAP (Challenge–Handshake Authentication Protocol)
The server sends the client a random value called a nonce. The client combines this nonce with its password and returns a hash of the result. The server performs the same calculation to verify it matches. The password itself never crosses the network, and replay is prevented because each challenge is different. CHAP still depends on the password being strong enough to resist guessing. Unlike challenge-based one-time passwords (covered later), CHAP relies on a static password reused across logins, not a device that generates one-time codes.
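A minimal sketch of the CHAP idea (illustration only; real CHAP specifies its own hash construction, and the password here is a hypothetical example):

```python
# Challenge-handshake: the server sends a nonce, the client returns a hash
# over the nonce and the shared password, and the server recomputes it.
# The password itself never crosses the network.
import os, hashlib

password = b"hunter2"                  # shared static secret

nonce = os.urandom(16)                 # server -> client challenge
response = hashlib.sha256(nonce + password).hexdigest()   # client -> server

expected = hashlib.sha256(nonce + password).hexdigest()   # server verifies
print(response == expected)  # True; a replayed response fails on a new nonce
```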
How Passwords Are Stored
Systems do not store passwords in plaintext. Instead, they store hashes: one-way transformations of the password. At login, the system hashes the candidate password and compares it to the stored value. This prevents an immediate leak of all passwords if a file is stolen, but the hashes are still vulnerable to offline guessing.
Weaknesses of Passwords
Even though systems store hashes rather than plaintext passwords, a stolen hash file still exposes users to the guessing attacks described below.
Online vs. offline guessing
Online guessing means trying passwords directly against a login prompt. Defenses include rate-limiting and account lockouts. Offline guessing happens if an attacker steals the password file; they can test guesses at unlimited speed with no lockouts, which is much more dangerous.
Dictionary attack
The attacker tries a large list of likely passwords. Online, this means repeated login attempts. Offline, it means hashing guesses and comparing against stored hashes.
Rainbow table attack
The attacker precomputes tables of password→hash pairs for a given hash function. A stolen password file can be cracked quickly with a lookup. Rainbow tables only work if hashes are unsalted.
Credential stuffing
Attackers take username/password pairs stolen from one site and try them against other services, hoping users have reused the same password.
Password spraying
Instead of many guesses against one account, attackers try one or a few common passwords across many accounts, avoiding lockouts.
Mitigations
Defenses focus on making stolen data less useful and slowing attackers.
Salts
A random value stored with each password hash. Two users with the same password will have different hashes if their salts differ. Salts make rainbow tables impractical.
Slow hashing functions
Deliberately expensive functions (bcrypt, scrypt, yescrypt) make each hash slower to compute, which hinders brute-force guessing but is barely noticeable for a user login.
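A minimal sketch of salted, deliberately slow password hashing using the standard library's PBKDF2 (bcrypt, scrypt, and yescrypt follow the same pattern: a random per-user salt plus an expensive derivation, with the salt stored alongside the hash; the iteration count here is an illustrative choice):

```python
# Salted slow hashing: same password + different salt -> different hash,
# and each guess costs the attacker 600,000 HMAC iterations.
import os, hashlib, hmac

def hash_password(password, salt=None):
    salt = salt or os.urandom(16)               # random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password, salt, stored):
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("password123", salt, stored))                   # False
```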
Transport protection
Passwords should always travel over encrypted channels (TLS) to prevent interception.
Multi-factor authentication
A second factor (OTP, push, passkey) prevents a stolen password from being enough on its own.
Password policies
Block weak or previously breached passwords, enforce rate limits, and monitor for unusual login activity.
One-Time Passwords (OTPs)
Static passwords can be replayed. One-time password systems generate a new password for each login. The goal is to never reuse a password.
There are four main forms of one-time passwords:
- Sequence-based: derived from repeatedly applying a one-way function.
- Challenge-based: derived from a random challenge from the server and a shared secret.
- Counter-based: derived from a shared counter and a secret.
- Time-based: derived from the current time slice and a secret.
S/Key (sequence-based)
S/Key creates a sequence of values by starting from a random seed and applying a one-way function to each successive value. The server stores only the last value in the sequence. At login, the user provides the previous value, which the server can verify by applying the function once. Each login consumes one value, moving backward through the sequence. Even if an attacker sees a value, they cannot work out earlier ones because the function cannot be reversed.
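A minimal sketch of the S/Key idea (SHA-256 stands in for the hash; the seed is a hypothetical example):

```python
# Hash-chain OTP: the server stores only the last chain value; each login
# submits the previous value, verified with a single hash application.
import hashlib

def H(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def make_chain(seed: bytes, n: int):
    chain = [seed]
    for _ in range(n):
        chain.append(H(chain[-1]))
    return chain                      # chain[n] goes to the server

chain = make_chain(b"random-seed", 100)
server_stored = chain[100]

otp = chain[99]                       # user submits the previous value
assert H(otp) == server_stored        # server verifies with one hash
server_stored = otp                   # stored for the next login
print("login accepted")
```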
Challenge-based OTP
The server issues a fresh challenge, and the user’s token or app computes a one-time response from the challenge and a secret. Replay is useless because each challenge is unique. This differs from CHAP: in CHAP the secret is a static password reused across logins, while in challenge-based OTPs the secret is stored in a device that proves possession.
HOTP (counter-based)
The server and client share a secret key and a counter. Each login uses the next counter value to generate a code with HMAC. Both sides must stay roughly in sync.
TOTP (time-based)
Like HOTP, but the counter is the current time slice (commonly 30 seconds). Codes expire quickly, making replays useless. Widely used in mobile authenticator apps (like Google Authenticator).
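A compact sketch of both constructions (the key below is the RFC 4226 test key, so the first HOTP code matches the published test vector):

```python
# HOTP (RFC 4226): HMAC over a counter, dynamically truncated to a short
# decimal code. TOTP just uses the current time slice as the counter.
import hmac, hashlib, struct, time

def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                          # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(key: bytes, step: int = 30) -> str:
    return hotp(key, int(time.time()) // step)       # counter = time slice

secret = b"12345678901234567890"     # RFC 4226 test key
print(hotp(secret, 0))               # '755224' per the RFC test vectors
print(totp(secret))                  # changes every 30 seconds
```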
Multi-Factor Authentication (MFA)
MFA requires factors from different categories: something you know (password), something you have (token or phone), and something you are (biometrics).
One-time passwords (OTPs)
Codes generated by a device or app that prove you have a specific token or phone. They are short-lived and cannot be reused, making them a common possession factor.
Push notifications
Send a prompt to the user’s phone but are vulnerable to MFA fatigue, where attackers flood the user with requests hoping for one accidental approval.
Number matching
Improves push MFA by requiring the user to type in a number shown on the login screen, linking the approval to that specific login attempt.
Passkeys
MFA improves security but still relies on passwords as one factor, and phishing can sometimes bypass second factors. Passkeys eliminate passwords entirely. Each device generates a per-site key pair: the private key stays on the device, and the public key is stored with the service. At login, the device signs a challenge with the private key. Passkeys are unique per site, cannot be phished, and can be synchronized across devices.
Adversary-in-the-Middle Attacks
An adversary-in-the-middle sits between a client and a server, relaying and possibly altering traffic.
Defenses:
- Use TLS to encrypt and protect the channel.
- Verify the server with certificates to prevent impersonation.
- Use freshness checks (nonces, timestamps, identifiers) to block replay.
- Issue short-lived codes and tokens so intercepted data quickly expires.
- Train users to recognize and reject suspicious prompts, such as unexpected push notifications.
Biometric authentication
Pattern Recognition and Thresholds
Biometric systems are pattern recognizers, not equality checks. Each sample is slightly different from the stored template, so the system computes a similarity score. Whether access is granted depends on a threshold: too strict, and genuine users are rejected; too loose, and impostors may slip through.
FAR, FRR, and ROC
Two common error rates define biometric performance. The False Accept Rate (FAR) is the chance an impostor is accepted. The False Reject Rate (FRR) is the chance a genuine user is denied. These trade off against each other as thresholds change.
A Receiver Operating Characteristic (ROC) plot shows this trade-off, and the Equal Error Rate is where the two curves meet.
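A minimal sketch of the threshold trade-off on synthetic similarity scores (the numbers are invented for illustration, not real biometric data):

```python
# Sweep a decision threshold over impostor and genuine match scores to see
# the FAR/FRR trade-off that an ROC curve summarizes.
genuine  = [0.91, 0.84, 0.88, 0.95, 0.79, 0.86]   # same-person match scores
impostor = [0.42, 0.55, 0.61, 0.38, 0.70, 0.47]   # different-person scores

for threshold in (0.5, 0.65, 0.8):
    far = sum(s >= threshold for s in impostor) / len(impostor)  # accepted impostors
    frr = sum(s < threshold for s in genuine) / len(genuine)     # rejected genuines
    print(f"threshold={threshold:.2f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Raising the threshold drives FAR down and FRR up; the Equal Error Rate is the threshold where the two meet.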
Modalities and Features
Biometric traits may be physiological (fingerprints, iris, face) or behavioral (voice, typing rhythm, gait). Fingerprint systems look for minutiae such as ridge endings and bifurcations, while other modalities rely on different feature sets. Each trait balances robustness (how repeatable it is for the same person) against distinctiveness (how well it separates different people).
The Authentication Process
All biometric systems follow the same pipeline. During enrollment, the system captures high-quality samples to build a template. At login, the sensor records new data, and feature extraction turns it into a compact representation. Pattern matching compares this to the stored template, and the decision applies the threshold to accept or reject.
Challenges
Biometric systems face several practical challenges. Devices must be trusted, with secure data paths that cannot be tampered with.
Liveness detection is needed to block photos, prosthetics, or recordings.
Secure communication prevents injection of fake sensor data. Thresholds must be chosen carefully to balance FAR and FRR.
More fundamentally, biometrics lack compartmentalization: you cannot use a different fingerprint for each service. They do not offer true revocation once a trait is stolen.
Cooperative vs. Non-Cooperative Systems
In cooperative systems, the subject is willing and positioned correctly, as in unlocking a phone. Non-cooperative systems try to identify people at a distance or without consent, as in surveillance. These face greater challenges in data quality, error rates, and privacy.