What is a Cryptographic Hash Function?
A cryptographic hash function takes an input (message) and returns a fixed-size string of bytes, typically represented as a hexadecimal number. The output is called a hash, digest, or checksum.
Key properties of cryptographic hashes: deterministic (same input always produces same output), fast to compute, infeasible to reverse, small changes produce vastly different hashes (avalanche effect), and collision-resistant.
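The determinism and avalanche properties can be observed directly with Python's standard hashlib module. The following is a minimal sketch for illustration; the helper names sha256_hex and differing_bits and the sample strings are made up for this example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = sha256_hex("hello world")
again = sha256_hex("hello world")    # same input as above
changed = sha256_hex("hello worle")  # only the last character differs

print(first == again)                  # deterministic: prints True
print(differing_bits(first, changed))  # avalanche: usually near half of the 256 bits
```

A one-character change flips roughly half of the output bits, which is why a hash gives no hint about how close a guess was to the original input.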
Hash Algorithm Comparison
MD5
• Output: 128 bits (32 hex characters)
• Status: BROKEN - collisions found in seconds
• Use for: Non-security checksums only
SHA-1
• Output: 160 bits (40 hex characters)
• Status: BROKEN - collision attack demonstrated
• Use for: Legacy systems only, being phased out
SHA-256
• Output: 256 bits (64 hex characters)
• Status: SECURE - recommended for most uses
• Use for: Digital signatures, certificates, Bitcoin
SHA-512
• Output: 512 bits (128 hex characters)
• Status: SECURE - highest security in SHA-2 family
• Use for: When maximum security is needed
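The output sizes listed above can be checked with a short script. This sketch assumes a Python build whose hashlib exposes all four algorithms (some hardened builds disable MD5); the sample input is arbitrary.

```python
import hashlib

# Algorithm names as accepted by hashlib.new(); the input is arbitrary sample data.
ALGORITHMS = ["md5", "sha1", "sha256", "sha512"]
data = b"The quick brown fox jumps over the lazy dog"

sizes = {}
for name in ALGORITHMS:
    digest = hashlib.new(name, data)
    # digest_size is in bytes; each byte is written as two hex characters.
    sizes[name] = (digest.digest_size * 8, len(digest.hexdigest()))
    print(f"{name:8} {sizes[name][0]:4} bits  {sizes[name][1]:4} hex chars")
```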
Common Uses of Hash Functions
- Password storage: Store hashes instead of plain passwords (with salt!).
- File integrity: Verify downloads haven't been corrupted or tampered with.
- Digital signatures: Sign the hash of a document rather than the document itself.
- Blockchain: Bitcoin uses SHA-256 for mining and transaction verification.
- Data deduplication: Identify duplicate files by comparing hashes.
- Cache keys: Create unique identifiers for cached content.
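As an illustration of the file integrity use case, the sketch below hashes a file in chunks and compares the result with a published digest. The function names file_sha256 and matches_published and the 64 KiB chunk size are choices made for this example, not part of any standard.

```python
import hashlib
import hmac


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in fixed-size chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(file_sha256(path), expected)
```

The same file_sha256 helper could serve for deduplication or cache keys: files with equal digests can be treated as identical content. Note that a matching digest only proves integrity if the published value itself came from a trusted source.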
Why MD5 and SHA-1 are Broken
MD5: In 2004, researchers demonstrated practical collision attacks. By 2008, researchers had used MD5 collisions to create a rogue CA certificate that browsers would trust. Today, MD5 collisions can be generated in seconds on a laptop.
SHA-1: In 2017, Google and CWI Amsterdam published "SHAttered," demonstrating the first practical SHA-1 collision. While more expensive than MD5 attacks, SHA-1 is no longer considered secure for cryptographic purposes.
Hashing vs Encryption
- Hashing: One-way function. Cannot recover original data. Used for verification.
- Encryption: Two-way function. Can decrypt with correct key. Used for confidentiality.
- Hash output: Fixed length regardless of input size.
- Encryption output: Length depends on input size.
Password Hashing Best Practices
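Never use plain SHA-256 for passwords. General-purpose hashes are designed to be fast, which lets an attacker test enormous numbers of password guesses per second. Instead, use specialized password hashing functions, which are deliberately slow: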
- bcrypt: Built-in salt, adjustable work factor, widely supported.
- scrypt: Memory-hard, resistant to GPU attacks.
- Argon2: Winner of Password Hashing Competition, recommended for new projects.
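Of the three functions above, scrypt is the one available in Python's standard library (hashlib.scrypt, which requires a Python build linked against OpenSSL 1.1 or later), so it is used for the sketch below. The function names and the cost parameters are assumptions chosen for illustration, not tuned recommendations; bcrypt and Argon2 need third-party packages.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters, not a tuned recommendation: n is the CPU/memory
# cost, r the block size, p the parallelism. These use roughly 16 MiB per hash.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the salt is fresh for every call."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key


def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, stored_key)
```

A login check would load the stored salt and key for the user and call verify_password with the submitted password. The cost parameters should be stored alongside the hash so they can be raised later.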
Frequently Asked Questions
Can two different inputs produce the same hash?
Theoretically yes (called a collision), since hashes are fixed-length but inputs are unlimited. Secure algorithms make finding collisions computationally infeasible. MD5 and SHA-1 have known practical collisions; SHA-256 does not.
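To see why output length matters, the sketch below searches for a collision on only the first 16 bits of a SHA-256 digest. This is a toy demonstration: the function name and the counter-based inputs are made up for the example, and it says nothing about the strength of full SHA-256.

```python
import hashlib


def find_truncated_collision(prefix_bytes: int = 2) -> tuple[bytes, bytes]:
    """Find two different inputs whose SHA-256 digests share the first prefix_bytes bytes."""
    seen = {}  # truncated digest -> first input that produced it
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        prefix = hashlib.sha256(message).digest()[:prefix_bytes]
        if prefix in seen:
            return seen[prefix], message
        seen[prefix] = message
        counter += 1


first, second = find_truncated_collision()
print(first, second)  # two different inputs with the same 16-bit digest prefix
```

With a 16-bit output a collision typically appears after a few hundred attempts. The expected effort grows as about 2^(n/2) for an n-bit hash (the birthday bound), so a generic collision search against full SHA-256 would need on the order of 2^128 attempts.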
Why can't I reverse a hash to get the original text?
Hashes are one-way functions that lose information. Many different inputs can produce the same hash (collision), so there's no unique "original." The math is designed to make reversal computationally infeasible.
What is a "salt" in password hashing?
A salt is random data added to a password before hashing. It ensures identical passwords produce different hashes, defeating rainbow table attacks. Each user should have a unique salt stored alongside their hash.
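The effect of a salt can be shown in a few lines. The sketch below uses hashlib.scrypt from the standard library with illustrative parameters; the user names and the derive helper are hypothetical.

```python
import hashlib
import os


def derive(password: bytes, salt: bytes) -> bytes:
    """Derive a 32-byte key with scrypt; the cost parameters are illustrative only."""
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)


password = b"correct horse battery staple"  # two users pick the same password

salt_alice = os.urandom(16)  # each user gets an independent random salt
salt_bob = os.urandom(16)

hash_alice = derive(password, salt_alice)
hash_bob = derive(password, salt_bob)

print(hash_alice == hash_bob)                      # False: different salts
print(derive(password, salt_alice) == hash_alice)  # True: same salt reproduces the hash
```

Because the stored values differ, an attacker cannot tell that the two users share a password, and a table precomputed without the salts is useless against either entry.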
How long does it take to crack a hash?
It depends on the algorithm, the password's complexity, and the attacker's hardware. A weak password like "123456" can be found almost instantly with a dictionary attack or, if the hash is unsalted, a rainbow table. A long, randomly generated password hashed with SHA-256 would take longer than the age of the universe to brute force.
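A rough back-of-the-envelope estimate makes the difference concrete. The guess rate below (10 billion hashes per second) is an assumption picked purely for illustration; real rates vary widely with the algorithm and hardware.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
GUESSES_PER_SECOND = 1e10  # assumed attacker speed, purely illustrative


def years_to_exhaust(alphabet_size: int, length: int,
                     rate: float = GUESSES_PER_SECOND) -> float:
    """Years needed to try every password of the given length over the alphabet."""
    keyspace = alphabet_size ** length
    return keyspace / rate / SECONDS_PER_YEAR


six_digits = years_to_exhaust(10, 6)      # e.g. "123456": far less than one second
random_16 = years_to_exhaust(94, 16)      # 16 random printable ASCII characters

print(f"{six_digits * SECONDS_PER_YEAR:.6f} seconds")
print(f"{random_16:.2e} years")           # on the order of 10^14 years
```

Under these assumptions the six-digit password falls in a fraction of a millisecond, while the 16-character random password outlasts the age of the universe (about 1.4 × 10^10 years) by several orders of magnitude. An average attack succeeds after searching about half the keyspace, which does not change the conclusion.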
Is SHA-256 the same as SHA-2?
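No. SHA-2 is a family of algorithms including SHA-224, SHA-256, SHA-384, and SHA-512, and SHA-256 is its most commonly used member. They all share the same underlying design but differ in output length; SHA-384 and SHA-512 also operate on 64-bit words, while SHA-224 and SHA-256 use 32-bit words.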
Should I use SHA-512 instead of SHA-256?
For most applications, SHA-256 is sufficient and often faster on 32-bit systems. SHA-512 can be faster on 64-bit systems and offers a larger security margin. Both are currently considered secure.
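Relative speed depends on the CPU and on how the underlying library was built, so it is worth measuring on the target machine. The sketch below is a rough timing comparison using the standard timeit module; the 1 MiB buffer and the repeat count are arbitrary choices, and the results are not a rigorous benchmark.

```python
import hashlib
import timeit

data = b"\x00" * (1 << 20)  # 1 MiB of sample input
REPEATS = 20

# Time REPEATS full hashes of the buffer for each algorithm.
timings = {
    name: timeit.timeit(lambda n=name: hashlib.new(n, data).digest(), number=REPEATS)
    for name in ("sha256", "sha512")
}

for name, seconds in timings.items():
    print(f"{name}: {len(data) * REPEATS / seconds / 1e6:.0f} MB/s")
```

CPUs with dedicated SHA-256 instructions can reverse the usual 64-bit advantage of SHA-512, which is another reason to measure rather than assume.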
