Configuration
Operations
Hash Table Visualization
Hash Function: h(key) = key % table_size
Help
What is a Hash Table?
- A hash table stores key-value pairs using a hash function to compute array indices
- Hash Function: Converts keys to array indices: h(key) = key % table_size
- Ideal Performance: O(1) average time for insert, search, and delete
- Collision: When two keys hash to the same index
- Hash tables power dictionaries, sets, caches, and database indexing
Collision Resolution Strategies
- Chaining: Each slot contains a linked list of all items that hash to that index
- Pros: Simple, no limit on items, never needs resizing
- Cons: Uses extra memory for pointers, cache unfriendly
- Linear Probing: If slot is full, try next slot sequentially
- Pros: Good cache locality, simple to implement
- Cons: Primary clustering - long runs of filled slots
- Quadratic Probing: Use quadratic function for probe sequence (i²)
- Pros: Reduces primary clustering
- Cons: Can miss empty slots, requires careful table size selection
Load Factor and Performance
- Load Factor: α = n/m (items/table_size)
- Chaining: Average chain length = α, performance degrades linearly with α
- Open addressing: Should keep α < 0.7 for good performance
- When α gets too high, resize table (typically double size and rehash all items)
- Prime table sizes help distribute keys more uniformly
Hash Function Design
- Good hash functions: Distribute keys uniformly across table
- Simple modulo (h(k) = k % m) works for integers with prime m
- For strings: combine character values with prime multipliers
- Cryptographic hashes (MD5, SHA) are overkill for hash tables
- Trade-off: Computation speed vs distribution quality
Operations Explained
- Insert: Hash key, handle collision if needed, store key-value pair
- Search: Hash key, follow collision resolution strategy to find key
- Delete: Hash key, find and remove, mark slot as deleted (for probing)
- Deleted slots in probing must be marked (not just emptied) to maintain search chains