Sign in to save

Bookmark this page so you can find it later.

Sign in to save

Bookmark this page so you can find it later.

Hashing is a core computer science technique that turns data into a fixed-size value called a hash code or digest. Students use hashing to understand fast lookup in hash tables, secure data checking, and password storage. This cheat sheet helps connect the main ideas, formulas, and code patterns in one quick reference. It is useful for algorithms, data structures, cybersecurity, and programming practice.

Key Facts

  • A hash function maps an input key to an integer hash value using the rule hash = h(key).
  • A hash table stores a key at an array index often found by index = h(key) mod table_size.
  • The load factor of a hash table is alpha = number_of_items / table_size.
  • A collision happens when two different keys produce the same index, so the table must use chaining or open addressing.
  • With separate chaining, each table index stores a list or bucket of keys that hash to that same index.
  • With linear probing, a collision is handled by checking the next index using index = (h(key) + i) mod table_size.
  • Average-case search, insert, and delete in a well-sized hash table are O(1), but worst-case time can be O(n).
  • A cryptographic hash should be deterministic, fast to compute, one-way, collision resistant, and highly sensitive to small input changes.

Vocabulary

Hash Function
A procedure that converts an input key into a numeric value used for indexing, comparison, or verification.
Hash Table
A data structure that stores key-value pairs in an array using a hash function to choose where each key belongs.
Collision
A collision occurs when two different keys are assigned to the same hash table index or hash value.
Load Factor
The load factor is the ratio of stored items to table slots, calculated as alpha = number_of_items / table_size.
Chaining
Chaining is a collision-handling method where each table slot stores a list of all entries assigned to that index.
Salt
A salt is a random value added to a password before hashing to make identical passwords produce different stored hashes.

Common Mistakes to Avoid

  • Using hash = key mod table_size for every situation, because simple modulo hashing can create patterns and many collisions when keys are not well distributed.
  • Forgetting to handle collisions, because different keys can produce the same index even when the hash function is good.
  • Letting the load factor grow too high, because a crowded hash table causes more collisions and makes operations slower.
  • Confusing encryption with hashing, because encryption is meant to be reversible with a key while hashing is designed to be one-way.
  • Storing plain password hashes without salts, because attackers can compare hashes against precomputed tables and find common passwords more easily.

Practice Questions

  1. 1 A hash table has table_size = 10 and uses index = key mod 10. What index stores key 47?
  2. 2 A hash table stores 36 items in 60 slots. Calculate the load factor alpha = number_of_items / table_size.
  3. 3 Using linear probing with table_size = 7, key A hashes to index 3 but index 3 is full and index 4 is full. What index is checked next?
  4. 4 Explain why a cryptographic hash function should produce a very different digest when only one character of the input changes.