Properties of a Good Hash Function
A hash function is not just any formula. It must satisfy certain properties.
1 Uniform Distribution
The hash function should distribute keys evenly across the table. If most keys go to only 2 or 3 indices, then performance becomes poor.

Our goal is to spread keys like salt on food, not like a pile of sugar in one corner. Uniform distribution reduces collisions and improves performance.
.
2 Fast Computation
Hashing is used to make search fast. If the hash function itself takes too much time, then what is the point?
A good hash function must be
- Simple
- Quick to compute
- Efficient in real systems
Operations like modulo, multiplication is commonly used because they are fast.
.
A hash function must always give the same output for the same key.
If today h(52) = 2 and tomorrow h(52) = 7, then the system will break. Because when we insert the key 52, we store it at index 2. Later, when we try to search for 52, we again apply the hash function. If it now gives 7, the system will go to index 7 and will not find the key. It will wrongly conclude that the key does not exist. So, deterministic behaviour is mandatory.
.
Types of Hash Functions
Let us now study common hash function techniques used.
1. Division Method
Formula is h(k) = k mod m
This is the most common and simple method.
If m = 10 and k = 52
52 mod 10 = 2, this is stored at index 2.
If m is badly chosen (like power of 2 in some patterns), collisions increase. Modulo operation mainly depends on the lower bits of the number in binary. If the keys have similar lower bits, they will collide again and again.
2 mod 2 = 0
4 mod 2 = 0
6 mod 2 = 0
8 mod 2 = 0
3 mod 2 = 1
5 mod 2 = 1
7 mod 2 = 1
9 mod 2 = 1
Even numbers and odd numbers pile up and creates collision every time. This is pattern based collision. That is why prime numbers are preferred. Prime numbers break these patterns.
.
Choosing Table Size
This is where many students make mistakes.
Why Prime Numbers?
If we use h(k) = k mod m and m is a prime number, distribution improves.
If m = 10 and 18 mod 10 = 8. Also, 58 mod 10 = 8 then it will make a Collision.
If m = 7 and 52 mod 7 = 3. Also, 32 mod 7 = 4 then it will be no collision. Prime numbers reduce clustering patterns.

.
2. Multiplication Method
Formula is h(k) = floor(m × (kA mod 1)) where A is a constant between 0 and 1.
This method reduces dependency on table size and sometimes gives better distribution. It is slightly more complex but useful in practice.
.
3. Mid-square Method
Steps –
- Square the key
- Extract middle digits
- Use them as index
Example
k = 44
44² = 1936
Middle digits → 93
So, index becomes 93 (depending on table size). This method works well when keys have similar patterns.
.
4. Folding Method
Divide the key into parts and add them.
k = 123456
Split – 123 and 456
Add = 123 + 456 = 579
Then take mod with table size. It is useful for large numeric keys like phone numbers.
.
5. Digit Extraction Method
Select specific digits from the key.
k = 987654
Take 2nd and 4th digits → 8 and 6 so the Index = 86
This method works only if selected digits vary uniformly.
.
Load Factor
Load Factor = n / m
Where n = number of elements and m = table size

Load factor tells how full the table is. If load factor increases –
- Collision increases
- Performance degrades
In ideal hashing, Search, Insert, Delete → O(1)
But if load factor becomes too high, operations may degrade to O(n). That is why resizing and rehashing are important in real systems.
.
Effect of Load Factor on Performance
Low load factor → Less collision → Faster performance
High load factor → More collision → Slower performance
So, balance is necessary. Not too empty. Not too full. Just like a classroom. If 5 students sit in 50 seats, space is wasted.
If 100 students sit in 50 seats, collision happens. Hash table is also like that.
Test Yourself
Q1- If a hash function distributes 90% of keys into only 2 indices and the remaining 10% across other indices, which property is violated?
Deterministic nature
Fast computation
Uniform distribution
Load factor
Ans – (3)
Explanation – Uniform distribution means keys must spread evenly across the table. If most keys cluster in 2 indices, performance degrades.
Q2- Suppose m = 16 and keys are multiples of 4. Which issue is most likely?
Deterministic failure
Reduced load factor
Pattern-based collision
Hash function becomes random
Ans – (3)
Explanation – 16 is power of 2. Modulo operation depends on lower bits. Multiples of 4 share similar lower bits, causing repeated collisions.
Q3- Which statement is true?
Prime table size eliminates collision
Prime table size reduces clustering
Prime numbers increase load factor
Prime numbers make hashing O(1) always
Ans – (2)
Explanation – Prime size does not remove collision, but reduces pattern-based clustering.
Q4- If load factor becomes 3, what does it imply?
Table is half full
Table is exactly full
No collision possible
More elements than slots
Ans – (4)
Explanation – Load factor = n / m
If n = 150 and m = 50, load factor = 3. That means table is overloaded.
Q5- If a hash function gives different outputs for the same key at different times, what fails?
Uniform distribution
Deterministic property
Load factor
Prime property
Ans – (2)
Explanation – Same input must always produce same output. Otherwise search and deletion will fail.
Q6- In division method, if m is 10 and keys end with same digit, what happens?
Uniform distribution
Increased collision
Reduced load factor
Random mapping
Ans – (2)
Explanation – Keys – 10, 40, 20, 30 and m = 10.
Modulo 10 depends on last digit. Keys with same last digit map to same index.
Q7- Which scenario guarantees collision?
Load factor < 1
m is prime
n > m
Hash function is deterministic
Ans – (3)
Explanation – By Pigeonhole Principle, if number of keys exceeds number of slots, collision must occur.
Q8- If all keys are even and m = 2, what happens?
Perfect distribution
All keys map to index 0
All keys map to index 1
No collision
Ans – (2)
Explanation – Even mod 2 = 0. So complete clustering happens.
Q9- Why does high load factor degrade performance even if hash function is good?
Ans – Even a good hash function cannot prevent collision if table becomes overcrowded. When n increases while m remains fixed, average bucket size increases, leading to longer search time.
Q10- Explain mathematically why collision is unavoidable.
Ans – If number of keys exceeds number of slots, by Pigeonhole Principle at least two keys must share one slot. Therefore collision is mathematically guaranteed.
Q11- How does uniform distribution improve average time complexity?
Ans – If keys spread evenly, expected number of elements per bucket remains small. Hence search, insert, delete stay close to O(1).





