Hash Table Load Factor and Capacity

This is an excerpt from the more extensive article on Hash Tables.

Load Factor

The load factor is the average number of key-value pairs per bucket.

load factor
=
total number of key-value pairs
number of buckets

It is when the load factor reaches a given limit that rehashing kicks in. Since rehashing increases the number of buckets, it reduces the load factor.

The load factor limit is usually configurable and offers a tradeoff between time and space costs.

Lower limit More empty buckets More memory overhead Higher limit Larger buckets Slower lookups

The default load factor for a Java HashMap is 0.75 and for a C# Hashtable it’s 1.0.

Capacity

The capacity is the maximum number of key-value pairs for the given load factor limit and current bucket count.

capacity
=
number of buckets
×
load factor limit

Since rehashing increases the number of buckets, it increases the capacity.

The default initial capacity for a Java HashMap is 12 and for a C# Hashtable it’s 0, i.e. the bucket array is initialized lazily upon first insertion.

Example

Here’s the structure of a hash table, configured with load factor limit of 4.

Current load factor: 24 / 8 = 3 Configured limit: 4 Current capacity: 8 × 4 = 32

Comments

Be the first to comment!