# Bloom Filter Calculator

Here's an interactive tool to help you tune the parameters for a Bloom filter. If you want to learn more about how Bloom filters work, see separate article here.

Number of bits (m)

m =

In bytes:

Number of hash functions (k)

k =

This in the optimal choice for n =

Not the optimal choice for any number of elements.

Acceptable false positive rate (p)

p =  %

Allows for up to elements.
Allows no elements to be added.

## Probability of false positives

Assuming uniform and independent hashing, the probability of a specific bit not being set to 1 after a single use of a hash function is

1
1

m

After inserting n elements using k hashes each, the probability of a specific bit still being 0 is thus

1
1

m

kn

Conversely, the probability of a specific bit being set to 1, is

1

1
1

m

kn

Since a false positive requires k random bits to be set to 1, the probability of a false positive after inserting n elements is

1

1
1

m

kn

k

Using the formula for calculating e

e

=

lim

x → ∞

1
1

x

x

…we can approximate the probability of false positives as

(

1 − e

kn / m

)

k

## Optimal value for k

By fixing the n / m ratio in the previous equation, we can solve for k, and find the optimal value for it. That is, if you for example know you can spare 1 MB for the bit vector, and you know that you will insert ~10,000 elements, you can calculate the number of hash functions that minimizes the false positives in subsequent lookups.

Given n and m we want to minimize

=
(

1 − e

kn / m

)

k

=

e

ln
(

1 − e

kn / m

)

k

=

e

k

ln
(

1 − e

kn / m

)

In other words, we want to minimize

k

ln
(

1 − e

kn / m

)

By simple rules of powers and logarithms, we know that k = −m / n ln(ekn / m), so we can rewrite the above as follows:

−

m

n

ln
(

e

kn / m

)
ln
(

1 − e

kn / m

)

By symmetry, we see that this is minimized when ekn / m = 1 − ekn / m, i.e. when ekn / m = 0.5. Intuitively this means that we should chose a k such that after insterting n elements, half of the m bits are set to true. Solving for k gives us

k

=

m ln 2

n

With this value for k the optimal false positive rate is approximately 0.6185m / n.