Aussie AI

Bloom Filters in AI

  • Last Updated 14 September, 2024
  • by David Spuler, Ph.D.

Bloom filters are a probabilistic data structure based on hashing and bitflags. Multiple hash functions are computed for each key, and this is used to set bitflags, as described in more detail below.

Like hashing, Bloom filters have been used as a data structure to speed up neural network inference. However, much of the research literature about Bloom filters is about a different topic: Weightless Neural Networks (WNNs). WNNs have a different type of neuron based on binary bits, rather than matrix multiplications. These bitflag neurons can be approximated using Bloom filters. As such, that part of the research is less relevant to optimization of the inference of recent neural networks such as Transformers, and has not been examined in detail below.

What are Bloom Filters?

Given a key, multiple hash functions are calculated for that key, and a binary flag is set in a bitflag table for each of those hash offsets. In this way, a key maps to a pattern of multiple bits.

The Bloom filter lookup for a key value works as follows: To test whether a key is found, the multiple hash functions are computed, and then the bitflag table is analyzed to see if all those bits are set. If any of the bits are missing, the key is not in the bloom filter. If all of the bits are found, the key is possibly in the bloom filter, but it may also be that other keys have coincidentally set all those bits (a "false positive"), so it is not 100% guaranteed to be present, and a second different type of backup data structure may need to be queried to confirm. Hence, the Bloom filter is a fast test to see if a key is not in a set, but a slow test if the key is found. This makes it an example of a "common case first" optimization, where fast computations may skip more involved computations.

The complexity of Bloom filters is constant, but not as fast as hashing. A hash filter uses only a single hash function, so it has O(1) lookup. However, a Bloom filter uses multiple functions, k, so it has O(k) lookup complexity.

Bloom Filters and AI Models

Some research below has used Bloom filters to speed up model inference (also true for simple hashing). Research on the use of Bloom filters with moderm types of neural networks includes:

Weightless Neural Networks (WNNs)

Weightless neural networks (WNNs) are a specific neural network architecture based on binary inputs and outputs. WNNs are often implemented using Bloom filters for optimization.

More AI Research

Read more about: