One of the problems with large ontologies such as SNOMED Clinical Terms is that they’re, well, large. So, it’s not typically possible to hold all of the ontology in memory at once, and queries against it require a database lookup. It’s possible to eliminate a number of database accesses, and thus speed up the query process, by using a Bloom filter.
A Bloom filter is a memory-efficient probabilistic data structure that lets you test whether a particular item is a member of a set. It may return false positives, but not false negatives. So, by adding all of the terms in your ontology to a Bloom filter, you can do a fast, in-memory check to see whether an entered term definitely doesn’t exist in your ontology. If the Bloom filter reports that the term does exist, then you can confirm with a slower file or database query for that term.
In an application where you expect to encounter many terms that aren’t in the ontology, such as automated metadata extraction from documents, and automated document classification, then this can potentially lead to large performance improvements.