Outdated egg!
This is an egg for CHICKEN 4, the unsupported old release. You're almost certainly looking for the CHICKEN 5 version of this egg, if it exists.
If it does not exist, there may be equivalent functionality provided by another egg; have a look at the egg index. Otherwise, please consider porting this egg to the current version of CHICKEN.
bloom-filter
Documentation
Provides a simple Bloom filter.
Bloom Filter Object
make-bloom-filter
- make-bloom-filter M MDPS #!optional K [procedure]
Returns a bloom-filter object with M bits of discrimination and a set of hash functions built from the supplied MDPS, a (list-of message-digest-primitive).
The number of hashes, K, is not necessarily the same as the number of message-digests. A hash (here) is defined as an unsigned 32-bit integer. Most message-digests return more than 32 bits of hash, so each digest is divided into 32-bit blocks to get the individual hashes.
The argument K restricts the actual number of hashes to the "first" K, in the order of MDPS, no matter how many more the supplied message-digests create.
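The block splitting described above can be illustrated with a short Python sketch. It is not the egg's code: the function name hashes_from_digests is hypothetical, Python's hashlib stands in for the message-digest primitives, and the big-endian byte order is an assumption, since the documentation does not say how blocks are read.

```python
import hashlib
import struct

def hashes_from_digests(data, digest_names, k=None):
    """Split each digest into unsigned 32-bit integers, in digest order."""
    hashes = []
    for name in digest_names:
        digest = hashlib.new(name, data).digest()
        # Use only whole 4-byte blocks; big-endian order is an assumption.
        usable = len(digest) - len(digest) % 4
        hashes.extend(struct.unpack(">%dI" % (usable // 4), digest[:usable]))
    # An optional k keeps only the first k hashes.
    return hashes if k is None else hashes[:k]

all_hashes = hashes_from_digests(b"hello", ["sha1", "md5"])
print(len(all_hashes))  # 9: SHA-1 yields 5 blocks, MD5 yields 4
print(len(hashes_from_digests(b"hello", ["sha1", "md5"], k=3)))  # 3
```

Two digests therefore supply nine hashes here, which is why K and the number of message-digests need not agree.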
- make-bloom-filter P N MDPS [procedure]
Returns a bloom-filter object with M and K values chosen for the given population capacity N and probability of false positives P.
Selecting the optimal set of message-digests is beyond the scope of make-bloom-filter.
bloom-filter-set!
- bloom-filter-set! BLOOM-FILTER OBJECT [procedure]
Add the specified OBJECT to the indicated BLOOM-FILTER.
bloom-filter-exists?
- bloom-filter-exists? BLOOM-FILTER OBJECT [procedure]
Returns whether the specified OBJECT is in the indicated BLOOM-FILTER. As with any Bloom filter, a true result may be a false positive.
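For readers new to the data structure, the add and membership operations can be sketched in a few lines of Python. This is a generic illustration, not the egg's implementation: the class TinyBloom, its method names, the reduction of each hash modulo M, and the big-endian block order are all assumptions.

```python
import hashlib
import struct

class TinyBloom:
    """Generic Bloom filter sketch: m bits, hashes taken from digest blocks."""

    def __init__(self, m, digest_names=("sha1", "md5")):
        self.m = m
        self.digest_names = digest_names
        self.bits = bytearray((m + 7) // 8)
        self.n = 0  # number of add() calls so far

    def _positions(self, data):
        # Each 32-bit digest block selects one bit position (assumed: hash mod m).
        for name in self.digest_names:
            digest = hashlib.new(name, data).digest()
            for (h,) in struct.iter_unpack(">I", digest):
                yield h % self.m

    def add(self, data):
        for pos in self._positions(data):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self.n += 1

    def __contains__(self, data):
        # True only if every selected bit is set; may be a false positive.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(data))

bf = TinyBloom(1024)
bf.add(b"apple")
print(b"apple" in bf)  # True: an added object is always reported present
```

A lookup for an object that was never added usually returns False, but can return True when other objects happen to have set all of its bits.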
bloom-filter?
check-bloom-filter
error-bloom-filter
- bloom-filter? OBJ [procedure]
- check-bloom-filter LOC OBJ #!optional NAM [procedure]
- error-bloom-filter LOC OBJ #!optional NAM [procedure]
bloom-filter-algorithms
- bloom-filter-algorithms BLOOM-FILTER [procedure]
The MDPS used for the filter.
bloom-filter-n
- bloom-filter-n BLOOM-FILTER [procedure]
The current population, i.e. the number of objects added to the filter. This is not the population capacity.
bloom-filter-m
- bloom-filter-m BLOOM-FILTER [procedure]
The number of bits of discrimination.
bloom-filter-k
- bloom-filter-k BLOOM-FILTER [procedure]
The number of hashes. (See above.)
bloom-filter-p-false-positive
- bloom-filter-p-false-positive BLOOM-FILTER #!optional N [procedure]
The probability of a false positive for the population N. N defaults to the current population, bloom-filter-n.
actual-k
- actual-k MDPS [procedure]
Calculates the actual number of hashes for the MDPS.
optimum-size
- optimum-size P N [procedure]
Returns two values: an optimal M (bits of discrimination) and K (number of hashes) for the given population size N and probability of false positives P.
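The textbook sizing formulas give a feel for the values involved: M = -N ln P / (ln 2)^2 and K = (M / N) ln 2. The Python sketch below applies them and rounds up, in line with the version history note that calculations take the ceiling. Whether the egg uses exactly these formulas is an assumption, and the function name optimum_size is only borrowed for illustration.

```python
import math

def optimum_size(p, n):
    """Textbook Bloom filter sizing: returns (m, k) for target p and population n."""
    # Bits needed so that the optimal k reaches false-positive rate p.
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    # Number of hashes that minimises the false-positive rate for that m.
    k = math.ceil((m / n) * math.log(2))
    return m, k

print(optimum_size(0.01, 1000))  # (9586, 7)
```

For a 1% false-positive target this works out to roughly 9.6 bits per stored object.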
desired-m
- desired-m P N #!optional K [procedure]
Calculates a near-optimal number of bits of discrimination to meet the desired probability of false positives P, with the given population size N and number of hashes K. When the K parameter is missing, optimum-k is used to calculate a value.
Returns three values: the calculated M, K, and P. The calculated probability may be lower than the desired one. The calculated M value will always be a fixnum.
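One way to arrive at such a result is to solve the usual approximation P = (1 - e^(-KN/M))^K for M and round up, which also explains why the achieved probability can come out below the target. The Python sketch below does this. It is not the egg's algorithm: the function name is borrowed for illustration, and the default K = ceil(log2(1/P)) is an assumption standing in for whatever optimum-k computes in that case.

```python
import math

def desired_m(p, n, k=None):
    """Returns (m, k, achieved_p) for target p, population n, optional k."""
    if k is None:
        # Assumed default: the textbook optimum k = log2(1/p), rounded up.
        k = math.ceil(-math.log2(p))
    # Solve (1 - exp(-k*n/m))**k = p for m, then round up to whole bits.
    m = math.ceil(-k * n / math.log(1.0 - p ** (1.0 / k)))
    # Rounding m up can only lower the false-positive probability.
    achieved = (1.0 - math.exp(-k * n / m)) ** k
    return m, k, achieved

m, k, achieved = desired_m(0.01, 1000)
print(m, k, achieved <= 0.01)
```

Supplying a smaller K than the optimum, for example 4, still meets the target but needs more bits.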
optimum-k
- optimum-k N M [procedure]
Optimal count of hashes for the given population size N and M bits of discrimination.
optimum-m
- optimum-m K N [procedure]
Optimal count of bits of discrimination for the given population size N and K number of hashes.
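The optimum-k and optimum-m entries above are two views of the same textbook relation, K = (M / N) ln 2. A Python sketch under that assumption follows. The function names are borrowed for illustration, and the rounding up follows the version history note about taking the ceiling; the egg's exact rounding is not documented here.

```python
import math

def optimum_k(n, m):
    """Hashes that minimise false positives for n objects in m bits."""
    return max(1, math.ceil((m / n) * math.log(2)))

def optimum_m(k, n):
    """Bits for which k hashes are optimal with n objects."""
    return math.ceil(k * n / math.log(2))

print(optimum_k(1000, 10000))  # 7
print(optimum_m(7, 1000))      # 10099
```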
p-false-positive
- p-false-positive K N M [procedure]
Returns the probability of false positives for the population size N, assuming K hashes and M bits of discrimination.
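The standard estimate is P = (1 - (1 - 1/M)^(KN))^K, often approximated as (1 - e^(-KN/M))^K. The Python sketch below computes both forms so they can be compared. That the egg evaluates exactly one of these is an assumption, and the function names are illustrative.

```python
import math

def p_false_positive(k, n, m):
    """Exact-form estimate: all k probed bits are already set."""
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k

def p_false_positive_approx(k, n, m):
    """Common exponential approximation of the same quantity."""
    return (1.0 - math.exp(-k * n / m)) ** k

exact = p_false_positive(7, 1000, 9586)
approx = p_false_positive_approx(7, 1000, 9586)
print(exact, approx)  # both close to 0.01
```

The two forms agree closely once M is in the thousands; an empty filter (N = 0) gives a probability of exactly 0.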
p-random-one-bit
- p-random-one-bit K N M [procedure]
Calculates the probability that a random bit is set for the given number of hash functions K, population size N, and bits of discrimination M.
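Under the usual assumption of independent, uniformly distributed hashes, a given bit is still clear after K * N bit-setting operations with probability (1 - 1/M)^(KN), so it is set with probability 1 - (1 - 1/M)^(KN). The Python sketch below computes that value. Reading "random set bit" this way is an interpretation of the description above, and the function name is borrowed for illustration.

```python
def p_random_one_bit(k, n, m):
    """Probability that one particular bit is set after n insertions."""
    return 1.0 - (1.0 - 1.0 / m) ** (k * n)

print(p_random_one_bit(1, 1, 2))  # 0.5: one hash, one object, two bits
fill = p_random_one_bit(7, 1000, 9586)
print(fill)  # about half the bits are set at the optimal K
```

Raising this value to the power K gives the false-positive estimate, since a false positive requires all K probed bits to be set.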
Usage
(require-extension bloom-filter)
References
Nice exposition of Bloom Filter False Positive Probability.
Requirements
moremacros iset message-digest record-variants check-errors
setup-helper sha1 md5 sha2 test
Author
Version history
- 1.2.2
- 1.2.1
- Fix bloom-filter-p-false-positive. Add types.
- 1.2.0
- Add missing API.
- 1.1.8
- Pick up the scraps.
- 1.1.7
- Re-flow.
- 1.1.6
- One more time.
- 1.1.5
- * UNMAINTAINED *
- 1.1.4
- Added optimum-size & make-bloom-filter variant. Calculations take the ceiling.
- 1.1.3
- A little faster (10%). Better fixnum overflow detection.
- 1.1.2
- Protect desired-m from fixnum representation overflow.
- 1.1.1
- "Fix" for call of non-procedure - maybe. (Nope.)
- 1.1.0
- A little faster (25%).
- 1.0.0
- From the Chicken 3 version, with some changes. (No message-digest registry, for example.)
License
Copyright (C) 2010-2018 Kon Lovett. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the Software), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED ASIS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.