- Iterator API
- Version history
Hash Array Mapped Tries (HAMT)
(Phil Bagwell, `Ideal Hash Trees', Technical Report, 2001)
Provided that the hash function have no collisions and yield hash values in a bounded interval, hash tries admit search and update in constant worst-case time (!), bounded by a somewhat larger constant than what one usually finds for the worst-case time of search or replacement, and the amortized time of insertion or deletion, in hash tables. There is no complicated incremental rehashing going on like in some real-time hash tables to attain these constant time bounds; in fact, there is never any rehashing. A little more precisely, `constant' means logarithmically proportional to the length of the interval of the hash values, or, practically, linearly proportional to the number of bits in the hash, with small constant factors.
Yes, this is the same data structure as Clojure uses to implement its hash maps and hash sets. Similarly to Clojure, this code uses hash collision buckets rather than the paper's suggestion of repeating hashes salted by the trie depth.
Although the pronunciation is identical, and despite the title of Bagwell's paper, a hash trie is not a hash tree. Sorry. Nor do these hash tries have any relation to what Knuth calls hash tries.
Except where this is obviously not the case (hash-table/fold, for example), every procedure here runs in constant expected time under the assumption of a good key hash function, ignoring the time taken by garbage collection and the time taken by the key hash function and the key equality predicate of the relevant hash trie type. When searching in a hash trie, the key equality predicate is applied to only as many keys extra as share a common hash value with the key whose association is sought. Thus in a hash trie with no collisions, every search involves at most one invocation of the key hash function, and at most one invocation of the key equality predicate.
- make-hash-trie-type KEY=? KEY-HASHprocedure
Constructor for hash trie types. KEY=? must be a key equality predicate, a procedure of two arguments that returns true to indicate that they are equal and false to indicate that they are not, and that behaves transitively, symmetrically, and reflexively. KEY-HASH must be a key hash function that preserves the key equality predicate, i.e. for keys A and B, it must be that if (KEY=? A B), then (= (KEY-HASH A) (KEY-HASH B)).
- hash-trie-type? OBJECTprocedure
Disjoint type predicate for hash trie types.
- hash-trie-type/key-equality-predicate HASH-TRIE-TYPEprocedure
- hash-trie-type/key-hash-function HASH-TRIE-TYPEprocedure
Accessors for the key equality predicates and key hash functions of hash trie types.
- make-hash-trie HASH-TRIE-TYPEprocedure
Hash trie constructor.
- hash-trie/type HASH-TRIEprocedure
Returns the <hash-trie-type> of HASH-TRIE.
- hash-trie/count HASH-TRIEprocedure
Returns the number of associations in HASH-TRIE.
- hash-trie/empty? HASH-TRIEprocedure
Returns true if HASH-TRIE has no associations, or false if it has any.
- hash-trie/search HASH-TRIE KEY IF-FOUND IF-NOT-FOUNDprocedure
Searches for an association for KEY in HASH-TRIE. If there is one, tail-calls IF-FOUND with one argument, the datum associated with key. If not, tail-calls IF-NOT-FOUND with zero arguments.
- hash-trie/lookup HASH-TRIE KEY DEFAULTprocedure
Searches for an association for KEY in HASH-TRIE. If there is one, returns its associated datum; otherwise returns DEFAULT.
- hash-trie/member? HASH-TRIE KEYprocedure
Returns true if HASH-TRIE has an association for KEY, or false if not.
- hash-trie/update HASH-TRIE KEY IF-FOUND IF-NOT-FOUNDprocedure
Searches for an association for KEY in HASH-TRIE. If there is one, tail-calls IF-FOUND with three arguments:
- the associated datum
- a procedure (REPLACE DATUM) that returns a new hash trie with all the associations in HASH-TRIE, but with DATUM substituted for the datum associated with KEY
- a procedure (DELETE) that returns a new hash trie with all the associations in HASH-TRIE excluding the association for KEY
If there is no such association, tail-calls IF-NOT-FOUND with one argument, a procedure (INSERT DATUM) that returns a new hash trie with all the associations in HASH-TRIE as well as an association of DATUM with KEY.
- hash-trie/insert HASH-TRIE KEY DATUMprocedure
Returns a hash trie with all the associations in HASH-TRIE, but associating DATUM with KEY, whether HASH-TRIE had an association for KEY or not.
- hash-trie/modify HASH-TRIE KEY DEFAULT MODIFIERprocedure
Returns a hash trie with all the associations in HASH-TRIE, but associating (MODIFIER D) with KEY if HASH-TRIE associated a datum D with KEY, or associating (MODIFIER DEFAULT) with KEY if HASH-TRIE had no association for KEY.
- hash-trie/intern HASH-TRIE KEY GENERATORprocedure
If HASH-TRIE has an association for KEY, returns its associated datum and HASH-TRIE. Otherwise, calls (GENERATOR KEY) to obtain a datum D, and returns D and a hash trie with all the associations in HASH-TRIE as well as an association of D with KEY.
- hash-trie/delete HASH-TRIE KEYprocedure
Returns a hash trie with all the associations in HASH-TRIE, excluding its association, if any, for KEY.
- hash-trie/fold HASH-TRIE INITIAL-VALUE COMBINATORprocedure
Folds HASH-TRIE by COMBINATOR, starting with an initial value V of INITIAL-VALUE and updating it for each association of a datum D with a key K, in no particular order, by (COMBINATOR K D V).
- hash-trie->alist HASH-TRIEprocedure
- hash-trie/key->alist HASH-TRIEprocedure
- hash-trie/datum->alist HASH-TRIEprocedure
hash-trie->alist returns a list of pairs, in no particular order, corresponding with the associations in HASH-TRIE, with keys in the cars and associated data in the respective cdrs.
hash-trie/key-list returns a list of all the keys in HASH-TRIE, in no particular order.
hash-trie/datum-list returns a list of all the data in HASH-TRIE, in no particular order.
- alist->hash-trie ALIST HASH-TRIE-TYPEprocedure
Returns a hash trie of the given type with the associations listed in ALIST, taking keys from the cars and corresponding data from the respective cdrs.
- string-hash STRINGprocedure
- symbol-hash SYMBOLprocedure
- exact-integer-hash EXACT-INTEGERprocedure
- real-number-hash REAL-NUMBERprocedure
- complex-number-hash COMPLEX-NUMBERprocedure
Hash functions for various types of data. exact-integer-hash, real-number-hash, and complex-number-hash all agree where their domains coincide. The current implementations of these hash functions are all based on the FNV (Fowler-Noll-Vo) family of hash functions, tweaked slightly so that it is more likely to fit into the range of fixnums (small exact integers that can be represented immediately) for most Scheme systems on 32-bit machines.
<hash-trie-type> of the above hash functions, with appropriate key equality predicates: string=? for strings, eq? for symbols, and = for the numeric types.
- hash-trie-root HASH-TRIEprocedure
- hash-trie-bucket? HASH-TRIE-NODEprocedure
- hash-trie-bucket-list HASH-TRIE-BUCKET-NODEprocedure
- hash-trie-branch? HASH-TRIE-NODEprocedure
- hash-trie-branch-count HASH-TRIE-BRANCH-NODEprocedure
- HASH-TRIE-NODE : (or HASH-TRIE-BUCKET-NODE HASH-TRIE-BRANCH-NODE)
- vector ; hash-trie-branch-count returns last index
(import hash-trie) (import (srfi 41)) ;see https://mumble.net/~campbell/scheme/hash-trie.scm (define (hash-trie->stream hash-trie) (define (bucket->stream list tail) (if (pair? list) (stream-cons (car list) ((stream-lambda () (bucket->stream (cdr list) tail)))) tail ) ) (define (branch->stream branch tail) (let ((vec (hash-trie-branch-vector branch))) (let recur ((index (sub1 (vector-length vec))) (tail tail)) (if (positive? index) (node->stream (vector-ref branch index) ((stream-lambda () (recur (sub1 index) tail)))) tail ) ) ) ) (define (node->stream node tail) (cond ((hash-trie-branch? node) (branch->stream node tail)) ((hash-trie-bucket? node) (bucket->stream (hash-trie-bucket-list node) tail)) (else (error 'hash-trie->stream "(internal) invalid hash-trie node" node hash-trie)) ) ) (let ((root (hash-trie-root hash-trie))) (if root ((stream-lambda () (node->stream root stream-null))) stream-null ) ) ) (define (stream->hash-trie stream hash-trie-type) (let loop ((stream stream) (hash-trie (make-hash-trie hash-trie-type))) (if (stream-pair? stream) (let ((cell (stream-car stream))) (loop (stream-cdr stream) (hash-trie/insert hash-trie (car cell) (cdr cell))) ) hash-trie ) ) )
Taylor R. Campbell
- Provide Iterator API.
- C5 release.
Copyright (c) 2009, Taylor R. Campbell
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the names of the authors nor the names of contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.