|
|
|
|
||
|
DEFINITION MODULE HashTables; (******************************************************************* Module HashTables (Version 1.0) Copyright (c) 1997-2006 by Andreas Fischlin and ETH Zurich. Purpose Provides hash tables. Remarks Any number of hash tables, each up to the maximum size of 2999 (a prime number) keys can be stored. Initially only small hash tables are allocated. Only if demand requires it, a bigger table is allocated in a so-called rehashing algorithm. The data which can be stored in the internal hash table are only addresses (information hiding). Keys can also be grouped into so-called series, ordered according the sequence of storing (StoreKey). A series is accessed via its first key. Reference: Wirth, N. (1986). Algorithms and Data Structures. Programming o Design Andreas Fischlin 30/01/1997 o Implementation Andreas Fischlin 30/01/1997 ETH Zurich Systems Ecology CHN E 35.1 Universitaetstrasse 16 8092 Zurich SWITZERLAND URLs: <mailto:RAMSES@env.ethz.ch> <http://www.sysecol.ethz.ch> <http://www.sysecol.ethz.ch/SimSoftware/RAMSES> Last revision of definition: 07/03/1997 AF *******************************************************************) FROM SYSTEM IMPORT ADDRESS; FROM Errors IMPORT userBase; CONST hashTableOverFlow = userBase + 250; unknownHashTable = userBase + 251; unStoredKey = userBase + 252; TYPE HashTable; Key = INTEGER; UserEntry = ADDRESS; DoWithUserEntryProc = PROCEDURE (Key, UserEntry); VAR nonExistentHashTable: HashTable; (* read only *) unknownEntry: UserEntry; (* read only *) resCode: INTEGER; (* Holds the last result code of any operation. This variable is offered as read only for efficiency reasons separate from other routines (except for StoreEntry) in order to better support efficiency, the main motivation to use hash tables. If no errors are encountered, resCode = allOk (see module Errors).*) PROCEDURE CreateHashTable(VAR ht: HashTable); (* Creates a new hash table ht *) PROCEDURE HashTableExists(ht: HashTable): BOOLEAN; PROCEDURE StoreKey(ht: HashTable; newSeries: BOOLEAN; key: Key); (* Stores key k in the hash table. If newSeries is TRUE, a new series is started. Subsequent calls to StoreKey with newSeries = FALSE will internally connect all stored entries into a series until StoreKey is called again with newSeries = TRUE. The series is denoted by the first key which was passed while newSeries has been TRUE. To ignore the series mechanism, simply set newSeries = TRUE always. If the capacity of the hash table is exceeded, resCode contains hashTableOverFlow. IMPORTANT NOTE: It is recommended to inspect resCode after each call to StoreKey if you store many entries close to the maximum of 2999 (since Hash may fail unnoticed otherwise). *) PROCEDURE DeleteKey(ht: HashTable; key: Key); (* Removes an individual key for subsequent reuse, i.e. reverts the effect of StoreKey. You should discard of associated user data (see below StoreEntry) before calling DeleteKey. *) PROCEDURE Hash(ht: HashTable; k: Key; VAR found: BOOLEAN): INTEGER; (* Returns for the hash table ht the hash function value mapping the key k. The hash function adjusts itself internally and allocates more memory until the upper limit of the hash table ([0..2999 = prime) is reached. If found is TRUE the key k has previously been successfully stored. IMPLEMENTATION RESTRICTION: Note, you should NOT use this function before having stored ALL(!) keys, since you do not learn about the rehashing. However, attached entries are correctly rehashed (see below procedure StoreEntry), supporting full dynamic use of hashing user data kept in heap only. In the latter case the behavior is similar to an ARRAY OF UserEntry, but still offers the advantage of mapping a large range of index values (keys) to a much smaller, still storable range (the classical principle of hashing). NOTE ALSO: For efficiency reasons Hash assumes ht references an existing and fully valid HashTable if ht's value differs from nonExistentHashTable. Returns unknownEntry if ht = nonExistentHashTable. NOTE, however, Hash does NOT generate an error message and will return a WRONG value in case of hashTableOverFlow!! Thus, hashTableOverFlow should be checked while storing entries (see also StoreKey). *) (* The following procedures support hashing in conjunction with some user data kept in heap (or alternatively referenced by SYSTEM.ADR(userEntry). It is possible to associate userEntries in a 1:1 relation, referencable via an integer key. *) PROCEDURE StoreEntry(ht: HashTable; k: Key; e: UserEntry); (* Attaches and stores entry e in the internal hash table denoted by ht for key k, given k has been previously successfully stored with StoreKey. Typically this mechanism is used to reference some client data structures via e. This emulates a data structure of the type ARRAY [Hash(ht,key)] OF UserEntry. Delete an entry by simply calling StoreEntry(ht,k,unknownEntry). *) PROCEDURE Entry(ht: HashTable; key: Key): UserEntry; (* returns the entry for key. Returns unknownEntry if never stored. IMPLEMENTATION RESTRICTION: For efficiency reasons Entry assumes that ht references an existing and fully valid HashTable, in case ht has a value different from nonExistentHashTable. *) (* The following procedures support hashing operating on a group of keys, a so-called series *) PROCEDURE DoInSeries(ht: HashTable; fstkey: Key; dwuse: DoWithUserEntryProc); (* Executes dwuse for every entry stored in the series as denoted by fstkey *) PROCEDURE DeleteSeries(ht: HashTable; fstkey: Key; dwuse: DoWithUserEntryProc); (* Deletes all entries belonging to the series denoted by fstkey and calls dwuse for every entry in the series. IMPLEMENTATION RESTRICTION: For efficiency reasons, the entire series is cleared before the Hash function is fully restored. Hence, do not use any procedures relying on the Hash function from within dwuse, i.e. Hash, StoreKey, DeleteKey, StoreEntry, DeleteEntry, and Entry. Of course, you should also refrain from calling DiscardHashTable for the table in which you delete the series. You should only operate on your user entry. If you wish to call any of above routines from within dwuse, use DoInseries and call DeleteKey from within dwuse. That's a safe operation preserving the functionality of the Hash function at all times. *) PROCEDURE DiscardHashTable(VAR ht: HashTable); END HashTables.
|
||
|
|
|