Flexible key/value mapping.
nim
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
Iced Quinn e47cbb26b0
sign for gitea
3 weeks ago
src/icedmap support strings as keys 8 months ago
README.md doc: clarify license text 1 month ago
mkfile push target 7 months ago
nim.cfg standard mkfile 7 months ago

README.md

Iced Quinn’s Maps and Nimian Cartography

Unlike your typical stdlib map these maps come with the tuning values exposed. You provide the hash function to the map as well, so you can choose when you need a fast hash or a secure hash. You can also provide custom object hashes if you are using very strange keys that need special treatment.

Hopscotch Hashes

Hopscotch hashes appear whenever Cuckoo hashes are mentioned. Research has turned up that Robin Hood hashes are similar but perform better. Therefore Hopscotch has not been implemented here.

Robin Hood Hashes

Robin Hood maps are similar to the ubiquitous open address maps with linear probing. A single hash of an input value determines the object's native bucket. Like most open addresed linear probing maps, objects may be placed later in the array than their native bucket. This allows memory usage to remain denser than other maps (ex. Cuckoo maps.)

Like a Hopscotch map, Robin Hoods rely on objects belonging to neighborhoods. Each cell knows how far it is from its native cell. There is also an upper search limit, which places a hard bound on how much linear probing can degrade performance.

These tables will test at most 16 cells when attempting to find a value. Cells within the neighborhood will be checked for fullness, then checked to see if they belong to the same bucket as the item being located. These checks reduce the number of times keys must be checked; so most of those 16 comparisons will usually be reduced comparisons and not full key equality checks.

Cuckoo Hash

WARNING: Cuckoo hashes are probablistic. If more than two entries have the same hash result then things silently break down.

Cuckoo hash maps rely on two hash values for each entry. Items are placed at their first location if that is possible, or their second position if necessary. Because of this accesses are always O(1).

Hashes can be:

  • Two unrelated hash functions,
  • Two of the same hash functions with different salts, or,
  • Smaller pieces of one large key.

Rehashing is done if an item can not fit at either its first or second location.

Cycle protection is implemented to avoid infinite loops. It works by stopping insertion if we ever find outself trying to move the element just added. In that case insertion is aborted and a rehash is triggered.

Cuckoo maps are very sensitive to the hash function used along with them. High quality functions like SpookyHash 2 or MurmurHash 3 should be fine.

TODO: It should be possible to detect and alert on triple collisions.

Dependencies

For running the test suite:

  • icedhash.

License

  • Mozilla Public License, version 2 (MPL-2.)

Short version: you may use the modules for non-commercial and commercial purposes but any changes in the code must be returned to the repository.