stashtable

StashTable is a space- and time-efficient generic concurrent hash table (also often named map or dictionary in other programming languages) that is a mapping from keys to values.

Thanks to it's unique collision resolution strategy, StashTable provides following key features:

  • O(1) amortized parallel reads and writes
  • O(1) worst-case serialized inserts
  • O(1) amortized serialized deletes
  • Fast, cache-friendly iterating that does not block and is not blocked by any other operation

API in a nutshell:

caveats:

  • Because iterating does not block, aggregate functions do not give consistent answers if other threads are modifying the table
  • withValue or withFound calls cannot be nested, unless always in same key order. Otherwise a deadlock is bound to occur
  • A blocking operation cannot be called from inside withValue or withFound. Otherwise a deadlock is bound to occur
  • Strings, seqs or refs in keys or values requires --gc:arc or --gc:orc

StashTable has ref semantics.

Basic usage

import stashtable
from sequtils import zip

let
  names = ["John", "Paul", "George", "Ringo"]
  years = [1940, 1942, 1943, 1940]
  beatles = newStashTable[string, int, names.len]()

for (name , birthYear) in zip(names, years):
  beatles[name] = birthYear

echo beatles
doAssert $beatles == """{"John": 1940, "Paul": 1942, "George": 1943, "Ringo": 1940}"""

let beatlesByYear = newStashTable[int, seq[string], names.len]()

for (birthYear , name) in zip(years, names):
  beatlesByYear.withValue(birthYear):
    value[].add(name)
  do:
    # key doesn't exist, we create one
    discard beatlesByYear.insert(birthYear, @[name])

echo beatlesByYear
doAssert $beatlesByYear == """{1940: @["John", "Ringo"], 1942: @["Paul"], 1943: @["George"]}"""

See also

sharedtables tables hashes nichecache

Types

Index = distinct int
Index to an item (item = key-value pair). Other threads may delete or move the item until it's locked with withFound template.
StashTable[K; V; Capacity] = ref StashTableObject[K, V, Capacity]
Generic thread-safe hash table.

Consts

NotInStash = -2147483647
Index for keys not in the table.

Procs

proc newStashTable[K; V; Capacity: static int](): StashTable[K, V, Capacity]

Creates a new hash table that is empty.

After maximum capacity is reached, inserts and upserts will start returning NotInStash. If you need to grow the capacity, create a new larger table, copy items to it with addAll proc and switch reference(s).

Recommended practice is to reserve enough capacity right at initialisation, because

  • StashTable is very space-efficient anyway (especially when using pointers as values)
  • addAll takes some time and blocked operations will naturally not be reflected in new table.
  • addAll will temporarily need enough memory to hold both old and new table
  • Last but not least: all extra capacity will speed up execution by lowering probability of hash collisions
proc `=destroy`[K, V, Capacity](stash: var StashTableObject[K, V, Capacity])
Releases OS resources reserved by locks.
proc `==`(x, y: Index): bool {...}{.borrow.}
proc `$`(index: Index): string {...}{.raises: [], tags: [].}
proc len(stash: StashTable): int {...}{.inline.}
Returns the number of keys in stash.
proc findIndex[K, V, Capacity](stash: StashTable[K, V, Capacity]; key: K): Index
Returns Index for given key, or NotInStash if key was not in stash. Note that the returned Index may be invalidated at any moment by other threads.
proc `$`(stash: StashTable): string
proc insert[K, V, Capacity](stash: StashTable[K, V, Capacity]; key: K; value: V): (
    Index, bool) {...}{.discardable.}
Inserts a (key, value) pair into stash. Returns a tuple with following information:
  • (i , true) : The item was succesfully inserted to Index i
  • (i , false) : The item was not inserted because there was already item with the same key at Index i
  • (NotInStash , false) : The item was not inserted because stash was full, i.e. len == Capacity

Example:

let a = newStashTable[char, int, 1024]()  
let (index1, wasinserted1) = a.insert('x', 7)
doAssert wasinserted1
let (index2, wasinserted2) = a.insert('x', 33)
doAssert index2 == index1 and not wasinserted2
proc upsert[K, V, Capacity](stash: StashTable[K, V, Capacity]; key: K; value: V): (
    Index, bool)
Updates value at stash[key] or inserts item if key not present. Returns a tuple with following information:
  • (i , true) : The item was succesfully inserted to Index i
  • (i , false) : The item already existed at Index i and it's value was updated to given value
  • (NotInStash , false) : The key did not exist but the item could not be inserted because stash was full
proc `[]=`[K, V, Capacity](stash: StashTable[K, V, Capacity]; key: K; value: V)
Executes upsert and silently discards the result.
proc addAll[K; V; Capacity1: static int; Capacity2: static int](
    toStashTable: StashTable[K, V, Capacity1];
    fromStashTable: StashTable[K, V, Capacity2]; upsert: bool): bool
Copies all items from fromStashTable to toStashTable. upsert parameter tells whether existing keys will be updated or skipped. Returns false if could not add all because capacity filled up.
proc del[K, V, Capacity](stash: StashTable[K, V, Capacity]; key: K)

Deletes key from stash. Does nothing if the key does not exist.

Note that del must not be called from inside a withValue or withFound block. Instead, write down the key and call del afterwards.

let s = newStashTable[int, char, 4]()
var deletables: seq[int]
s[1] = 'a' ; s[2] = 'b' ; s[3] = 'a'
doAssert s.len == 3
for (key, index) in s.keys:
  s.withFound(key, index):
    if value[] != 'b': deletables.add(key)
for key in deletables: s.del(key)
doAssert s.len == 1
proc clear[K, V, Capacity](stash: StashTable[K, V, Capacity])
Resets the stash so that it is empty.

Iterators

iterator keys[K, V, Capacity](stash: StashTable[K, V, Capacity]): (K, Index)

Iterates over all (key , index) pairs in the table stash.

feature: if there are no deletes, iteration order is insertion order.

let a = newStashTable[char, array[4, int], 128]()
a['o'] = [1, 5, 7, 9]
a['e'] = [2, 4, 6, 8]
for (k , index) in a.keys:
  a.withFound(k, index):
    echo "key: ", k
    echo "value: ", value[]
    doAssert (k == 'o' and value[] == [1, 5, 7, 9]) or (k == 'e' and value[] == [2, 4, 6, 8])  

Templates

template withFound[K; V; Capacity](stash: StashTable[K, V, Capacity]; thekey: K;
                                   theindex: Index; body: untyped)

Retrieves pointer to the value as variable value at theindex. thekey must also be given to ensure that the given Index is still holding the putative item.

Item is locked and value can be modified in the scope of withFound call.

On one hand this is faster operation than withValue because there's no need to find the Index, but on other hand there's greater probability that the item was deleted and reinserted to other Index by other thread(s) since the Index was retrieved.

Example:

type Item = tuple[name: char, uid: int]
let stash = newStashTable[int, Item, 8]()
let key = 42
stash[key] = ('a', 555)
let index = stash.findIndex(key)
stash.withFound(key , index):
  # block is executed only if ``key`` in ``index``
  value.name = 'u'
  value.uid = 1000
stash.withFound(42 , 0.Index): doAssert value.name == 'u'
template withValue[K; V; Capacity](stash: StashTable[K, V, Capacity]; thekey: K;
                                   body1, body2: untyped)

Retrieves pointer to the value as variable value for thekey. Item is locked and value can be modified in the scope of withValue call.

Example:

stashTable.withValue(key):
  # block is executed only if ``key`` in ``stashTable``
  value.name = username
  value.uid = 1000
do:
  # block is executed when ``key`` not in ``stashTable``
  raise newException(KeyError, "Key not found")
template withValue[K; V; Capacity](stash: StashTable[K, V, Capacity]; thekey: K;
                                   body: untyped)
Retrieves pointer to the value as variable value for thekey. Item is locked and value can be modified in the scope of withValue call.