Basic asynchronous hashmap with Miniasync library
Miniasync library provides a framework for the composition and execution of asynchronous tasks in C. To accommodate different user-defined tasks and various types of data that they take in, libminiasync makes use of macros.
Using libminiasync for the first time can be challenging. There are multiple examples on the miniasync repo to make it easier. One of them is a hashmap example.
Hashmap example overview
The hashmap example on the Miniasync repository presents a hashmap with a fixed size that allocates memory upon key-value pair insertion. This particular implementation uses linear probing for hashmap entry lookup. Linear probing means that whenever we encounter a collision when inserting a key-value pair, we will move on to the next hashmap entry until we find an unoccupied entry. For the hash function, we use the Austin Appleby MurmurHash3 64-bit finalizer.
Each operation associated with hashmap entry insertion, deletion and search is implemented using Miniasync future API.
Creating a future
Future is an abstraction of an asynchronous task. Each future has to be defined using
FUTURE(_name, _data_type, _output_type)
macro. It takes future name _name
,
input data type _data_type
and output data type _output_type
. For more information
about the futures, see future manpage on Miniasync github
repository.
The lookup
future definition from the hashmap example:
struct hashmap_lookup_data {
struct hashmap *hm;
uint64_t key;
enum hashmap_entry_state state;
};
struct hashmap_lookup_output {
struct hashmap_entry *hme;
};
FUTURE(hashmap_lookup_fut, struct hashmap_lookup_data,
struct hashmap_lookup_output);
The lookup data struct hashmap_lookup_data
contains a pointer to hashmap hm
, key
, and state
of the
searched entry. hm
should point to an actual hashmap instance, key
is an entry key used in the hash function,
state
indicates whether the lookup function should search for an existing or unoccupied hashmap entry.
The output data struct hashmap_lookup_output
contains a pointer to the hashmap entry hme
that points
to the found hashmap entry.
FUTURE(...)
macro defines a struct hashmap_lookup_fut
future struct with struct hashmap_lookup_data
as input data and struct hashmap_lookup_output
as output.
Future task function
typedef enum future_state (*future_task_fn)(struct future_context *context,
struct future_notifier *notifier);
The future task function contains the actual logic of the future. Every such task implementation must conform to the definition above. Future context provides access to the input and output data of a particular future.
Notifier feature is not used in the hashmap example, hence it will not be discussed. For more information about the notifiers take a look here.
The return value of future task function indicates the state of the future task.
enum future_state {
FUTURE_STATE_IDLE,
FUTURE_STATE_COMPLETE,
FUTURE_STATE_RUNNING,
};
FUTURE_STATE_IDLE
indicates that the future task did not start yet, FUTURE_STATE_COMPLETE
means that it has already completed, and FUTURE_STATE_RUNNING
signalizes that it’s still running.
The lookup
future task function:
static enum future_state
hashmap_lookup_impl(struct future_context *ctx,
struct future_notifier *notifier)
{
struct hashmap_lookup_data *data =
future_context_get_data(ctx);
struct hashmap_lookup_output *output =
future_context_get_output(ctx);
struct hashmap *hm = data->hm;
uint64_t key = data->key;
enum hashmap_entry_state state = data->state;
struct hashmap_entry *hme = NULL;
if (key == 0) {
printf("invalid key %" PRIu64 "\n", key);
goto set_output;
} else if (state == HASHMAP_ENTRY_STATE_UNOCCUPIED &&
hm->capacity == hm->length) {
printf("no space left for key %" PRIu64 "\n", key);
goto set_output;
} else if (state == HASHMAP_ENTRY_STATE_UNOCCUPIED &&
hashmap_entry_lookup(hm, key,
HASHMAP_ENTRY_STATE_PRESENT) != -1) {
printf("key %" PRIu64 " already exists\n", key);
goto set_output;
}
int index = hashmap_entry_lookup(hm, key, state);
if (index == -1) {
switch (state) {
case HASHMAP_ENTRY_STATE_PRESENT:
/* Entry with given key is not present in the hashmap */
goto set_output;
case HASHMAP_ENTRY_STATE_UNOCCUPIED:
/*
* An unoccupied entry wasn't found despite hashmap not
* being full. Re-run the lookup future.
*/
return FUTURE_STATE_RUNNING;
default:
assert(0); /* should not be reachable */
}
}
hme = &hm->entries[index];
set_output:
output->hme = hme;
return FUTURE_STATE_COMPLETE;
}
Let’s break down the above code snippet.
struct hashmap_lookup_data *data =
future_context_get_data(ctx);
struct hashmap_lookup_output *output =
future_context_get_output(ctx);
struct hashmap *hm = data->hm;
uint64_t key = data->key;
enum hashmap_entry_state state = data->state;
We get the pointers to the data and output structures through the ctx
parameter
using future_context_get_data(...)
and future_context_get_output(...)
functions.
Data and output types are the same types provided to the FUTURE(...)
macro.
if (key == 0) {
printf("invalid key %" PRIu64 "\n", key);
goto set_output;
} else if (state == HASHMAP_ENTRY_STATE_UNOCCUPIED &&
hm->capacity == hm->length) {
printf("no space left for key %" PRIu64 "\n", key);
goto set_output;
} else if (state == HASHMAP_ENTRY_STATE_UNOCCUPIED &&
hashmap_entry_lookup(hm, key,
HASHMAP_ENTRY_STATE_PRESENT) != -1) {
printf("key %" PRIu64 " already exists\n", key);
goto set_output;
}
Then, we perform some checks on the obtained input data. In case one of the checks
failed, we set the output entry address to NULL
and return FUTURE_STATE_COMPLETE
.
int index = hashmap_entry_lookup(hm, key, state);
Next, we use the obtained input data to find the index of the hashmap entry.
if (index == -1) {
switch (state) {
case HASHMAP_ENTRY_STATE_PRESENT:
/* Entry with given key is not present in the hashmap */
goto set_output;
case HASHMAP_ENTRY_STATE_UNOCCUPIED:
/*
* An unoccupied entry wasn't found despite hashmap not
* being full. Re-run the lookup future.
*/
return FUTURE_STATE_RUNNING;
default:
assert(0); /* should not be reachable */
}
}
If searching for the index of an existing entry failed, we proceed similarly to the failed check case.
If we were looking for the index of an unoccupied entry and nothing was found, we return
FUTURE_STATE_RUNNING
. hashmap_lookup_impl
will be re-run upon the next poll,
preserving all the changes made to the data and output.
hme = &hm->entries[index];
set_output:
output->hme = hme;
return FUTURE_STATE_COMPLETE;
Lastly, when we found an entry, we set the output entry pointer to its address and
return FUTURE_STATE_COMPLETE
.
Initializing a future
Having defined the future input, output and task function, we can create an instance of
the lookup
future.
static struct hashmap_lookup_fut
hashmap_lookup(struct hashmap *hm, uint64_t key, enum hashmap_entry_state state)
{
struct hashmap_lookup_fut future;
/* Set input values */
future.data.hm = hm;
future.data.key = key;
future.data.state = state;
/* Set default output value */
future.output.hme = NULL;
FUTURE_INIT(&future, hashmap_lookup_impl);
return future;
}
In the above code fragment, we create a struct hashmap_lookup_fut
structure and set its input data and
default output values. After that, we initialize this future and bind it with a hashmap_lookup_impl(...)
task function using the FUTURE_INIT(...)
macro. After those operations, struct hashmap_lookup_fut
is
ready to be polled.
Creating a chained future
Multiple futures can be composed into a single chained future. When polled, chained
future executes its future entries sequentially in the order they were defined.
We can define a future as the chained future entry using FUTURE_CHAIN_ENTRY(_future_type, _name)
macro. It takes two parameters, _future_type
is the type of the future, and _name
is the name of the variable stored in the data structure.
struct hashmap_lookup_lock_entry_data {
FUTURE_CHAIN_ENTRY(struct hashmap_lookup_fut, lookup);
FUTURE_CHAIN_ENTRY(struct hashmap_entry_set_state_fut, set_state);
FUTURE_CHAIN_ENTRY_LAST(struct chain_entries_rerun_fut, entries_rerun);
struct future_chain_entry *entriesp[2];
};
struct hashmap_lookup_lock_entry_output {
struct hashmap_entry *hme;
};
FUTURE(hashmap_lookup_lock_entry_fut, struct hashmap_lookup_lock_entry_data,
struct hashmap_lookup_lock_entry_output);
Here, struct hashmap_lookup_lock_entry_fut
future consists of three other futures,
struct hashmap_lookup_fut
that was discussed, struct hashmap_entry_set_state_fut
,
and ‘struct chain_entries_rerun_fut’ that follow the same principles the lookup
future does.
struct hashmap_lookup_lock_entry_fut
is a chained future responsible for finding an appropriate
hashmap entry and locking it in the HASHMAP_ENTRY_STATE_LOCKED
state. Hashmap entry locking ensures
that no other future will interact with this hashmap entry until we finish processing it and change its state.
If we would like to store some additional data in the struct hashmap_lookup_lock_entry_data
structure, besides the future entries, we can use the FUTURE_CHAIN_ENTRY_LAST(_future_type, _name)
macro.
It works as the FUTURE_CHAIN_ENTRY(...)
macro, but we should use it to define the last future entry.
Then, we can safely declare some additional data after it. For example, struct hashmap_lookup_lock_entry_data
stores entriesp
, an array of pointers to struct future_chain_entry
. We use entriesp
to save the chained future
entries that should be re-run. Then, we employ the chain_entry_rerun_fut
future to re-run those future entries.
We should declare additional chained future data only after using the
FUTURE_CHAIN_ENTRY_LAST(...)
macro.
Initializing a chained future
Chained futures are initialized quite differently from regular futures. Each member future
must be initialized using either FUTURE_CHAIN_ENTRY_INIT(_entry, _fut, _map, _map_arg)
or FUTURE_CHAIN_ENTRY_LAZY_INIT(_entry, _init, _init_arg, _map, _map_arg)
macro.
We will focus on FUTURE_CHAIN_ENTRY_INIT(...)
macro.
FUTURE_CHAIN_ENTRY_INIT(...)
takes four parameters. _entry
is a pointer to the
future entry. _fut
is an initialized future structure. _map
defines the mapping behavior
of the data between the future currently being initialized and the next future in the chain.
Mapping function of the last future entry should map the data between this entry and the chained
future containing it. _map_arg
is the mapping function argument.
static struct hashmap_lookup_lock_entry_fut
hashmap_lookup_lock_entry(struct hashmap *hm, uint64_t key,
enum hashmap_entry_state state)
{
struct hashmap_lookup_lock_entry_fut chain;
/* Initialize chained future entries */
FUTURE_CHAIN_ENTRY_INIT(&chain.data.lookup,
hashmap_lookup(hm, key, state),
lookup_to_set_state_map, NULL);
FUTURE_CHAIN_ENTRY_INIT(&chain.data.set_state,
hashmap_entry_set_state(NULL, state,
HASHMAP_ENTRY_STATE_LOCKED),
NULL, NULL);
FUTURE_CHAIN_ENTRY_LAZY_INIT(&chain.data.entries_rerun,
chain_entry_rerun_init, NULL, NULL, NULL);
/* Set default chained future output value */
chain.output.hme = NULL;
FUTURE_CHAIN_INIT(&chain);
return chain;
}
In this code fragment we create and initialize a struct hashmap_lookup_lock_entry_fut
future structure and its future entries.
FUTURE_CHAIN_ENTRY_INIT(&chain.data.lookup,
hashmap_lookup(hm, key, state),
lookup_to_set_state_map, NULL);
We initialize the lookup
future entry using FUTURE_CHAIN_ENTRY_INIT(...)
macro.
&chain.data.lookup
is an address of the lookup
future entry. hashmap_lookup(hm, key, state)
returns a new, initialized struct hashmap_lookup_fut
future. lookup_to_set_state_map
maps the
data from lookup
future to the set_state
future.
static void
lookup_to_set_state_map(struct future_context *lookup_ctx,
struct future_context *set_state_ctx, void *arg)
{
struct hashmap_lookup_output *lookup_unoccupied_output =
future_context_get_output(lookup_ctx);
struct hashmap_entry_set_state_data *set_state_data =
future_context_get_data(set_state_ctx);
struct hashmap_entry *hme = lookup_unoccupied_output->hme;
if (hme == NULL) {
/*
* Entry lookup failed, no need to lock the entry in
* 'locked' state.
*/
set_state_ctx->state = FUTURE_STATE_COMPLETE;
}
set_state_data->hme = hme;
}
In the lookup_to_set_state_map
mapping function we access the data of lookup
and set_state
futures through their contexts. Then, we check if the lookup
future has found a
hashmap entry. If lookup
future didn’t find an entry, we mark the set_state
future as completed.
set_state
will not be executed when it’s marked as completed.
Lastly, we set the lookup
output entry in the set_state
input data.
We initialize the set_state
, and ’entries_rerun’ futures entries similarly to the lookup
future
entry.
FUTURE_CHAIN_INIT(&chain);
After initializing each chained future entry, we initialize the chained future by passing
its address to the FUTURE_CHAIN_INIT(...)
macro. hashmap_lookup_lock_entry_fut
future
is now ready to be polled.
Initializing chained future entries lazily
We can also initialize chained future entries lazily using
FUTURE_CHAIN_ENTRY_LAZY_INIT(_entry, _init, _init_arg, _map, _map_arg)
macro.
_entry
, _map
, and _map_arg
parameters are the same as in the FUTURE_CHAIN_ENTRY_INIT(...)
macro.
_init
is an entry initialization function that has to conform to the
typedef void (*future_init_fn)(void *future, struct future_context *chain_fut, void *arg)
definition,
_init_arg
is the initialization function argument.
Lazily initialized future entry is initialized right before its execution.
We can find an example of a lazy entry initialization in the hashmap_get_copy_fut
future.
static struct hashmap_get_copy_fut
hashmap_get_copy(struct vdm *vdm, struct hashmap *hm, uint64_t key)
{
struct hashmap_get_copy_fut chain;
/* Initialize chained future entries */
FUTURE_CHAIN_ENTRY_INIT(&chain.data.lookup_lock_entry,
hashmap_lookup_lock_entry(hm, key,
HASHMAP_ENTRY_STATE_PRESENT), NULL,
NULL);
FUTURE_CHAIN_ENTRY_LAZY_INIT(&chain.data.memcpy_value,
memcpy_value_init, vdm, NULL, NULL);
FUTURE_CHAIN_ENTRY_LAZY_INIT(&chain.data.set_state,
set_state_init_for_get, NULL, NULL, NULL);
/* Set default output values */
chain.output.size = 0;
chain.output.value = NULL;
FUTURE_CHAIN_INIT(&chain);
return chain;
}
hashmap_get_copy(...)
looks similar to the already discussed hashmap_lookup_lock_entry(...)
function, except we initialize some of its future entries lazily. We will focus
on the set_state
entry.
FUTURE_CHAIN_ENTRY_LAZY_INIT(&chain.data.set_state,
set_state_init_for_get, NULL, NULL, NULL);
Previously we have used hashmap_entry_set_state(...)
to initialize the set_state
entry.
Now, we pass along an address of the set_state_init_for_get(...)
function that is responsible
for initializaing the set_state
entry.
static void
set_state_init_for_get(void *future,
struct future_context *hashmap_get_copy_ctx, void *arg)
{
struct hashmap_get_copy_data *data =
future_context_get_data(hashmap_get_copy_ctx);
struct hashmap_entry_set_state_fut fut = {.output.changed = 0};
struct hashmap_entry *hme = data->lookup_lock_entry.fut.output.hme;
if (hme == NULL) {
/*
* 'lookup_lock_entry' future entry failed to find a hashmap
* entry. 'set_state' future entry shouldn't be executed.
*/
FUTURE_INIT_COMPLETE(&fut);
} else {
/*
* 'lookup_lock_entry' was successful.
* Set hashmap entry state to 'present'.
*/
fut = hashmap_entry_set_state(hme,
HASHMAP_ENTRY_STATE_LOCKED,
HASHMAP_ENTRY_STATE_PRESENT);
}
memcpy(future, &fut, sizeof(fut));
}
In the set_state_init_for_get
function, we get the output from the lookup_lock_entry
future entry and check if we found and locked a valid hashmap entry. We want to omit the
set_state
future entry execution when lookup_lock_entry
output is invalid. In that case, we
use the FUTURE_INIT_COMPLETE(_futurep)
macro to mark the set_state
future as completed.
In case the lookup_lock_entry
future output is valid, we call the hashmap_entry_set_state(...)
function with appropriate parameters. Finally, we copy the initialized future fut
to the
address pointed by the future
parameter using the memset(...)
function. future
points to the
future entry that we are currently initializing.
Summary
In summary, we have covered the future creation and initialization. We have also discussed the concept of chained futures and how we should initialize their entries. Lastly, we have talked over the lazy future initialization.
For more information about the Miniasync library, see its GitHub repository.