An introduction to pmemobj (part 2) - transactions
By now you should be fairly familiar with the basics persistent memory programming, but to make sure the application is always in a consistent state you had to rely on your own solutions and tricks - like the length of a buffer in the previous example. Now, we will learn a generic solution provided by pmemobj to this type of problems - transactions. For now we will focus on a single-threaded applications with no locking.
Transactions are managed by the usage of
pmemobj_tx_* family of functions. A single transaction goes through a series of stages listed in
enum pobj_tx_stage and illustrated by the following diagram:
You can see here how to use each of the stage-managing functions. The
pmemobj_tx_process function can be used instead of others to move the transaction forward - you can call it if you don’t know in which stage you are currently in. All of this can get fairly complicated, for more information please check out the manpage. To avoid having to micro-manage this entire process the pmemobj library provides a set of macros that are built on top of these functions that greatly simplify using the transactions and this tutorial will exclusively use them.
So, this is how an entire transaction block looks like:
As you can see this is very closely correlated with the lifecycle diagram, and pretty much self-explanatory. All of the code blocks apart from the
TX_END are optional. You can also nest transactions without any limits, and I guess recursive transactions are also technically OK. If a nested transaction aborts, the entire transaction aborts.
You might wonder why there’s the
TX_FINALLY stage, why not just execute that code after the transaction block - well, due to the way our transactions work (
setjmp at the begin and
longjmp on all aborts), you are not guaranteed that the code that is directly after the
TX_END in a nested transaction is going to execute at all. Consider the following code:
This snippet has a memory leak. The
free will never be called, because the
do_work will eventually make a
longjmp back to the outer transaction. The correct way of implementing the
do_work is to use
This is OK because it’s guaranteed that the finally block will always be executed.
Please also note the usage of
volatile-qualified variable in TX_FINALLY block.
This is because local non-volatile qualified objects have undefined values after
execution of longjmp if their value have changed after setjmp. So in the case
of libpmemobj transaction blocks every local variable modified in TX_STAGE_WORK and used in
TX_STAGE_ONABORT/TX_STAGE_FINALLY needs to be volatile-qualified - otherwise you might
encounter undefined behavior. Please see the CAVEATS section in libpmemobj
manpage for more information.
Our library distinguishes 3 different transactional operations: allocation, free and set. Right now we will learn only about the last one, which - as the name suggests, is used to safely set a memory block to some value. This is realized by 2 API functions:
pmemobj_tx_add_range_direct. Quoting the documentation:
Takes a “snapshot” of the memory block … and saves it in the undo log. The application is then free to directly modify the object in that memory range. In case of failure or abort, all the changes within this range will be rolled-back automatically.
What this means is that when you call any of those two functions, a new object is allocated and the existing content of the memory range is copied into it. Unless the library needs that old memory in the transaction rollback, that object will be discarded.
Also, note that the library assumes that when you add the memory range you intend to write to it and the memory is automatically persisted when committing the transaction - so you don’t have to call
So how to use those functions? The
pmemobj_tx_add_range takes a raw persistent memory pointer (PMEMoid), an offset from it and its size. So let’s set some values inside this structure:
The naive way of doing it might look like this:
But this isn’t very optimal - you are adding three objects to the undo log. I think it’s sufficient to say that a single undo log entry has size equal to, at minimum, 128 bytes. It is way better to just add the entire object at once:
This way you don’t waste unnecessary memory for metadata and the effect will be exactly the same. And it even looks better.
pmemobj_tx_add_range_direct does the same thing, but in a more convenient way for some uses. It takes a direct reference to a field and its size, for example:
This is useful when you don’t have an easy way of accessing the PMEMoid this memory block belongs to.
Conditional transaction blocks
It might seem that
TX_ONABORT explanation isn’t really required, one is called when the transaction commits and the other one when it aborts - simple as that. As long as there are no inner transactions, that is true. But once we start nesting, things get a little bit more complicated. Consider the following example:
So, a hashmap with entries in persistent memory but with a volatile table containing them. Is this code correct? Well, the
hash_set on its own is perfectly OK - but not inside another transaction. When the
pmemobj_tx_abort function is called everything in the
TX_BEGIN block is reverted but the
TX_ONCOMMIT of the nested transaction was already executed (and the
TX_ONABORT won’t be called in that function), the end result is an invalid persistent pointer in the volatile table. This is generally difficult to solve and requires per-problem solution - I recommend designing your applications to actively avoid it. For this specific use case you can have an extra
hash_revert_previous function that is called from the
TX_ONABORT block of the outer-most transaction.
The intended use of the
TX_ONABORT is to print log information and set return variable of the function with nested transaction, like so:
We are going to modify the previous example to use a transaction instead of storing the length of a buffer. Open the
writer.c and modify the lines after scanf to look like this:
And that’s about it, looks simpler right? And more similar to how a volatile program might do this. The
reader.c doesn’t change much, just remove the
if statement that checks the buffer length and you are good to go. If you are having trouble correctly modifying the code, you can find the complete example here.
This can be further simplified by combining the two lines inside the transaction together - but that’s the topic for our next post.