C++ persistent containers
PMEM containers
Our goal for the libpmemobj C++ bindings is to create a friendly and less error-prone API for persistent memory programming. Even with persistent memory pool allocators, a convenient interface for creating and managing transactions, auto-snapshotting class templates and smart persistent pointers, designing an application that uses persistent memory may still prove challenging without the many niceties that C++ programmers are used to.
The natural next step to make persistent memory programming easier is to provide programmers with efficient and useful containers.
STL containers
Code reuse is a common programming principle, and we took it to heart when, some time ago, we experimented with persistent memory containers by introducing a custom persistent allocator for libc++ STL containers. You can read more about this approach in this blog post.
As it turns out, using existing STL containers comes with obstacles that are hard to overcome. The two main downsides are:
- Implementation details: STL containers don’t use algorithms that are optimal from a persistent memory programming point of view. PMEM containers should provide durability and consistency, while not every STL method guarantees strong exception safety. Persistent memory containers should also be designed with fragmentation limitations in mind.
- Memory layout: the STL does not guarantee that a container's layout will remain unchanged in new library versions.
Assumptions
We ultimately came to the conclusion that implementing containers with optimized on-media layouts and algorithms to fully exploit persistent memory potential should be one of our primary focuses. Their methods should guarantee atomicity, consistency and durability.
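The atomicity requirement can be pictured with a small sketch in ordinary volatile memory. It is only an analogy: the helper name atomic_update is hypothetical, it is not part of libpmemobj-cpp, and a plain copy stands in for the undo log that a persistent memory transaction would keep. It also says nothing about durability, which needs real persistent memory.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper (not a libpmemobj-cpp API): apply `update` to `data`
// so that the change is all-or-nothing. If `update` throws, `data` is
// restored to the state it had before the call.
template <typename Container, typename Update>
void atomic_update(Container &data, Update update)
{
    Container snapshot = data; // volatile stand-in for an undo log
    try {
        update(data);
    } catch (...) {
        data = std::move(snapshot); // roll back the partial modification
        throw;
    }
}

// A failing update leaves the vector exactly as it was.
inline void demo_rollback()
{
    std::vector<int> v{1, 2, 3};
    try {
        atomic_update(v, [](std::vector<int> &d) {
            d.push_back(4); // partial change...
            throw std::runtime_error("simulated failure"); // ...then failure
        });
    } catch (const std::runtime_error &) {
        // Expected: the helper rethrows after rolling back.
    }
    assert(v.size() == 3);
    assert(v.back() == 3);
}
```

Copying the whole container is the simplest way to show the idea; a container designed for persistent memory would instead snapshot only the memory ranges it is about to modify.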
Apart from specific internal implementation details, PMEM containers will have a well-known STL-like interface and will work with STL algorithms. Since they will extend the libpmemobj-cpp project, the goal is to implement them using the libpmemobj-cpp bindings and make them easily accessible through its interface.
Because of the API similarities with the STL, we are reusing the ideas behind the libc++ container tests and simply porting them to our implementation and project.
Limitations
C++ language restrictions and the persistent memory programming paradigm impose serious restrictions on objects that may be stored on a persistent medium. This topic deserves a separate blog post, which should cover language-based undefined behavior, object lifetime and compiler compatibility, to name just a few.
You can expect a blog post about this in the near future, but for now I will just list the most important issues:
- Implicit creation of objects stored on a persistent medium (the C++ standard states that the lifetime of an object is a runtime property).
- memcpy-ing objects which don’t satisfy the requirements of is_trivially_copyable (we call neither constructors nor destructors when snapshotting memory areas).
- Object representation (layout) might differ between compilers, compiler flags and ABIs. C++11 provides the is_standard_layout type trait, but StandardLayoutType is a very restrictive set of requirements (see here).
- Usage of volatile memory pointers in persistent structures should be treated as a design error.
You can read more about template parameter type restrictions for the pmem::obj::persistent_ptr<T> class in the PMDK C++ bindings doxygen documentation.
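Some of the restrictions listed above can be checked at compile time with standard type traits. The sketch below is only an illustration: is_pmem_friendly, sensor_record, with_vtable, holds_raw_pointer and copy_bytes are hypothetical names, not part of libpmemobj-cpp, and passing the check does not by itself make a type safe to store in persistent memory.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical record type intended for persistent storage.
struct sensor_record {
    long long timestamp;
    double value;
    char tag[16];
};

// Has a virtual destructor, so it is neither trivially copyable
// nor standard layout.
struct with_vtable {
    virtual ~with_vtable() {}
    int x;
};

// Passes both traits, yet storing a volatile memory pointer persistently
// is still a design error: type traits cannot detect this case.
struct holds_raw_pointer {
    int *cache;
};

// Hypothetical trait combining the two standard checks from the list above.
template <typename T>
struct is_pmem_friendly
    : std::integral_constant<bool,
                             std::is_trivially_copyable<T>::value &&
                                 std::is_standard_layout<T>::value> {};

static_assert(is_pmem_friendly<sensor_record>::value,
              "plain aggregate passes both checks");
static_assert(!is_pmem_friendly<std::string>::value,
              "std::string is not trivially copyable");
static_assert(!is_pmem_friendly<with_vtable>::value,
              "polymorphic types fail both checks");

// Byte-wise copy, similar in spirit to snapshotting a memory area.
// This is only well-defined because sensor_record is trivially copyable.
inline sensor_record copy_bytes(const sensor_record &src)
{
    sensor_record dst;
    std::memcpy(&dst, &src, sizeof(dst));
    assert(std::memcmp(&dst, &src, sizeof(dst)) == 0);
    return dst;
}
```

The holds_raw_pointer case shows why such checks are necessary but not sufficient: pointer members still have to be reviewed by hand or replaced with persistent pointers.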
Data structures
Implementing containers from scratch will be a long effort, so the choice of which container types to implement, and in what order, is quite important.
We have already finished the first container, an array. It is included in the pmem::obj::experimental namespace, and you can read more about it in this blog post.
We have been working on the implementation of vector for some time, and the next scheduled containers are string, map and unordered_map. We might change this order, because we are considering an implementation of an optimized vector, based on the idea of the std::colony container, which will internally use pmem::obj::vector. This approach should reduce fragmentation and bring performance benefits similar to those presented in the C++ standard library proposal paper.
Usage
Persistent memory is an emerging technology, and as we mentioned in one of the previous blog posts, we truly believe that it will be paradigm-shifting.
At this point, it is hard to predict specific use cases for persistent memory containers. They can be used as persistent scratch pads, extensions for in-memory databases, or fast and flexible data storage. We also believe that functional programming ideas can be used in implementing persistent memory programming models.
We will continue our research into purely functional persistent data structures, concepts of filters and further optimizations (like the pmem::obj::slice idea, also described in the blog post about array).