Monero

Classes | |
| struct | MDB_rxbody |
| struct | MDB_reader |
| struct | MDB_txbody |
| struct | MDB_txninfo |
Macros | |
| #define | DEFAULT_READERS 126 |
| #define | CACHELINE 64 |
| #define | MDB_LOCK_FORMAT |
| #define | MDB_LOCK_TYPE |
Typedefs | |
| typedef struct MDB_rxbody | MDB_rxbody |
| typedef struct MDB_reader | MDB_reader |
| typedef struct MDB_txbody | MDB_txbody |
| typedef struct MDB_txninfo | MDB_txninfo |
Enumerations | |
| enum | { MDB_lock_desc } |
Readers don't acquire any locks for their data access. Instead, they simply record their transaction ID in the reader table. The reader mutex is needed just to find an empty slot in the reader table. The slot's address is saved in thread-specific data so that subsequent read transactions started by the same thread need no further locking to proceed.
If MDB_NOTLS is set, the slot address is not saved in thread-specific data.
No reader table is used if the database is on a read-only filesystem, or if MDB_NOLOCK is set.
Since the database uses multi-version concurrency control, readers don't actually need any locking. This table is used to keep track of which readers are using data from which old transactions, so that we'll know when a particular old transaction is no longer in use. Old transactions that have discarded any data pages can then have those pages reclaimed for use by a later write transaction.
The lock table is constructed such that reader slots are aligned with the processor's cache line size. Any slot is only ever used by one thread. This alignment guarantees that there will be no contention or cache thrashing as threads update their own slot info, and also eliminates any need for locking when accessing a slot.
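The padding described above can be sketched as follows. This is a minimal illustration, not the library's actual definition: the type names `slot_body` and `padded_slot`, the field names, and the helper `slot_stride` are hypothetical. It rounds the slot size up to a multiple of `CACHELINE` by placing the body in a union with a pad array.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64  /* same value as the macro documented on this page */

/* Hypothetical slot body; field names are illustrative only. */
typedef struct slot_body {
    volatile uint64_t txnid;  /* snapshot the reader is using */
    volatile int32_t  pid;    /* owning process, 0 if the slot is free */
    volatile uint64_t tid;    /* owning thread */
} slot_body;

/* Round the body size up to a multiple of CACHELINE with a pad array. */
typedef union padded_slot {
    slot_body body;
    char pad[(sizeof(slot_body) + CACHELINE - 1) & ~(size_t)(CACHELINE - 1)];
} padded_slot;

static_assert(sizeof(padded_slot) % CACHELINE == 0,
              "slot size must be a multiple of the cache line size");

/* Byte distance between two adjacent slots in an array. */
static size_t slot_stride(void)
{
    padded_slot table[2];
    return (size_t)((char *)&table[1] - (char *)&table[0]);
}
```

Padding the size only keeps slots on separate cache lines if the first slot itself starts on a cache line boundary. A table placed at a cache-line-multiple offset in a page-aligned memory map would satisfy that, but the sketch does not enforce it.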
A writer thread will scan every slot in the table to determine the oldest outstanding reader transaction. Any freed pages older than this will be reclaimed by the writer. The writer doesn't use any locks when scanning this table. This means that there's no guarantee that the writer will see the most up-to-date reader info, but that's not required for correct operation: all we need is to know the upper bound on the oldest reader; we don't care at all about the newest reader. So the only consequence of reading stale information here is that old pages might hang around a while longer before being reclaimed. That's actually good anyway, because the longer we delay reclaiming old pages, the more likely it is that a string of contiguous pages can be found after coalescing old pages from many old transactions together.
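The lock-free scan can be sketched as a simple minimum search. The types and names here (`reader_slot`, `find_oldest`) are hypothetical and simplified: the sketch assumes a slot with `pid == 0` is unused, and it ignores the padding, shared memory, and volatile reads a real table would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified slot; real slots are padded and shared between processes. */
typedef struct reader_slot {
    int32_t  pid;    /* 0 means the slot is unused (assumption of this sketch) */
    uint64_t txnid;  /* transaction ID recorded by the reader */
} reader_slot;

/* Return the smallest transaction ID recorded by any occupied slot,
 * or `current` if no occupied slot holds a smaller one.
 * No lock is taken; a stale value only makes the result older. */
static uint64_t find_oldest(const reader_slot *table, size_t nslots, uint64_t current)
{
    assert(table != NULL || nslots == 0);
    uint64_t oldest = current;
    for (size_t i = 0; i < nslots; i++) {
        if (table[i].pid != 0 && table[i].txnid < oldest)
            oldest = table[i].txnid;
    }
    return oldest;
}
```

If a reader finishes while the scan is in progress, the writer may still see its old ID and return a smaller value than strictly necessary. That errs on the safe side: pages are kept longer, never reclaimed too early.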
| #define CACHELINE 64 |
The size of a CPU cache line in bytes. We want our lock structures aligned to this size to avoid false cache line sharing in the lock table. This value works for most CPUs. For Itanium this should be 128.
| #define DEFAULT_READERS 126 |
Number of slots in the reader table. This value was chosen somewhat arbitrarily. 126 readers plus a couple of mutexes fit exactly into 8KB on my development machine. Applications should set the table size using mdb_env_set_maxreaders().
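The 8KB figure can be checked with a little arithmetic. The sketch below assumes that every reader slot is padded to exactly one cache line and that the table header (including the mutexes) occupies two cache-line-sized blocks. Neither assumption is stated on this page; `HEADER_BLOCKS` and `lock_table_bytes` are hypothetical names.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

#define CACHELINE       64
#define DEFAULT_READERS 126

/* Assumption for illustration: the header takes two cache-line-sized blocks. */
#define HEADER_BLOCKS   2

/* Total table size in bytes for a given number of reader slots. */
static size_t lock_table_bytes(size_t readers)
{
    return (HEADER_BLOCKS + readers) * (size_t)CACHELINE;
}

/* (2 + 126) * 64 = 128 * 64 = 8192 bytes = 8KB */
static_assert((HEADER_BLOCKS + DEFAULT_READERS) * CACHELINE == 8192,
              "default table should fill exactly 8KB under these assumptions");
```

Under these assumptions, a table sized with a different reader count grows by one cache line per additional slot.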
| #define MDB_LOCK_FORMAT |
Lockfile format signature: version, features, and field layout.
| #define MDB_LOCK_TYPE |
Lock type and layout. Values 0-119. _WIN32 implies MDB_PIDLOCK. Some low values are reserved for future tweaks.
| typedef struct MDB_reader MDB_reader |
The actual reader record, with cacheline padding.
| typedef struct MDB_rxbody MDB_rxbody |
The information we store in a single slot of the reader table. In addition to a transaction ID, we also record the process and thread ID that owns a slot, so that we can detect stale information, e.g. threads or processes that went away without cleaning up.
| typedef struct MDB_txbody MDB_txbody |
The header for the reader table. The table resides in a memory-mapped file. (This is a different file from the one used for the main database.)
For POSIX the actual mutexes reside in the shared memory of this mapped file. On Windows, mutexes are named objects allocated by the kernel; we store the mutex names in this mapped file so that other processes can grab them. This same approach is also used on MacOSX/Darwin (using named semaphores) since MacOSX doesn't support process-shared POSIX mutexes. For these cases where a named object is used, the object name is derived from a 64-bit FNV hash of the environment pathname. As such, naming collisions are extremely unlikely. If a collision occurs, the results are unpredictable.
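Deriving an object name from a 64-bit FNV hash could look roughly like this. The text above only says "FNV"; the sketch assumes the FNV-1a variant with the standard 64-bit offset basis and prime. The name format produced by `make_object_name` is purely hypothetical and is not the format the library uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 64-bit FNV-1a over a NUL-terminated string (variant assumed, see lead-in). */
static uint64_t fnv1a_64(const char *s)
{
    assert(s != NULL);
    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV 64-bit offset basis */
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;              /* mix in one byte */
        h *= 0x100000001b3ULL;               /* FNV 64-bit prime */
    }
    return h;
}

/* Hypothetical name format: "/MDB", a kind character ('r' or 'w'),
 * then the hash as 16 hex digits. Returns snprintf's result. */
static int make_object_name(char *buf, size_t len, const char *path, char kind)
{
    return snprintf(buf, len, "/MDB%c%016llx",
                    kind, (unsigned long long)fnv1a_64(path));
}
```

Two processes that open the same environment path compute the same name and therefore reach the same kernel object. Two different paths that happen to hash to the same value would share an object as well, which is the collision case described above.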
| typedef struct MDB_txninfo MDB_txninfo |
The actual reader table definition.
| anonymous enum |
| Enumerator | |
|---|---|
| MDB_lock_desc | Magic number for lockfile layout and features. This attempts to stop liblmdb variants compiled with conflicting options from using the lockfile at the same time and thus breaking it. It describes locking types, and sizes and sometimes alignment of the various lockfile items. The detected ranges are mostly guesswork, or based simply on how big they could be without using more bits, so we can tweak them in good conscience when updating MDB_LOCK_VERSION. |