This project is read-only.

General information

Why and when should I use it?

In cases where the data storage is required, but the file system is not suitable, and large DBMS`s have too much overhead. For example: query caches, message queues, large amount of temporary data etc.

Keys

Keys must implement the IComparable interface.
Serialization and deserialization methods should be provided. Binary representation of key has size limit, which depends on selected page size. Each index page must have a place for at least three keys in binary format. This requirement follows from the B+Tree properties.

Existing key always corresponds to a single value.

Values

Values ​​have no restrictions on type. Serialization and deserialization methods should also be provided. The size of value have no reasonable limit. In most cases large values will be limited by the drive size.

Creating storage

First we need to create a storage instance.

var factory = new StorageFactory();
var storage = factory.CreateStorage<int, string>( 
                    BitConverter.GetBytes,               // key serialization
                    p => BitConverter.ToInt32(p, 0),     // key deserialization
                    p => Encoding.UTF8.GetBytes(p),      // value serialization
                    p => Encoding.UTF8.GetString(p),     // value deserialization
                    StorageSettings.Default(sizeof(int)));

Here we provide types for keys and values, serialization methods and storage settings.
Please note that storage instance implements IDisposible interface. We should use storage instance in using block or call Dispose() in appropriate place to successfully shutdown.

To create storage on disk or open an already existing storage use one of

storage.OpenExisting(string path)
storage.CreateNew(string path)
storage.OpenOrCreate(string path)

These methods takes a directory string as a parameter.

On-disk storage structure consists of three files:
  • info - xml-file containing common information about storage
  • pagemap - the mapping of virtual page addresses to on-disk offsets
  • storage - the main storage file containing keys, values and index data

Operations

Storage instance returned by the Create() method supports following operations:
  • Get(TKey key) - gets the value by its key.
  • Set(TKey key, TValue value) - inserts or updates key value pair
  • Remove(TKey key) - removes key-value pair by key
  • Exists(TKey key) - cheks if key-value pair exists in tree
  • GetRawDataLength(TKey key) - retrieves the length (in bytes) of binary representation of the value referenced by the specified key.
  • GetRawDataSegment(TKey key, long startIndex, long endIndex) - retrieves a segment of binary representation of the value referenced by the specified key

If we need B+Tree specific operations
  • Min() - gets the minimal key.
  • Max() - gets the maximal key.
  • PreviousTo(TKey key) - gets the key previous to the specified key, the existence of the specified key is not required
  • NextTo(TKey key) - gets the key next to the specified key, the existence of the specified key is not required.
  • MinValue() - gets a value corresponing to the minimal key.
  • MaxValue() - gets the value corresponing to the maximal key
  • Count() - computes the count of key-value pairs in tree.
we have to perform a type cast to the IBPlusTreeKeyValueStorage<TKey, TValue>:
var storage = (IBPlusTreeKeyValueStorage<KeyOf<int>, ValueOf<string>>)factory.CreateStorage<int, string>( 
                    BitConverter.GetBytes,               // key serialization
                    p => BitConverter.ToInt32(p, 0),     // key deserialization
                    p => Encoding.UTF8.GetBytes(p),      // value serialization
                    p => Encoding.UTF8.GetString(p),     // value deserialization
                    StorageSettings.Default(sizeof(int)));

What about ACID?
  • atomicity - any single operation is atomic
  • consistency - any operation transfer storage to a consistent state
  • isolation - all single operations are isolated from each other
  • durability - durability of performed updates is achieved by calling the Flush() method

However, transactions in the sense of unit-of-work are not supported. Thus, we can not produce long series of changes, and then rollback or commit them.

Settings

When creating a storage instance, we have provided the StorageSettings instance. This object defines the general properties of creating storage, which affect disk and memory usage and are important for performance. We use default instance with custom key length passed as a parameter. However, other properties may be useful. Take a closer look:
  • PageSize - gets or sets size of the on-disk page (in bytes)
  • CacheSettings - gets or sets the page-level caching, set this to null to disable caching.
  • ForcedWrites - gets the value indicating whether all write operations perform immediatly to file storage. Setting to true disables only OS file system buffer on write operations. This does not affect the page caching behavior. Disable caching if you don't need it or call IStorage.Flush() to control the durability of all updates.
  • MaxKeySize - gets or sets the maximal length of key (in bytes)
  • MaxEmptyPages - gets or sets the maximal number of released pages. Vacuum procedure starts immediately when this value is reached.

Cache settings:
  • MaxCachedPages - gets or sets the maximum number of cached pages.
  • MaxDirtyPages - gets or sets the maximum number of updated pages.

Last edited Feb 14, 2014 at 7:10 AM by victor_scherbakov, version 9