WitDatabase provides three storage engines optimized for different workloads. Choosing the right engine for your application can significantly impact performance. This guide explains how each engine works, when to use it, and how to configure it for optimal results.


1. Overview

What is a Storage Engine?

A storage engine is the component responsible for physically storing and retrieving data on disk (or in memory). It determines how keys and values are organized, how quickly operations complete, and how much disk space is used.

WitDatabase's architecture separates the storage engine from higher layers (SQL, transactions, encryption), allowing you to choose the best engine for your workload without changing your application code.

Available Engines

WitDatabase offers three storage engines:

Engine | Implementation | Optimized For | Storage Format
B+Tree | StoreBTree | General use, reads | Single file
LSM-Tree | StoreLsm | Write-heavy workloads | Directory with multiple files
In-Memory | StoreInMemory | Testing, caching | RAM only

Architecture

All storage engines implement the same IKeyValueStore interface, making them interchangeable:

[[Svg Src="./witdatabase-ikeyvaluestore-engines.svg" Alt="Witdatabase Ikeyvaluestore Engines"]]
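The contract itself is small. As a sketch of what it might look like (member names here are illustrative guesses, not WitDatabase's actual API), any type that exposes point operations plus an ordered scan can serve as an engine:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the shared contract. Member names are illustrative;
// string keys are used for brevity (a real store likely works on byte spans).
public interface IKeyValueStore : IDisposable
{
    string? Get(string key);
    void Put(string key, string value);
    bool Delete(string key);
    IEnumerable<KeyValuePair<string, string>> Scan(string fromKey); // ordered
}

// A toy engine behind the same interface: because higher layers depend only
// on the contract, engines can be swapped without touching application code.
public sealed class ToyStore : IKeyValueStore
{
    private readonly SortedDictionary<string, string> _map = new(StringComparer.Ordinal);

    public string? Get(string key) => _map.TryGetValue(key, out var v) ? v : null;
    public void Put(string key, string value) => _map[key] = value;
    public bool Delete(string key) => _map.Remove(key);
    public IEnumerable<KeyValuePair<string, string>> Scan(string fromKey) =>
        _map.Where(kv => string.CompareOrdinal(kv.Key, fromKey) >= 0);
    public void Dispose() { }
}
```

Swapping StoreBTree for StoreLsm is then a construction-time decision, not a code change.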

Key Characteristics Comparison

Before diving into details, here's a high-level comparison:

Characteristic | B+Tree | LSM-Tree | In-Memory
Read Performance | Excellent | Good | Fastest
Write Performance | Good | Excellent | Fastest
Range Scan | Excellent | Very Good | Excellent
Space Efficiency | Very Good | Good (needs compaction) | N/A
Memory Usage | Lower | Higher | Highest
Durability | Yes | Yes | No
Crash Recovery | Yes | Yes | No
Concurrency | Good | Very Good | Excellent

Quick Selection Guide

Choose your engine based on your primary workload:

Choose B+Tree when:

  • Your application has mixed read/write patterns
  • You need predictable, consistent performance
  • Random access (point queries) is common
  • You prefer simple single-file storage
  • This is a general-purpose application

Choose LSM-Tree when:

  • Writes significantly outnumber reads
  • You're building a logging or event system
  • Data arrives in streams or batches
  • You can tolerate slightly slower reads for much faster writes
  • Background processing (compaction) is acceptable

Choose In-Memory when:

  • Data doesn't need to survive application restart
  • You're running unit tests
  • You need a temporary cache or scratchpad
  • Maximum speed is required and data fits in RAM

2. B+Tree Engine

The B+Tree engine is the default and recommended choice for most applications. It provides excellent read performance with good write performance, making it ideal for general-purpose workloads.

How B+Tree Works

A B+Tree is a self-balancing tree structure where all data is stored in leaf nodes, and internal nodes contain only keys for navigation. This design provides efficient lookups, insertions, and range scans.

[[Svg Src="./witdatabase-btree-structure.svg" Alt="Witdatabase Btree Structure"]]

Key properties of B+Tree:

  • All values are stored in leaf nodes
  • Leaf nodes are linked for fast sequential access
  • Tree is always balanced (all leaves at same depth)
  • Logarithmic time complexity: O(log n) for all operations

Page-Based Storage

The B+Tree engine organizes data into fixed-size pages (default 4KB):

Page Type | Purpose
Header Page | Database metadata, root page pointer
Internal Pages | Navigation keys, child pointers
Leaf Pages | Actual key-value data
Overflow Pages | Large values that don't fit in leaf pages

When a page fills up, it splits into two pages and the tree rebalances. This keeps the tree height minimal (typically 3-4 levels for millions of records).
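The "3-4 levels" figure follows from the branching factor: with fanout f keys per internal page, the height grows as log base f of n. A quick sketch (the fanout of 100 is an assumed figure; real fanout depends on key and page size):

```csharp
using System;

// Levels needed to index n records with a given fanout (keys per internal
// page). A fanout of 100 is an assumed figure for small keys in a 4 KB page.
static int TreeHeight(long records, int fanout)
{
    int height = 1;
    long capacity = fanout;          // records reachable at this height
    while (capacity < records)
    {
        capacity *= fanout;
        height++;
    }
    return height;
}

Console.WriteLine(TreeHeight(1_000_000, 100));    // 3 levels for a million records
Console.WriteLine(TreeHeight(100_000_000, 100));  // 4 levels for a hundred million
```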

Configuration

Configure the B+Tree engine via connection string or builder pattern:

// Via connection string
"Data Source=app.witdb;Store=btree;Page Size=4096;Cache Size=2000"

// Via builder pattern
var db = new WitDatabaseBuilder()
    .WithFilePath("app.witdb")
    .WithBTree()
    .WithPageSize(4096)      // 4KB pages (default)
    .WithCacheSize(2000)     // 2000 pages in cache
    .Build();

Configuration Parameters

Parameter | Default | Range | Description
Page Size | 4096 | 4096-65536 | Page size in bytes. Larger pages can improve sequential read performance but use more memory.
Cache Size | 1000 | 1-100000 | Number of pages to cache in memory. More cache = faster reads, more memory.
Cache | clock | clock, lru | Cache eviction algorithm. Clock is better for concurrent access.
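For intuition, the clock algorithm can be sketched in a few lines: it approximates LRU with a single reference bit per frame, avoiding the list manipulation (and lock contention) of a true LRU. This is an illustrative toy, not the engine's actual cache:

```csharp
using System;

// Toy clock (second-chance) eviction over a fixed set of cache frames.
// Each frame gets a reference bit on access; on a miss the hand sweeps,
// clearing bits, and evicts the first frame whose bit is already clear.
public sealed class ClockCache
{
    private readonly int[] _pages;       // page id held by each frame, -1 = empty
    private readonly bool[] _referenced; // second-chance bits
    private int _hand;

    public ClockCache(int frames)
    {
        _pages = new int[frames];
        _referenced = new bool[frames];
        Array.Fill(_pages, -1);
    }

    // Returns true on a cache hit, false on a miss (after loading the page).
    public bool Access(int pageId)
    {
        for (int i = 0; i < _pages.Length; i++)
            if (_pages[i] == pageId) { _referenced[i] = true; return true; }

        while (_referenced[_hand])           // sweep: give second chances
        {
            _referenced[_hand] = false;
            _hand = (_hand + 1) % _pages.Length;
        }
        _pages[_hand] = pageId;              // evict victim, load new page
        _referenced[_hand] = true;
        _hand = (_hand + 1) % _pages.Length;
        return false;
    }
}
```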

Page Size Selection

Choosing the right page size affects performance:

Page Size | Best For | Trade-offs
4 KB | General use, small values | Default, balanced performance
8 KB | Medium-sized values | Good for documents, JSON
16 KB | Large values, sequential access | Higher memory per page
32-64 KB | Very large values, bulk reads | Significant memory overhead

As a rule of thumb, choose a page size where most of your values fit within a single page. Values larger than the page size use overflow pages, which require additional I/O.

Performance Characteristics

The B+Tree engine excels at:

Point Queries (Get by key): The tree structure ensures O(log n) lookups regardless of database size. With a million records and a typical tree height of 4, finding any key requires at most 4 page reads.

Range Scans: Leaf nodes are linked together, so scanning a range of keys is very efficient after finding the start position. This makes the B+Tree ideal for queries like "all orders from last week" or "users with names starting with 'A'".

Random Writes: While not as fast as LSM-Tree for sequential writes, B+Tree handles random inserts well. Each insert requires O(log n) operations to find the correct position and potentially split pages.

Space Efficiency: B+Tree uses space efficiently because data is stored in-place. There's no write amplification from compaction like in LSM-Tree. Deleted space is reused immediately.

When B+Tree Excels

The B+Tree engine is the best choice for:

OLTP Workloads: Applications with many small transactions that read and write individual records. Think e-commerce, CRM, or user management systems.

Read-Heavy Applications: When reads significantly outnumber writes. The balanced tree structure ensures consistent read performance.

Applications Requiring Predictable Performance: B+Tree provides consistent latency without the occasional spikes that can occur during LSM-Tree compaction.

Single-File Deployment: The entire database is contained in one file, simplifying deployment and backup.

Practical Example

// Create a B+Tree database optimized for a user management system
var db = new WitDatabaseBuilder()
    .WithFilePath("users.witdb")
    .WithBTree()
    .WithPageSize(8192)        // 8KB pages for JSON user profiles
    .WithCacheSize(5000)       // Cache hot user data
    .WithTransactions()
    .Build();

var engine = new WitSqlEngine(db, ownsStore: true);

// Create table and indexes
engine.Execute(@"
    CREATE TABLE Users (
        Id BIGINT PRIMARY KEY,
        Email VARCHAR(255) UNIQUE,
        Profile TEXT,
        CreatedAt DATETIME DEFAULT NOW()
    )");
engine.Execute("CREATE INDEX IX_Users_CreatedAt ON Users(CreatedAt)");

// Point query - O(log n), very fast
var user = engine.QueryFirstOrDefault(
    "SELECT * FROM Users WHERE Id = @id",
    new Dictionary<string, object?> { ["id"] = 12345 });

// Range scan - efficient with linked leaf nodes
var recentUsers = engine.Query(
    "SELECT * FROM Users WHERE CreatedAt > @date ORDER BY CreatedAt",
    new Dictionary<string, object?> { ["date"] = DateTime.Today.AddDays(-7) });

3. LSM-Tree Engine

The LSM-Tree (Log-Structured Merge-Tree) engine is optimized for write-heavy workloads. It achieves extremely fast writes by buffering data in memory and writing sequentially to disk, at the cost of slightly slower reads.

How LSM-Tree Works

LSM-Tree separates write and read paths to optimize for writes:

[[Svg Src="./witdatabase-lsm-write-path.svg" Alt="Witdatabase lsm write path"]]

Write process:

  1. Write is appended to WAL (for crash recovery)
  2. Write is inserted into MemTable (in-memory)
  3. When MemTable is full, it's flushed to disk as an SSTable
  4. Background compaction merges SSTables to maintain read performance

Read process:

  1. Check MemTable first (newest data)
  2. Check Bloom filters of SSTables (skip files that definitely don't have the key)
  3. Search SSTables from newest to oldest
  4. First match wins (newer data shadows older)
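The read path above can be sketched with plain dictionaries standing in for the MemTable and SSTables (illustrative only; a null value marks a tombstone):

```csharp
using System;
using System.Collections.Generic;

// Sketch of the LSM read path: check the MemTable first, then SSTables
// from newest to oldest; the first match wins, so newer data shadows older.
public static class LsmRead
{
    public static string? Get(
        SortedDictionary<string, string?> memTable,
        List<Dictionary<string, string?>> ssTablesNewestFirst,
        string key)
    {
        if (memTable.TryGetValue(key, out var v))
            return v;                        // newest data (null = deleted)

        foreach (var sst in ssTablesNewestFirst)
        {
            // A real engine consults each SSTable's Bloom filter here and
            // skips files that definitely don't contain the key.
            if (sst.TryGetValue(key, out var sv))
                return sv;
        }
        return null;                         // not found anywhere
    }
}
```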

Key Components

MemTable: An in-memory sorted data structure (typically a skip list or red-black tree) that buffers recent writes. Writes to MemTable are O(log n) but very fast since they're in memory.

SSTable (Sorted String Table): An immutable, sorted file containing key-value pairs. Once written, SSTables are never modified — they're only created by flushing MemTable or merging during compaction.

Write-Ahead Log (WAL): A sequential log of all writes. If the system crashes, the WAL is replayed to recover data that was in MemTable but not yet flushed.

Bloom Filter: A probabilistic data structure that quickly tells you if a key is definitely NOT in an SSTable. This dramatically reduces unnecessary disk reads.

Compaction: A background process that merges multiple SSTables into larger ones, removing duplicate and deleted keys. This reclaims space and maintains read performance.

Configuration

Configure the LSM-Tree engine for your workload:

// Via connection string
"Data Source=./logs;Store=lsm;MemTable Size=8388608;Block Cache Size=134217728"

// Via builder pattern
var db = new WitDatabaseBuilder()
    .WithLsmTree("./events", options =>
    {
        options.MemTableSizeLimit = 8 * 1024 * 1024;     // 8 MB MemTable
        options.EnableWal = true;                         // Enable WAL
        options.EnableBlockCache = true;                  // Enable read cache
        options.BlockCacheSizeBytes = 128 * 1024 * 1024; // 128 MB cache
        options.BackgroundCompaction = true;              // Auto-compact
        options.CompactionTrigger = 4;                    // Compact at 4 L0 files
    })
    .Build();

Configuration Parameters

Parameter | Default | Description
MemTable Size | 4 MB | Size before flushing to disk. Larger = fewer flushes, more memory
Block Cache Size | 64 MB | Read cache for SSTable blocks. Larger = faster reads
Sync Writes | true | Sync WAL on each write. Disable for speed (risk data loss)
Background Compaction | true | Automatic compaction. Disable for manual control
LSM WAL | true | Write-ahead logging. Disable only if durability isn't needed
LSM Block Size | 4096 | SSTable block size. Affects read granularity
LSM Compaction Trigger | 4 | Number of L0 files before compaction starts

Compaction

Compaction is the process of merging SSTables to maintain read performance and reclaim space:

[[Svg Src="./witdatabase-lsm-compaction.svg" Alt="Witdatabase lsm compaction"]]

What compaction does:

  • Merges overlapping key ranges
  • Removes deleted keys (tombstones)
  • Removes older versions of updated keys
  • Creates larger, more efficient files

Compaction trade-offs:

  • Uses CPU and disk I/O in the background
  • Temporarily uses extra disk space during merge
  • Can cause latency spikes if it falls behind

For most applications, leave Background Compaction = true. The engine handles scheduling automatically.
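Conceptually, the merge step can be sketched as follows, with plain dictionaries standing in for SSTables and null values for tombstones (illustrative, not the engine's implementation):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of what a compaction merge produces: overlapping SSTables are
// merged newest-first, older versions of a key are discarded, and
// tombstones (null values) are dropped from the output file.
public static class Compaction
{
    public static SortedDictionary<string, string> Merge(
        params Dictionary<string, string?>[] ssTablesNewestFirst)
    {
        var merged = new Dictionary<string, string?>();
        foreach (var sst in ssTablesNewestFirst)
            foreach (var (key, value) in sst)
                if (!merged.ContainsKey(key))
                    merged[key] = value;     // first (newest) version wins

        // Drop tombstones and emit one sorted output
        return new SortedDictionary<string, string>(
            merged.Where(kv => kv.Value != null)
                  .ToDictionary(kv => kv.Key, kv => kv.Value!),
            StringComparer.Ordinal);
    }
}
```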

Bloom Filters

Bloom filters are a key optimization that makes LSM-Tree reads much faster:

Query: Get("user:12345")

Without Bloom Filter:
  SSTable-001: Read index, search... not found ✗
  SSTable-002: Read index, search... not found ✗
  SSTable-003: Read index, search... FOUND! ✓
  (3 disk operations)

With Bloom Filter:
  SSTable-001: Bloom says "definitely no" → skip
  SSTable-002: Bloom says "definitely no" → skip
  SSTable-003: Bloom says "maybe yes" → search... FOUND! ✓
  (1 disk operation)

Bloom filters are built automatically when SSTables are created. They add minimal overhead (~1% of SSTable size) but can eliminate 99%+ of unnecessary disk reads.
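A minimal Bloom filter is just a bit array plus k hash probes. This sketch shows the core guarantee — an added key is never reported absent — using sizes chosen for illustration, not WitDatabase's actual parameters:

```csharp
using System;
using System.Collections;

// Minimal Bloom filter: k hash probes set/check bits in a fixed bit array.
// "Definitely not present" is exact; "maybe present" can be a false positive.
public sealed class BloomFilter
{
    private readonly BitArray _bits;
    private readonly int _hashes;

    public BloomFilter(int bitCount, int hashes)
    {
        _bits = new BitArray(bitCount);
        _hashes = hashes;
    }

    private int Probe(string key, int seed) =>
        (int)((uint)HashCode.Combine(key, seed) % (uint)_bits.Length);

    public void Add(string key)
    {
        for (int i = 0; i < _hashes; i++) _bits[Probe(key, i)] = true;
    }

    // false => key is definitely absent; true => key may be present
    public bool MightContain(string key)
    {
        for (int i = 0; i < _hashes; i++)
            if (!_bits[Probe(key, i)]) return false;
        return true;
    }
}
```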

Performance Characteristics

The LSM-Tree engine excels at:

Sequential Writes: Writes go to memory (MemTable) and are flushed sequentially. No random I/O for writes means extremely high write throughput.

Bulk Inserts: Loading large amounts of data is very fast. The sequential nature of writes saturates disk bandwidth efficiently.

Write-Heavy Workloads: Systems that write much more than they read (logging, events, metrics) benefit greatly from LSM-Tree's design.

Space Reclamation: Compaction efficiently removes deleted and old data, which is important for time-series data with retention policies.

When LSM-Tree Excels

The LSM-Tree engine is the best choice for:

Logging and Event Systems: Applications that continuously write events, logs, or metrics. The write path is optimized for this pattern.

Time-Series Data: IoT sensors, application metrics, financial tick data. LSM-Tree handles the continuous stream of writes efficiently.

Event Sourcing: Systems that store every change as an immutable event. LSM-Tree's append-only nature aligns well with this pattern.

Bulk Data Loading: ETL jobs, data migrations, or any scenario where you're loading large amounts of data quickly.

Practical Example

// Create an LSM-Tree database for an event logging system
var db = new WitDatabaseBuilder()
    .WithLsmTree("./event-store", options =>
    {
        options.MemTableSizeLimit = 16 * 1024 * 1024;    // 16 MB - larger for high throughput
        options.BlockCacheSizeBytes = 256 * 1024 * 1024; // 256 MB cache
        options.BackgroundCompaction = true;
        options.EnableWal = true;
    })
    .Build();

var engine = new WitSqlEngine(db, ownsStore: true);

// Create event table
engine.Execute(@"
    CREATE TABLE Events (
        Id BIGINT PRIMARY KEY,
        Timestamp DATETIME NOT NULL,
        EventType VARCHAR(50),
        Payload TEXT,
        INDEX IX_Events_Timestamp (Timestamp)
    )");

// High-throughput event insertion
engine.Execute("BEGIN TRANSACTION");

var stmt = engine.Prepare(
    "INSERT INTO Events (Id, Timestamp, EventType, Payload) VALUES (@id, @ts, @type, @payload)");

for (int i = 0; i < 100000; i++)
{
    stmt.SetParameters(new Dictionary<string, object?>
    {
        ["id"] = i,
        ["ts"] = DateTime.UtcNow,
        ["type"] = "user.action",
        ["payload"] = $"{{\"action\": \"click\", \"target\": \"button-{i}\"}}"
    });
    stmt.Execute();
}

engine.Execute("COMMIT");

// Time-range query (benefits from Bloom filters)
var recentEvents = engine.Query(
    "SELECT * FROM Events WHERE Timestamp > @start ORDER BY Timestamp",
    new Dictionary<string, object?> { ["start"] = DateTime.UtcNow.AddHours(-1) });

LSM-Tree vs B+Tree Trade-offs

Aspect | LSM-Tree | B+Tree
Write latency | Lower (memory + sequential) | Higher (random I/O)
Read latency | Higher (multiple files) | Lower (single tree)
Write throughput | Much higher | Moderate
Read throughput | Good | Excellent
Space amplification | Higher (until compaction) | Lower
Write amplification | Lower | Higher
Predictability | Less (compaction spikes) | More consistent
Storage format | Directory with files | Single file

4. In-Memory Engine

The In-Memory engine stores all data in RAM with no disk persistence. It provides the fastest possible performance for scenarios where durability isn't required.

How In-Memory Works

The In-Memory engine is straightforward: data lives in RAM using efficient data structures, with no disk I/O:

[[Svg Src="./witdatabase-inmemory-store.svg" Alt="Witdatabase Inmemory Store"]]

Key characteristics:

  • All operations are in-memory — no disk latency
  • Data is lost when the database is closed or the process exits
  • Memory usage equals data size plus overhead
  • Supports the same API as persistent engines

Configuration

The In-Memory engine requires minimal configuration:

// Via connection string
"Data Source=:memory:"

// Via builder pattern (simple)
var db = WitDatabase.CreateInMemory();

// Via builder pattern (with options)
var db = new WitDatabaseBuilder()
    .WithMemoryStorage()
    .WithBTree()           // Use B+Tree structure in memory
    .WithTransactions()    // Transactions still work
    .Build();

You can also combine In-Memory with encryption for secure temporary storage:

// Encrypted in-memory database
var db = WitDatabase.CreateInMemory("my-password");

// Or via builder
var db = new WitDatabaseBuilder()
    .WithMemoryStorage()
    .WithBTree()
    .WithEncryption("my-password")
    .Build();

Performance Characteristics

The In-Memory engine provides the fastest possible operations:

Operation | Typical Latency | Notes
Point read (Get) | < 1 µs | Direct memory access
Point write (Put) | < 1 µs | No disk sync
Range scan | < 1 µs per key | Sequential memory access
Bulk insert (100K) | < 100 ms | Memory bandwidth limited

Comparison with persistent engines (relative performance):

Operation | In-Memory | B+Tree (File) | LSM-Tree
Random reads | 1x (baseline) | 10-100x slower | 20-200x slower
Random writes | 1x (baseline) | 50-500x slower | 5-50x slower
Sequential scan | 1x (baseline) | 5-20x slower | 10-50x slower

Note: Actual performance depends on whether file-based engines have data cached.

Use Cases

The In-Memory engine is ideal for:

Unit Testing: Tests run faster without disk I/O, and each test gets a fresh database automatically.

[Test]
public void TestUserCreation()
{
    // Fresh in-memory database for each test
    using var db = WitDatabase.CreateInMemory();
    using var engine = new WitSqlEngine(db, ownsStore: true);
    
    engine.Execute("CREATE TABLE Users (Id INT PRIMARY KEY, Name VARCHAR(100))");
    engine.Execute("INSERT INTO Users VALUES (1, 'Test User')");
    
    var user = engine.QueryFirstOrDefault("SELECT * FROM Users WHERE Id = 1");
    Assert.IsNotNull(user);
    Assert.AreEqual("Test User", user["Name"].AsString());
}
// Database automatically disposed after test

Integration Testing: Test your data access layer with a real database engine but without filesystem dependencies.

public class TestFixture
{
    private WitDatabase _db;
    private WitSqlEngine _engine;
    
    [SetUp]
    public void Setup()
    {
        _db = WitDatabase.CreateInMemory();
        _engine = new WitSqlEngine(_db, ownsStore: false);
        
        // Apply schema
        _engine.Execute(File.ReadAllText("schema.sql"));
        
        // Load test data
        _engine.Execute(File.ReadAllText("test-data.sql"));
    }
    
    [TearDown]
    public void Teardown()
    {
        _engine.Dispose();
        _db.Dispose();
    }
}

Caching Layer: Use as a high-performance cache with SQL query capabilities.

public class SqlCache
{
    private readonly WitSqlEngine _engine;
    
    public SqlCache()
    {
        var db = WitDatabase.CreateInMemory();
        _engine = new WitSqlEngine(db, ownsStore: true);
        
        _engine.Execute(@"
            CREATE TABLE Cache (
                Key VARCHAR(255) PRIMARY KEY,
                Value TEXT,
                ExpiresAt DATETIME
            )");
    }
    
    public void Set(string key, string value, TimeSpan ttl)
    {
        _engine.ExecuteNonQuery(@"
            INSERT OR REPLACE INTO Cache (Key, Value, ExpiresAt) 
            VALUES (@key, @value, @expires)",
            new Dictionary<string, object?>
            {
                ["key"] = key,
                ["value"] = value,
                ["expires"] = DateTime.UtcNow.Add(ttl)
            });
    }
    
    public string? Get(string key)
    {
        var row = _engine.QueryFirstOrDefault(@"
            SELECT Value FROM Cache 
            WHERE Key = @key AND ExpiresAt > @now",
            new Dictionary<string, object?>
            {
                ["key"] = key,
                ["now"] = DateTime.UtcNow
            });
        
        return row?["Value"].AsStringOrNull();
    }
}

Temporary Processing: Hold intermediate results during complex operations.

public class ReportGenerator
{
    public Report GenerateReport(IEnumerable<RawData> data)
    {
        using var db = WitDatabase.CreateInMemory();
        using var engine = new WitSqlEngine(db, ownsStore: true);
        
        // Create temp tables
        engine.Execute("CREATE TABLE TempData (...)");
        engine.Execute("CREATE TABLE TempAggregates (...)");
        
        // Load and transform data
        var stmt = engine.Prepare("INSERT INTO TempData ...");
        foreach (var item in data)
        {
            stmt.Execute(/* params */);
        }
        
        // Run complex aggregations with SQL
        engine.Execute(@"
            INSERT INTO TempAggregates
            SELECT Category, SUM(Amount), AVG(Price)
            FROM TempData
            GROUP BY Category");
        
        // Build report from aggregates
        var results = engine.Query("SELECT * FROM TempAggregates");
        return BuildReport(results);
    }
    // In-memory database disposed, memory freed
}

Memory Considerations

When using the In-Memory engine, keep in mind:

Memory Usage: Data plus overhead (indexes, structure). A database with 1 GB of raw data might use 1.2-1.5 GB of memory.

No Persistence: All data is lost on dispose. If you need to preserve data, export it first or use a persistent engine.

Garbage Collection: Large in-memory databases can impact .NET garbage collection. Consider using GC.Collect() after disposing if memory pressure is a concern.

Concurrency: In-memory operations are fast, but thread safety rules still apply. Use transactions for consistency.

Limitations

The In-Memory engine has some limitations compared to persistent engines:

Feature | In-Memory | Persistent (B+Tree/LSM)
Data survives restart | ❌ No | ✅ Yes
Crash recovery | ❌ No | ✅ Yes
Data larger than RAM | ❌ No | ✅ Yes
Multiple processes | ❌ No | ✅ Yes (file locking)
Backup/Restore | ❌ Not directly | ✅ Yes

5. Choosing the Right Engine

Selecting the right storage engine is one of the most important decisions for your application's performance. This section provides a decision framework to help you choose.

Decision Flowchart

[[Svg Src="./witdatabase-storage-engine-decision.svg" Alt="Witdatabase Storage Engine Decision"]]

Workload Patterns

Different workloads have different characteristics. Match your workload to the right engine:

Workload Pattern | Description | Recommended Engine
OLTP | Many small transactions, mixed reads/writes | B+Tree
Logging | Continuous writes, occasional reads | LSM-Tree
Time-series | Timestamped data, range queries | LSM-Tree
Key-value cache | Fast lookups, moderate updates | B+Tree or In-Memory
Session storage | Short-lived data, fast access | In-Memory
Event sourcing | Append-only writes | LSM-Tree
Analytics | Large scans, complex queries | B+Tree
Testing | Isolated, fast, disposable | In-Memory

Real-World Scenarios

Here are concrete recommendations for common application types:

Web Application (E-commerce, CRM, SaaS)

Choose: B+Tree

Rationale: Mixed read/write workload, need for consistent performance, transactional integrity. Users expect fast page loads (reads) and reliable checkout/updates (writes).

var db = new WitDatabaseBuilder()
    .WithFilePath("app.witdb")
    .WithBTree()
    .WithCacheSize(5000)      // Large cache for product catalog
    .WithTransactions()
    .WithMvcc()               // Handle concurrent users
    .Build();

Logging Service

Choose: LSM-Tree

Rationale: Write-heavy workload (logs arriving constantly), sequential writes, reads are less frequent (when investigating issues).

var db = new WitDatabaseBuilder()
    .WithLsmTree("./logs", options =>
    {
        options.MemTableSizeLimit = 32 * 1024 * 1024;  // 32 MB buffer
        options.BackgroundCompaction = true;
    })
    .Build();

IoT Data Collection

Choose: LSM-Tree

Rationale: Continuous sensor data streams, very high write throughput, time-range queries for analysis.

var db = new WitDatabaseBuilder()
    .WithLsmTree("./sensor-data", options =>
    {
        options.MemTableSizeLimit = 64 * 1024 * 1024;   // Large buffer
        options.BlockCacheSizeBytes = 256 * 1024 * 1024; // Cache for queries
    })
    .Build();

Desktop Application (Note-taking, Document Management)

Choose: B+Tree

Rationale: Single user, mixed operations, need for single-file deployment, reliable storage.

var appData = Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData);
var db = new WitDatabaseBuilder()
    .WithFilePath(Path.Combine(appData, "MyApp", "data.witdb"))
    .WithBTree()
    .WithEncryption(userPassword)
    .Build();

Test Suite

Choose: In-Memory

Rationale: Speed, isolation between tests, no cleanup needed, no filesystem dependencies.

[TestFixture]
public class RepositoryTests
{
    [Test]
    public async Task CreateUser_ShouldPersist()
    {
        await using var db = WitDatabase.CreateInMemory();
        var repo = new UserRepository(db);
        
        var user = await repo.CreateAsync("test@example.com");
        
        Assert.IsNotNull(user.Id);
    }
}

Real-time Analytics Dashboard

Choose: B+Tree with large cache

Rationale: Read-heavy (dashboard queries), periodic writes (data imports), complex aggregations benefit from consistent read performance.

var db = new WitDatabaseBuilder()
    .WithFilePath("analytics.witdb")
    .WithBTree()
    .WithCacheSize(20000)     // Very large cache
    .WithPageSize(8192)       // Larger pages for scans
    .Build();

Migration Between Engines

You can change storage engines, but it requires data migration:

// Export from B+Tree
using var sourceDb = WitDatabase.Open("source-btree.witdb");
using var sourceEngine = new WitSqlEngine(sourceDb, ownsStore: false);

// Create LSM-Tree destination
using var destDb = new WitDatabaseBuilder()
    .WithLsmTree("./dest-lsm")
    .Build();
using var destEngine = new WitSqlEngine(destDb, ownsStore: false);

// Copy schema
var tables = sourceEngine.Query("SELECT name FROM sqlite_master WHERE type='table'");
foreach (var table in tables)
{
    var tableName = table["name"].AsString();
    var createSql = sourceEngine.ExecuteScalar(
        $"SELECT sql FROM sqlite_master WHERE name = '{tableName}'").AsString();
    destEngine.Execute(createSql);
}

// Copy data table by table
foreach (var table in tables)
{
    var tableName = table["name"].AsString();
    var rows = sourceEngine.Query($"SELECT * FROM {tableName}");
    if (rows.Count > 0)
    {
        var columns = rows[0].ColumnNames;
        var insertSql =
            $"INSERT INTO {tableName} ({string.Join(", ", columns)}) " +
            $"VALUES ({string.Join(", ", columns.Select(c => $"@{c}"))})";
        using var stmt = destEngine.Prepare(insertSql);
        foreach (var row in rows)
        {
            var parameters = columns.ToDictionary(c => c, c => row[c].Value);
            stmt.Execute(parameters);
        }
    }
}

Hybrid Approaches

Sometimes you need different engines for different data:

// Main application data (B+Tree)
var mainDb = new WitDatabaseBuilder()
    .WithFilePath("app.witdb")
    .WithBTree()
    .Build();

// Audit log (LSM-Tree)
var auditDb = new WitDatabaseBuilder()
    .WithLsmTree("./audit-logs")
    .Build();

// Session cache (In-Memory)
var sessionDb = WitDatabase.CreateInMemory();

This hybrid approach gives you:

  • Fast, consistent access to main data (B+Tree)
  • High-throughput audit logging (LSM-Tree)
  • Ultra-fast session lookups (In-Memory)

6. Performance Comparison

Understanding the performance characteristics of each engine helps you set realistic expectations and optimize your application.

Benchmark Overview

These benchmarks were run on typical hardware (Intel Core i7, NVMe SSD, 16GB RAM). Your results may vary based on hardware, data patterns, and configuration.

Write Performance

Writing 100,000 key-value pairs (each ~100 bytes):

Engine | Time | Throughput | Notes
In-Memory | 45 ms | 2.2M ops/sec | Limited by CPU/memory bandwidth
LSM-Tree | 180 ms | 555K ops/sec | Sequential writes to MemTable
B+Tree | 850 ms | 117K ops/sec | Random I/O, page splits
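The throughput column follows directly from the timings, as a quick check shows:

```csharp
using System;

// Throughput is just operations divided by elapsed seconds.
static long OpsPerSec(int ops, double millis) => (long)(ops / (millis / 1000.0));

Console.WriteLine(OpsPerSec(100_000, 45));   // 2222222 — "2.2M ops/sec" (In-Memory)
Console.WriteLine(OpsPerSec(100_000, 180));  // 555555  — "555K ops/sec" (LSM-Tree)
Console.WriteLine(OpsPerSec(100_000, 850));  // 117647  — "117K ops/sec" (B+Tree)
```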

With WAL disabled (LSM-Tree) or transactions disabled (B+Tree), write performance improves but durability is lost:

Engine | With Durability | Without Durability
LSM-Tree | 180 ms | 120 ms
B+Tree | 850 ms | 350 ms

Read Performance

Reading 100,000 random keys (after data is loaded):

Engine | Cold Cache | Warm Cache | Notes
In-Memory | N/A | 25 ms | Always "warm"
B+Tree | 450 ms | 80 ms | Single tree traversal
LSM-Tree | 650 ms | 150 ms | Multiple files, Bloom filters help

The "cold cache" scenario is relevant when data doesn't fit in memory. With warm cache, B+Tree approaches In-Memory performance.

Range Scan Performance

Scanning 10,000 consecutive keys:

Engine | Time | Notes
In-Memory | 5 ms | Sequential memory access
B+Tree | 15 ms | Linked leaf nodes
LSM-Tree | 35 ms | Merge across SSTables

B+Tree excels at range scans because leaf nodes are linked. LSM-Tree must merge results from multiple files.

Bulk Insert Performance

Inserting 1 million records in a single transaction:

Engine | Time | Throughput
In-Memory | 2.1 s | 476K ops/sec
LSM-Tree | 4.5 s | 222K ops/sec
B+Tree | 18 s | 55K ops/sec

LSM-Tree's sequential write pattern shines for bulk loads. B+Tree's page splits become a bottleneck.

Memory Usage

Memory consumption for 1 million key-value pairs (~100 bytes each):

Engine | Data on Disk | Memory Usage | Notes
In-Memory | N/A | ~150 MB | Data + structure overhead
B+Tree | ~120 MB | 8-40 MB | Cache size configurable
LSM-Tree | ~130 MB | 70-150 MB | MemTable + Block Cache

Summary Table

Metric | B+Tree | LSM-Tree | In-Memory
Random write | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐
Sequential write | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐
Random read | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐
Range scan | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐
Memory efficiency | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐
Durability | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | —
Predictability | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐

7. Quick Reference

Engine Selection

Scenario | Engine | Reason
General purpose | B+Tree | Balanced performance
Write-heavy (>80% writes) | LSM-Tree | Optimized for writes
Read-heavy (>80% reads) | B+Tree | Fast point queries
Logging / Events | LSM-Tree | Sequential writes
Testing | In-Memory | Speed, isolation
Caching | In-Memory or B+Tree | Depends on persistence needs
Time-series | LSM-Tree | Continuous writes
OLTP | B+Tree | Mixed workload, consistency

Configuration Quick Reference

B+Tree Configuration:

// Connection string
"Data Source=app.witdb;Store=btree;Page Size=4096;Cache Size=2000"

// Builder pattern
var db = new WitDatabaseBuilder()
    .WithFilePath("app.witdb")
    .WithBTree()
    .WithPageSize(4096)
    .WithCacheSize(2000)
    .Build();

LSM-Tree Configuration:

// Connection string
"Data Source=./data;Store=lsm;MemTable Size=8388608;Block Cache Size=134217728"

// Builder pattern
var db = new WitDatabaseBuilder()
    .WithLsmTree("./data", options =>
    {
        options.MemTableSizeLimit = 8 * 1024 * 1024;
        options.BlockCacheSizeBytes = 128 * 1024 * 1024;
        options.BackgroundCompaction = true;
    })
    .Build();

In-Memory Configuration:

// Connection string
"Data Source=:memory:"

// Builder pattern
var db = WitDatabase.CreateInMemory();

// With encryption
var db = WitDatabase.CreateInMemory("password");

Parameter Reference

B+Tree Parameters:

Parameter | Default | Description
Page Size | 4096 | Page size in bytes (4096-65536)
Cache Size | 1000 | Number of pages to cache
Cache | clock | Cache algorithm (clock, lru)

LSM-Tree Parameters:

Parameter | Default | Description
MemTable Size | 4 MB | MemTable size before flush
Block Cache Size | 64 MB | Read cache size
Sync Writes | true | Sync WAL on each write
Background Compaction | true | Auto-compact SSTables
LSM WAL | true | Enable write-ahead log
LSM Block Size | 4096 | SSTable block size
LSM Compaction Trigger | 4 | L0 files before compaction

Storage Format

Engine | Format | Files
B+Tree | Single file | app.witdb
LSM-Tree | Directory | ./data/ with .sst, .wal files
In-Memory | None | No files

Feature Support

Feature | B+Tree | LSM-Tree | In-Memory
Transactions | ✅ | ✅ | ✅
MVCC | ✅ | ✅ | ✅
Encryption | ✅ | ✅ | ✅
Crash recovery | ✅ | ✅ | ❌
File locking | ✅ | ✅ | N/A
Blazor WASM | ❌ | ❌ | ✅