WitDatabase provides three storage engines optimized for different workloads. Choosing the right engine for your application can significantly impact performance. This guide explains how each engine works, when to use it, and how to configure it for optimal results.


1. Overview

What is a Storage Engine?

A storage engine is the component responsible for physically storing and retrieving data on disk (or in memory). It determines how keys and values are organized, how quickly operations complete, and how much disk space is used.

WitDatabase's architecture separates the storage engine from higher layers (SQL, transactions, encryption), allowing you to choose the best engine for your workload without changing your application code.

Available Engines

WitDatabase offers three storage engines:

Engine | Implementation | Optimized For | Storage Format
B+Tree | StoreBTree | General use, reads | Single file
LSM-Tree | StoreLsm | Write-heavy workloads | Directory with multiple files
In-Memory | StoreInMemory | Testing, caching | RAM only

Architecture

All storage engines implement the same IKeyValueStore interface, making them interchangeable:

[[Svg Src="./witdatabase-ikeyvaluestore-engines.svg" Alt="Witdatabase Ikeyvaluestore Engines"]]
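The contract itself is small. As a sketch of what it might look like (member names here are illustrative guesses, not WitDatabase's actual API), any type that exposes point operations plus an ordered scan can serve as an engine:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the shared contract. Member names are illustrative;
// string keys are used for brevity (a real store likely works on byte spans).
public interface IKeyValueStore : IDisposable
{
    string? Get(string key);
    void Put(string key, string value);
    bool Delete(string key);
    IEnumerable<KeyValuePair<string, string>> Scan(string fromKey); // ordered
}

// A toy engine behind the same interface: because higher layers depend only
// on the contract, engines can be swapped without touching application code.
public sealed class ToyStore : IKeyValueStore
{
    private readonly SortedDictionary<string, string> _map = new(StringComparer.Ordinal);

    public string? Get(string key) => _map.TryGetValue(key, out var v) ? v : null;
    public void Put(string key, string value) => _map[key] = value;
    public bool Delete(string key) => _map.Remove(key);
    public IEnumerable<KeyValuePair<string, string>> Scan(string fromKey) =>
        _map.Where(kv => string.CompareOrdinal(kv.Key, fromKey) >= 0);
    public void Dispose() { }
}
```

Swapping StoreBTree for StoreLsm is then a construction-time decision, not a code change.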

Key Characteristics Comparison

Before diving into details, here's a high-level comparison:

Characteristic | B+Tree | LSM-Tree | In-Memory
Read Performance | Excellent | Good | Fastest
Write Performance | Good | Excellent | Fastest
Range Scan | Excellent | Very Good | Excellent
Space Efficiency | Very Good | Good (needs compaction) | N/A
Memory Usage | Lower | Higher | Highest
Durability | Yes | Yes | No
Crash Recovery | Yes | Yes | No
Concurrency | Good | Very Good | Excellent

Quick Selection Guide

Choose your engine based on your primary workload:

Choose B+Tree when:

  • Your application has mixed read/write patterns
  • You need predictable, consistent performance
  • Random access (point queries) is common
  • You prefer simple single-file storage
  • This is a general-purpose application

Choose LSM-Tree when:

  • Writes significantly outnumber reads
  • You're building a logging or event system
  • Data arrives in streams or batches
  • You can tolerate slightly slower reads for much faster writes
  • Background processing (compaction) is acceptable

Choose In-Memory when:

  • Data doesn't need to survive application restart
  • You're running unit tests
  • You need a temporary cache or scratchpad
  • Maximum speed is required and data fits in RAM

2. B+Tree Engine

The B+Tree engine is the default and recommended choice for most applications. It provides excellent read performance with good write performance, making it ideal for general-purpose workloads.

How B+Tree Works

A B+Tree is a self-balancing tree structure where all data is stored in leaf nodes, and internal nodes contain only keys for navigation. This design provides efficient lookups, insertions, and range scans.

[[Svg Src="./witdatabase-btree-structure.svg" Alt="Witdatabase Btree Structure"]]

Key properties of B+Tree:

  • All values are stored in leaf nodes
  • Leaf nodes are linked for fast sequential access
  • Tree is always balanced (all leaves at same depth)
  • Logarithmic time complexity: O(log n) for all operations

Page-Based Storage

The B+Tree engine organizes data into fixed-size pages (default 4KB):

Page Type | Purpose
Header Page | Database metadata, root page pointer
Internal Pages | Navigation keys, child pointers
Leaf Pages | Actual key-value data
Overflow Pages | Large values that don't fit in leaf pages

When a page fills up, it splits into two pages and the tree rebalances. This keeps the tree height minimal (typically 3-4 levels for millions of records).
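The "3-4 levels" figure follows from the branching factor: with fanout f keys per internal page, the height grows as log base f of n. A quick sketch (the fanout of 100 is an assumed figure; real fanout depends on key and page size):

```csharp
using System;

// Levels needed to index n records with a given fanout (keys per internal
// page). A fanout of 100 is an assumed figure for small keys in a 4 KB page.
static int TreeHeight(long records, int fanout)
{
    int height = 1;
    long capacity = fanout;          // records reachable at this height
    while (capacity < records)
    {
        capacity *= fanout;
        height++;
    }
    return height;
}

Console.WriteLine(TreeHeight(1_000_000, 100));    // 3 levels for a million records
Console.WriteLine(TreeHeight(100_000_000, 100));  // 4 levels for a hundred million
```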

Configuration

Configure the B+Tree engine via connection string or builder pattern:

// Via connection string
"Data Source=app.witdb;Store=btree;Page Size=4096;Cache Size=2000"

// Via builder pattern
var db = new WitDatabaseBuilder()
    .WithFilePath("app.witdb")
    .WithBTree()
    .WithPageSize(4096)      // 4KB pages (default)
    .WithCacheSize(2000)     // 2000 pages in cache
    .Build();

Configuration Parameters

Parameter | Default | Range | Description
Page Size | 4096 | 4096-65536 | Page size in bytes. Larger pages can improve sequential read performance but use more memory.
Cache Size | 1000 | 1-100000 | Number of pages to cache in memory. More cache = faster reads, more memory.
Cache | clock | clock, lru | Cache eviction algorithm. Clock is better for concurrent access.
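For intuition, the clock algorithm can be sketched in a few lines: it approximates LRU with a single reference bit per frame, avoiding the list manipulation (and lock contention) of a true LRU. This is an illustrative toy, not the engine's actual cache:

```csharp
using System;

// Toy clock (second-chance) eviction over a fixed set of cache frames.
// Each frame gets a reference bit on access; on a miss the hand sweeps,
// clearing bits, and evicts the first frame whose bit is already clear.
public sealed class ClockCache
{
    private readonly int[] _pages;       // page id held by each frame, -1 = empty
    private readonly bool[] _referenced; // second-chance bits
    private int _hand;

    public ClockCache(int frames)
    {
        _pages = new int[frames];
        _referenced = new bool[frames];
        Array.Fill(_pages, -1);
    }

    // Returns true on a cache hit, false on a miss (after loading the page).
    public bool Access(int pageId)
    {
        for (int i = 0; i < _pages.Length; i++)
            if (_pages[i] == pageId) { _referenced[i] = true; return true; }

        while (_referenced[_hand])           // sweep: give second chances
        {
            _referenced[_hand] = false;
            _hand = (_hand + 1) % _pages.Length;
        }
        _pages[_hand] = pageId;              // evict victim, load new page
        _referenced[_hand] = true;
        _hand = (_hand + 1) % _pages.Length;
        return false;
    }
}
```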

Page Size Selection

Choosing the right page size affects performance:

Page Size | Best For | Trade-offs
4 KB | General use, small values | Default, balanced performance
8 KB | Medium-sized values | Good for documents, JSON
16 KB | Large values, sequential access | Higher memory per page
32-64 KB | Very large values, bulk reads | Significant memory overhead

As a rule of thumb, choose a page size where most of your values fit within a single page. Values larger than the page size use overflow pages, which require additional I/O.

Performance Characteristics

The B+Tree engine excels at:

Point Queries (Get by key): The tree structure ensures O(log n) lookups regardless of database size. With a million records and a typical tree height of 4, finding any key requires at most 4 page reads.

Range Scans: Leaf nodes are linked together, so scanning a range of keys is very efficient after finding the start position. This makes the B+Tree ideal for queries like "all orders from last week" or "users with names starting with 'A'".

Random Writes: While not as fast as LSM-Tree for sequential writes, B+Tree handles random inserts well. Each insert requires O(log n) operations to find the correct position and potentially split pages.

Space Efficiency: B+Tree uses space efficiently because data is stored in-place. There's no write amplification from compaction like in LSM-Tree. Deleted space is reused immediately.

When B+Tree Excels

The B+Tree engine is the best choice for:

OLTP Workloads: Applications with many small transactions that read and write individual records. Think e-commerce, CRM, or user management systems.

Read-Heavy Applications: When reads significantly outnumber writes. The balanced tree structure ensures consistent read performance.

Applications Requiring Predictable Performance: B+Tree provides consistent latency without the occasional spikes that can occur during LSM-Tree compaction.

Single-File Deployment: The entire database is contained in one file, simplifying deployment and backup.

Practical Example

// Create a B+Tree database optimized for a user management system
var db = new WitDatabaseBuilder()
    .WithFilePath("users.witdb")
    .WithBTree()
    .WithPageSize(8192)        // 8KB pages for JSON user profiles
    .WithCacheSize(5000)       // Cache hot user data
    .WithTransactions()
    .Build();

var engine = new WitSqlEngine(db, ownsStore: true);

// Create table and indexes
engine.Execute(@"
    CREATE TABLE Users (
        Id BIGINT PRIMARY KEY,
        Email VARCHAR(255) UNIQUE,
        Profile TEXT,
        CreatedAt DATETIME DEFAULT NOW()
    )");
engine.Execute("CREATE INDEX IX_Users_CreatedAt ON Users(CreatedAt)");

// Point query - O(log n), very fast
var user = engine.QueryFirstOrDefault(
    "SELECT * FROM Users WHERE Id = @id",
    new Dictionary<string, object?> { ["id"] = 12345 });

// Range scan - efficient with linked leaf nodes
var recentUsers = engine.Query(
    "SELECT * FROM Users WHERE CreatedAt > @date ORDER BY CreatedAt",
    new Dictionary<string, object?> { ["date"] = DateTime.Today.AddDays(-7) });

3. LSM-Tree Engine

The LSM-Tree (Log-Structured Merge-Tree) engine is optimized for write-heavy workloads. It achieves extremely fast writes by buffering data in memory and writing sequentially to disk, at the cost of slightly slower reads.

How LSM-Tree Works

LSM-Tree separates write and read paths to optimize for writes:

[[Svg Src="./witdatabase-lsm-write-path.svg" Alt="Witdatabase lsm write path"]]

Write process:

  1. Write is appended to WAL (for crash recovery)
  2. Write is inserted into MemTable (in-memory)
  3. When MemTable is full, it's flushed to disk as an SSTable
  4. Background compaction merges SSTables to maintain read performance

Read process:

  1. Check MemTable first (newest data)
  2. Check Bloom filters of SSTables (skip files that definitely don't have the key)
  3. Search SSTables from newest to oldest
  4. First match wins (newer data shadows older)
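The read path above can be sketched with plain dictionaries standing in for the MemTable and SSTables (illustrative only; a null value marks a tombstone):

```csharp
using System;
using System.Collections.Generic;

// Sketch of the LSM read path: check the MemTable first, then SSTables
// from newest to oldest; the first match wins, so newer data shadows older.
public static class LsmRead
{
    public static string? Get(
        SortedDictionary<string, string?> memTable,
        List<Dictionary<string, string?>> ssTablesNewestFirst,
        string key)
    {
        if (memTable.TryGetValue(key, out var v))
            return v;                        // newest data (null = deleted)

        foreach (var sst in ssTablesNewestFirst)
        {
            // A real engine consults each SSTable's Bloom filter here and
            // skips files that definitely don't contain the key.
            if (sst.TryGetValue(key, out var sv))
                return sv;
        }
        return null;                         // not found anywhere
    }
}
```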

Key Components

MemTable: An in-memory sorted data structure (typically a skip list or red-black tree) that buffers recent writes. Writes to MemTable are O(log n) but very fast since they're in memory.

SSTable (Sorted String Table): An immutable, sorted file containing key-value pairs. Once written, SSTables are never modified — they're only created by flushing MemTable or merging during compaction.

Write-Ahead Log (WAL): A sequential log of all writes. If the system crashes, the WAL is replayed to recover data that was in MemTable but not yet flushed.

Bloom Filter: A probabilistic data structure that quickly tells you if a key is definitely NOT in an SSTable. This dramatically reduces unnecessary disk reads.

Compaction: A background process that merges multiple SSTables into larger ones, removing duplicate and deleted keys. This reclaims space and maintains read performance.

Configuration

Configure the LSM-Tree engine for your workload:

// Via connection string
"Data Source=./logs;Store=lsm;MemTable Size=8388608;Block Cache Size=134217728"

// Via builder pattern
var db = new WitDatabaseBuilder()
    .WithLsmTree("./events", options =>
    {
        options.MemTableSizeLimit = 8 * 1024 * 1024;     // 8 MB MemTable
        options.EnableWal = true;                         // Enable WAL
        options.EnableBlockCache = true;                  // Enable read cache
        options.BlockCacheSizeBytes = 128 * 1024 * 1024; // 128 MB cache
        options.BackgroundCompaction = true;              // Auto-compact
        options.CompactionTrigger = 4;                    // Compact at 4 L0 files
    })
    .Build();

Configuration Parameters

Parameter | Default | Description
MemTable Size | 4 MB | Size before flushing to disk. Larger = fewer flushes, more memory
Block Cache Size | 64 MB | Read cache for SSTable blocks. Larger = faster reads
Sync Writes | true | Sync WAL on each write. Disable for speed (risk data loss)
Background Compaction | true | Automatic compaction. Disable for manual control
LSM WAL | true | Write-ahead logging. Disable only if durability isn't needed
LSM Block Size | 4096 | SSTable block size. Affects read granularity
LSM Compaction Trigger | 4 | Number of L0 files before compaction starts

Compaction

Compaction is the process of merging SSTables to maintain read performance and reclaim space:

[[Svg Src="./witdatabase-lsm-compaction.svg" Alt="Witdatabase lsm compaction"]]

What compaction does:

  • Merges overlapping key ranges
  • Removes deleted keys (tombstones)
  • Removes older versions of updated keys
  • Creates larger, more efficient files

Compaction trade-offs:

  • Uses CPU and disk I/O in the background
  • Temporarily uses extra disk space during merge
  • Can cause latency spikes if it falls behind

For most applications, leave Background Compaction = true. The engine handles scheduling automatically.
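Conceptually, the merge step can be sketched as follows, with plain dictionaries standing in for SSTables and null values for tombstones (illustrative, not the engine's implementation):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of what a compaction merge produces: overlapping SSTables are
// merged newest-first, older versions of a key are discarded, and
// tombstones (null values) are dropped from the output file.
public static class Compaction
{
    public static SortedDictionary<string, string> Merge(
        params Dictionary<string, string?>[] ssTablesNewestFirst)
    {
        var merged = new Dictionary<string, string?>();
        foreach (var sst in ssTablesNewestFirst)
            foreach (var (key, value) in sst)
                if (!merged.ContainsKey(key))
                    merged[key] = value;     // first (newest) version wins

        // Drop tombstones and emit one sorted output
        return new SortedDictionary<string, string>(
            merged.Where(kv => kv.Value != null)
                  .ToDictionary(kv => kv.Key, kv => kv.Value!),
            StringComparer.Ordinal);
    }
}
```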

Bloom Filters

Bloom filters are a key optimization that makes LSM-Tree reads much faster:

Query: Get("user:12345")

Without Bloom Filter:
  SSTable-001: Read index, search... not found ✗
  SSTable-002: Read index, search... not found ✗
  SSTable-003: Read index, search... FOUND! ✓
  (3 disk operations)

With Bloom Filter:
  SSTable-001: Bloom says "definitely no" → skip
  SSTable-002: Bloom says "definitely no" → skip
  SSTable-003: Bloom says "maybe yes" → search... FOUND! ✓
  (1 disk operation)

Bloom filters are built automatically when SSTables are created. They add minimal overhead (~1% of SSTable size) but can eliminate 99%+ of unnecessary disk reads.
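A minimal Bloom filter is just a bit array plus k hash probes. This sketch shows the core guarantee — an added key is never reported absent — using sizes chosen for illustration, not WitDatabase's actual parameters:

```csharp
using System;
using System.Collections;

// Minimal Bloom filter: k hash probes set/check bits in a fixed bit array.
// "Definitely not present" is exact; "maybe present" can be a false positive.
public sealed class BloomFilter
{
    private readonly BitArray _bits;
    private readonly int _hashes;

    public BloomFilter(int bitCount, int hashes)
    {
        _bits = new BitArray(bitCount);
        _hashes = hashes;
    }

    private int Probe(string key, int seed) =>
        (int)((uint)HashCode.Combine(key, seed) % (uint)_bits.Length);

    public void Add(string key)
    {
        for (int i = 0; i < _hashes; i++) _bits[Probe(key, i)] = true;
    }

    // false => key is definitely absent; true => key may be present
    public bool MightContain(string key)
    {
        for (int i = 0; i < _hashes; i++)
            if (!_bits[Probe(key, i)]) return false;
        return true;
    }
}
```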

Performance Characteristics

The LSM-Tree engine excels at:

Sequential Writes: Writes go to memory (MemTable) and are flushed sequentially. No random I/O for writes means extremely high write throughput.

Bulk Inserts: Loading large amounts of data is very fast. The sequential nature of writes saturates disk bandwidth efficiently.

Write-Heavy Workloads: Systems that write much more than they read (logging, events, metrics) benefit greatly from LSM-Tree's design.

Space Reclamation: Compaction efficiently removes deleted and old data, which is important for time-series data with retention policies.

When LSM-Tree Excels

The LSM-Tree engine is the best choice for:

Logging and Event Systems: Applications that continuously write events, logs, or metrics. The write path is optimized for this pattern.

Time-Series Data: IoT sensors, application metrics, financial tick data. LSM-Tree handles the continuous stream of writes efficiently.

Event Sourcing: Systems that store every change as an immutable event. LSM-Tree's append-only nature aligns well with this pattern.

Bulk Data Loading: ETL jobs, data migrations, or any scenario where you're loading large amounts of data quickly.

Practical Example

// Create an LSM-Tree database for an event logging system
var db = new WitDatabaseBuilder()
    .WithLsmTree("./event-store", options =>
    {
        options.MemTableSizeLimit = 16 * 1024 * 1024;    // 16 MB - larger for high throughput
        options.BlockCacheSizeBytes = 256 * 1024 * 1024; // 256 MB cache
        options.BackgroundCompaction = true;
        options.EnableWal = true;
    })
    .Build();

var engine = new WitSqlEngine(db, ownsStore: true);

// Create event table
engine.Execute(@"
    CREATE TABLE Events (
        Id BIGINT PRIMARY KEY,
        Timestamp DATETIME NOT NULL,
        EventType VARCHAR(50),
        Payload TEXT,
        INDEX IX_Events_Timestamp (Timestamp)
    )");

// High-throughput event insertion
engine.Execute("BEGIN TRANSACTION");

var stmt = engine.Prepare(
    "INSERT INTO Events (Id, Timestamp, EventType, Payload) VALUES (@id, @ts, @type, @payload)");

for (int i = 0; i < 100000; i++)
{
    stmt.SetParameters(new Dictionary<string, object?>
    {
        ["id"] = i,
        ["ts"] = DateTime.UtcNow,
        ["type"] = "user.action",
        ["payload"] = $"{{\"action\": \"click\", \"target\": \"button-{i}\"}}"
    });
    stmt.Execute();
}

engine.Execute("COMMIT");

// Time-range query (benefits from Bloom filters)
var recentEvents = engine.Query(
    "SELECT * FROM Events WHERE Timestamp > @start ORDER BY Timestamp",
    new Dictionary<string, object?> { ["start"] = DateTime.UtcNow.AddHours(-1) });

LSM-Tree vs B+Tree Trade-offs

Aspect | LSM-Tree | B+Tree
Write latency | Lower (memory + sequential) | Higher (random I/O)
Read latency | Higher (multiple files) | Lower (single tree)
Write throughput | Much higher | Moderate
Read throughput | Good | Excellent
Space amplification | Higher (until compaction) | Lower
Write amplification | Lower | Higher
Predictability | Less (compaction spikes) | More consistent
Storage format | Directory with files | Single file

4. In-Memory Engine

The In-Memory engine stores all data in RAM with no disk persistence. It provides the fastest possible performance for scenarios where durability isn't required.

How In-Memory Works

The In-Memory engine is straightforward: data lives in RAM using efficient data structures, with no disk I/O:

[[Svg Src="./witdatabase-inmemory-store.svg" Alt="Witdatabase Inmemory Store"]]

Key characteristics:

  • All operations are in-memory — no disk latency
  • Data is lost when the database is closed or the process exits
  • Memory usage equals data size plus overhead
  • Supports the same API as persistent engines

Configuration

The In-Memory engine requires minimal configuration:

// Via connection string
"Data Source=:memory:"

// Via builder pattern (simple)
var db = WitDatabase.CreateInMemory();

// Via builder pattern (with options)
var db = new WitDatabaseBuilder()
    .WithMemoryStorage()
    .WithBTree()           // Use B+Tree structure in memory
    .WithTransactions()    // Transactions still work
    .Build();

You can also combine In-Memory with encryption for secure temporary storage:

// Encrypted in-memory database
var db = WitDatabase.CreateInMemory("my-password");

// Or via builder
var db = new WitDatabaseBuilder()
    .WithMemoryStorage()
    .WithBTree()
    .WithEncryption("my-password")
    .Build();

Performance Characteristics

The In-Memory engine provides the fastest possible operations:

Operation | Typical Latency | Notes
Point read (Get) | < 1 µs | Direct memory access
Point write (Put) | < 1 µs | No disk sync
Range scan | < 1 µs per key | Sequential memory access
Bulk insert (100K) | < 100 ms | Memory bandwidth limited

Comparison with persistent engines (relative performance):

Operation | In-Memory | B+Tree (File) | LSM-Tree
Random reads | 1x (baseline) | 10-100x slower | 20-200x slower
Random writes | 1x (baseline) | 50-500x slower | 5-50x slower
Sequential scan | 1x (baseline) | 5-20x slower | 10-50x slower

Note: Actual performance depends on whether file-based engines have data cached.

Use Cases

The In-Memory engine is ideal for:

Unit Testing: Tests run faster without disk I/O, and each test gets a fresh database automatically.

[Test]
public void TestUserCreation()
{
    // Fresh in-memory database for each test
    using var db = WitDatabase.CreateInMemory();
    using var engine = new WitSqlEngine(db, ownsStore: true);
    
    engine.Execute("CREATE TABLE Users (Id INT PRIMARY KEY, Name VARCHAR(100))");
    engine.Execute("INSERT INTO Users VALUES (1, 'Test User')");
    
    var user = engine.QueryFirstOrDefault("SELECT * FROM Users WHERE Id = 1");
    Assert.IsNotNull(user);
    Assert.AreEqual("Test User", user["Name"].AsString());
}
// Database automatically disposed after test

Integration Testing: Test your data access layer with a real database engine but without filesystem dependencies.

public class TestFixture
{
    private WitDatabase _db;
    private WitSqlEngine _engine;
    
    [SetUp]
    public void Setup()
    {
        _db = WitDatabase.CreateInMemory();
        _engine = new WitSqlEngine(_db, ownsStore: false);
        
        // Apply schema
        _engine.Execute(File.ReadAllText("schema.sql"));
        
        // Load test data
        _engine.Execute(File.ReadAllText("test-data.sql"));
    }
    
    [TearDown]
    public void Teardown()
    {
        _engine.Dispose();
        _db.Dispose();
    }
}

Caching Layer: Use as a high-performance cache with SQL query capabilities.

public class SqlCache
{
    private readonly WitSqlEngine _engine;
    
    public SqlCache()
    {
        var db = WitDatabase.CreateInMemory();
        _engine = new WitSqlEngine(db, ownsStore: true);
        
        _engine.Execute(@"
            CREATE TABLE Cache (
                Key VARCHAR(255) PRIMARY KEY,
                Value TEXT,
                ExpiresAt DATETIME
            )");
    }
    
    public void Set(string key, string value, TimeSpan ttl)
    {
        _engine.ExecuteNonQuery(@"
            INSERT OR REPLACE INTO Cache (Key, Value, ExpiresAt) 
            VALUES (@key, @value, @expires)",
            new Dictionary<string, object?>
            {
                ["key"] = key,
                ["value"] = value,
                ["expires"] = DateTime.UtcNow.Add(ttl)
            });
    }
    
    public string? Get(string key)
    {
        var row = _engine.QueryFirstOrDefault(@"
            SELECT Value FROM Cache 
            WHERE Key = @key AND ExpiresAt > @now",
            new Dictionary<string, object?>
            {
                ["key"] = key,
                ["now"] = DateTime.UtcNow
            });
        
        return row?["Value"].AsStringOrNull();
    }
}

Temporary Processing: Hold intermediate results during complex operations.

public class ReportGenerator
{
    public Report GenerateReport(IEnumerable<RawData> data)
    {
        using var db = WitDatabase.CreateInMemory();
        using var engine = new WitSqlEngine(db, ownsStore: true);
        
        // Create temp tables
        engine.Execute("CREATE TABLE TempData (...)");
        engine.Execute("CREATE TABLE TempAggregates (...)");
        
        // Load and transform data
        var stmt = engine.Prepare("INSERT INTO TempData ...");
        foreach (var item in data)
        {
            stmt.Execute(/* params */);
        }
        
        // Run complex aggregations with SQL
        engine.Execute(@"
            INSERT INTO TempAggregates
            SELECT Category, SUM(Amount), AVG(Price)
            FROM TempData
            GROUP BY Category");
        
        // Build report from aggregates
        var results = engine.Query("SELECT * FROM TempAggregates");
        return BuildReport(results);
    }
    // In-memory database disposed, memory freed
}

Memory Considerations

When using the In-Memory engine, keep in mind:

Memory Usage: Data plus overhead (indexes, structure). A database with 1 GB of raw data might use 1.2-1.5 GB of memory.

No Persistence: All data is lost on dispose. If you need to preserve data, export it first or use a persistent engine.

Garbage Collection: Large in-memory databases can impact .NET garbage collection. Consider using GC.Collect() after disposing if memory pressure is a concern.

Concurrency: In-memory operations are fast, but thread safety rules still apply. Use transactions for consistency.

Limitations

The In-Memory engine has some limitations compared to persistent engines:

Feature | In-Memory | Persistent (B+Tree/LSM)
Data survives restart | ❌ No | ✅ Yes
Crash recovery | ❌ No | ✅ Yes
Data larger than RAM | ❌ No | ✅ Yes
Multiple processes | ❌ No | ✅ Yes (file locking)
Backup/Restore | ❌ Not directly | ✅ Yes

5. Choosing the Right Engine

Selecting the right storage engine is one of the most important decisions for your application's performance. This section provides a decision framework to help you choose.

Decision Flowchart

[[Svg Src="./witdatabase-storage-engine-decision.svg" Alt="Witdatabase Storage Engine Decision"]]

Workload Patterns

Different workloads have different characteristics. Match your workload to the right engine:

Workload Pattern | Description | Recommended Engine
OLTP | Many small transactions, mixed reads/writes | B+Tree
Logging | Continuous writes, occasional reads | LSM-Tree
Time-series | Timestamped data, range queries | LSM-Tree
Key-value cache | Fast lookups, moderate updates | B+Tree or In-Memory
Session storage | Short-lived data, fast access | In-Memory
Event sourcing | Append-only writes | LSM-Tree
Analytics | Large scans, complex queries | B+Tree
Testing | Isolated, fast, disposable | In-Memory

Real-World Scenarios

Here are concrete recommendations for common application types:

Web Application (E-commerce, CRM, SaaS)

Choose: B+Tree

Rationale: Mixed read/write workload, need for consistent performance, transactional integrity. Users expect fast page loads (reads) and reliable checkout/updates (writes).

var db = new WitDatabaseBuilder()
    .WithFilePath("app.witdb")
    .WithBTree()
    .WithCacheSize(5000)      // Large cache for product catalog
    .WithTransactions()
    .WithMvcc()               // Handle concurrent users
    .Build();

Logging Service

Choose: LSM-Tree

Rationale: Write-heavy workload (logs arriving constantly), sequential writes, reads are less frequent (when investigating issues).

var db = new WitDatabaseBuilder()
    .WithLsmTree("./logs", options =>
    {
        options.MemTableSizeLimit = 32 * 1024 * 1024;  // 32 MB buffer
        options.BackgroundCompaction = true;
    })
    .Build();

IoT Data Collection

Choose: LSM-Tree

Rationale: Continuous sensor data streams, very high write throughput, time-range queries for analysis.

var db = new WitDatabaseBuilder()
    .WithLsmTree("./sensor-data", options =>
    {
        options.MemTableSizeLimit = 64 * 1024 * 1024;   // Large buffer
        options.BlockCacheSizeBytes = 256 * 1024 * 1024; // Cache for queries
    })
    .Build();

Desktop Application (Note-taking, Document Management)

Choose: B+Tree

Rationale: Single user, mixed operations, need for single-file deployment, reliable storage.

var appData = Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData);
var db = new WitDatabaseBuilder()
    .WithFilePath(Path.Combine(appData, "MyApp", "data.witdb"))
    .WithBTree()
    .WithEncryption(userPassword)
    .Build();

Test Suite

Choose: In-Memory

Rationale: Speed, isolation between tests, no cleanup needed, no filesystem dependencies.

[TestFixture]
public class RepositoryTests
{
    [Test]
    public async Task CreateUser_ShouldPersist()
    {
        await using var db = WitDatabase.CreateInMemory();
        var repo = new UserRepository(db);
        
        var user = await repo.CreateAsync("test@example.com");
        
        Assert.IsNotNull(user.Id);
    }
}

Real-time Analytics Dashboard

Choose: B+Tree with large cache

Rationale: Read-heavy (dashboard queries), periodic writes (data imports), complex aggregations benefit from consistent read performance.

var db = new WitDatabaseBuilder()
    .WithFilePath("analytics.witdb")
    .WithBTree()
    .WithCacheSize(20000)     // Very large cache
    .WithPageSize(8192)       // Larger pages for scans
    .Build();

Migration Between Engines

You can change storage engines, but it requires data migration:

// Export from B+Tree
using var sourceDb = WitDatabase.Open("source-btree.witdb");
using var sourceEngine = new WitSqlEngine(sourceDb, ownsStore: false);

// Create LSM-Tree destination
using var destDb = new WitDatabaseBuilder()
    .WithLsmTree("./dest-lsm")
    .Build();
using var destEngine = new WitSqlEngine(destDb, ownsStore: false);

// Copy schema
var tables = sourceEngine.Query("SELECT name FROM sqlite_master WHERE type='table'");
foreach (var table in tables)
{
    var tableName = table["name"].AsString();
    var createSql = sourceEngine.ExecuteScalar(
        $"SELECT sql FROM sqlite_master WHERE name = '{tableName}'").AsString();
    destEngine.Execute(createSql);
}

// Copy data table by table
foreach (var table in tables)
{
    var tableName = table["name"].AsString();
    var rows = sourceEngine.Query($"SELECT * FROM {tableName}");
    if (rows.Count > 0)
    {
        var columns = rows[0].ColumnNames;
        var insertSql =
            $"INSERT INTO {tableName} ({string.Join(", ", columns)}) " +
            $"VALUES ({string.Join(", ", columns.Select(c => $"@{c}"))})";
        using var stmt = destEngine.Prepare(insertSql);
        foreach (var row in rows)
        {
            var parameters = columns.ToDictionary(c => c, c => row[c].Value);
            stmt.Execute(parameters);
        }
    }
}

Hybrid Approaches

Sometimes you need different engines for different data:

// Main application data (B+Tree)
var mainDb = new WitDatabaseBuilder()
    .WithFilePath("app.witdb")
    .WithBTree()
    .Build();

// Audit log (LSM-Tree)
var auditDb = new WitDatabaseBuilder()
    .WithLsmTree("./audit-logs")
    .Build();

// Session cache (In-Memory)
var sessionDb = WitDatabase.CreateInMemory();

This hybrid approach gives you:

  • Fast, consistent access to main data (B+Tree)
  • High-throughput audit logging (LSM-Tree)
  • Ultra-fast session lookups (In-Memory)

6. Performance Comparison

Understanding the performance characteristics of each engine helps you set realistic expectations and optimize your application.

Benchmark Overview

These benchmarks were run on typical hardware (Intel Core i7, NVMe SSD, 16GB RAM). Your results may vary based on hardware, data patterns, and configuration.

Write Performance

Writing 100,000 key-value pairs (each ~100 bytes):

Engine | Time | Throughput | Notes
In-Memory | 45 ms | 2.2M ops/sec | Limited by CPU/memory bandwidth
LSM-Tree | 180 ms | 555K ops/sec | Sequential writes to MemTable
B+Tree | 850 ms | 117K ops/sec | Random I/O, page splits
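The throughput column follows directly from the timings, as a quick check shows:

```csharp
using System;

// Throughput is just operations divided by elapsed seconds.
static long OpsPerSec(int ops, double millis) => (long)(ops / (millis / 1000.0));

Console.WriteLine(OpsPerSec(100_000, 45));   // 2222222 — "2.2M ops/sec" (In-Memory)
Console.WriteLine(OpsPerSec(100_000, 180));  // 555555  — "555K ops/sec" (LSM-Tree)
Console.WriteLine(OpsPerSec(100_000, 850));  // 117647  — "117K ops/sec" (B+Tree)
```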

With WAL disabled (LSM-Tree) or transactions disabled (B+Tree), write performance improves but durability is lost:

Engine | With Durability | Without Durability
LSM-Tree | 180 ms | 120 ms
B+Tree | 850 ms | 350 ms

Read Performance

Reading 100,000 random keys (after data is loaded):

Engine | Cold Cache | Warm Cache | Notes
In-Memory | N/A | 25 ms | Always "warm"
B+Tree | 450 ms | 80 ms | Single tree traversal
LSM-Tree | 650 ms | 150 ms | Multiple files, Bloom filters help

The "cold cache" scenario is relevant when data doesn't fit in memory. With warm cache, B+Tree approaches In-Memory performance.

Range Scan Performance

Scanning 10,000 consecutive keys:

Engine | Time | Notes
In-Memory | 5 ms | Sequential memory access
B+Tree | 15 ms | Linked leaf nodes
LSM-Tree | 35 ms | Merge across SSTables

B+Tree excels at range scans because leaf nodes are linked. LSM-Tree must merge results from multiple files.

Bulk Insert Performance

Inserting 1 million records in a single transaction:

Engine | Time | Throughput
In-Memory | 2.1 s | 476K ops/sec
LSM-Tree | 4.5 s | 222K ops/sec
B+Tree | 18 s | 55K ops/sec

LSM-Tree's sequential write pattern shines for bulk loads. B+Tree's page splits become a bottleneck.

Memory Usage

Memory consumption for 1 million key-value pairs (~100 bytes each):

Engine | Data on Disk | Memory Usage | Notes
In-Memory | N/A | ~150 MB | Data + structure overhead
B+Tree | ~120 MB | 8-40 MB | Cache size configurable
LSM-Tree | ~130 MB | 70-150 MB | MemTable + Block Cache

Summary Table

Metric | B+Tree | LSM-Tree | In-Memory
Random write | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐
Sequential write | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐
Random read | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐
Range scan | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐
Memory efficiency | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐
Durability | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | —
Predictability | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐

7. Quick Reference

Engine Selection

Scenario | Engine | Reason
General purpose | B+Tree | Balanced performance
Write-heavy (>80% writes) | LSM-Tree | Optimized for writes
Read-heavy (>80% reads) | B+Tree | Fast point queries
Logging / Events | LSM-Tree | Sequential writes
Testing | In-Memory | Speed, isolation
Caching | In-Memory or B+Tree | Depends on persistence needs
Time-series | LSM-Tree | Continuous writes
OLTP | B+Tree | Mixed workload, consistency

Configuration Quick Reference

B+Tree Configuration:

// Connection string
"Data Source=app.witdb;Store=btree;Page Size=4096;Cache Size=2000"

// Builder pattern
var db = new WitDatabaseBuilder()
    .WithFilePath("app.witdb")
    .WithBTree()
    .WithPageSize(4096)
    .WithCacheSize(2000)
    .Build();

LSM-Tree Configuration:

// Connection string
"Data Source=./data;Store=lsm;MemTable Size=8388608;Block Cache Size=134217728"

// Builder pattern
var db = new WitDatabaseBuilder()
    .WithLsmTree("./data", options =>
    {
        options.MemTableSizeLimit = 8 * 1024 * 1024;
        options.BlockCacheSizeBytes = 128 * 1024 * 1024;
        options.BackgroundCompaction = true;
    })
    .Build();

In-Memory Configuration:

// Connection string
"Data Source=:memory:"

// Builder pattern
var db = WitDatabase.CreateInMemory();

// With encryption
var db = WitDatabase.CreateInMemory("password");

Parameter Reference

B+Tree Parameters:

Parameter | Default | Description
Page Size | 4096 | Page size in bytes (4096-65536)
Cache Size | 1000 | Number of pages to cache
Cache | clock | Cache algorithm (clock, lru)

LSM-Tree Parameters:

Parameter | Default | Description
MemTable Size | 4 MB | MemTable size before flush
Block Cache Size | 64 MB | Read cache size
Sync Writes | true | Sync WAL on each write
Background Compaction | true | Auto-compact SSTables
LSM WAL | true | Enable write-ahead log
LSM Block Size | 4096 | SSTable block size
LSM Compaction Trigger | 4 | L0 files before compaction

Storage Format

Engine | Format | Files
B+Tree | Single file | app.witdb
LSM-Tree | Directory | ./data/ with .sst, .wal files
In-Memory | None | No files

Feature Support

Feature | B+Tree | LSM-Tree | In-Memory
Transactions | ✅ | ✅ | ✅
MVCC | ✅ | ✅ | ✅
Encryption | ✅ | ✅ | ✅
Crash recovery | ✅ | ✅ | ❌
File locking | ✅ | ✅ | N/A
Blazor WASM | ❌ | ❌ | ✅