WitDatabase provides three storage engines optimized for different workloads. Choosing the right engine for your application can significantly impact performance. This guide explains how each engine works, when to use it, and how to configure it for optimal results.
1. Overview
What is a Storage Engine?
A storage engine is the component responsible for physically storing and retrieving data on disk (or in memory). It determines how keys and values are organized, how quickly operations complete, and how much disk space is used.
WitDatabase's architecture separates the storage engine from higher layers (SQL, transactions, encryption), allowing you to choose the best engine for your workload without changing your application code.
Available Engines
WitDatabase offers three storage engines:
| Engine | Implementation | Optimized For | Storage Format |
|---|---|---|---|
| B+Tree | StoreBTree | General use, reads | Single file |
| LSM-Tree | StoreLsm | Write-heavy workloads | Directory with multiple files |
| In-Memory | StoreInMemory | Testing, caching | RAM only |
Architecture
All storage engines implement the same IKeyValueStore interface, making them interchangeable:
[[Svg Src="./witdatabase-ikeyvaluestore-engines.svg" Alt="Witdatabase Ikeyvaluestore Engines"]]
Key Characteristics Comparison
Before diving into details, here's a high-level comparison:
| Characteristic | B+Tree | LSM-Tree | In-Memory |
|---|---|---|---|
| Read Performance | Excellent | Good | Fastest |
| Write Performance | Good | Excellent | Fastest |
| Range Scan | Excellent | Very Good | Excellent |
| Space Efficiency | Very Good | Good (needs compaction) | N/A |
| Memory Usage | Lower | Higher | Highest |
| Durability | Yes | Yes | No |
| Crash Recovery | Yes | Yes | No |
| Concurrency | Good | Very Good | Excellent |
Quick Selection Guide
Choose your engine based on your primary workload:
Choose B+Tree when:
- Your application has mixed read/write patterns
- You need predictable, consistent performance
- Random access (point queries) is common
- You prefer simple single-file storage
- This is a general-purpose application
Choose LSM-Tree when:
- Writes significantly outnumber reads
- You're building a logging or event system
- Data arrives in streams or batches
- You can tolerate slightly slower reads for much faster writes
- Background processing (compaction) is acceptable
Choose In-Memory when:
- Data doesn't need to survive application restart
- You're running unit tests
- You need a temporary cache or scratchpad
- Maximum speed is required and data fits in RAM
2. B+Tree Engine
The B+Tree engine is the default and recommended choice for most applications. It provides excellent read performance with good write performance, making it ideal for general-purpose workloads.
How B+Tree Works
A B+Tree is a self-balancing tree structure where all data is stored in leaf nodes, and internal nodes contain only keys for navigation. This design provides efficient lookups, insertions, and range scans.
[[Svg Src="./witdatabase-btree-structure.svg" Alt="Witdatabase Btree Structure"]]
Key properties of B+Tree:
- All values are stored in leaf nodes
- Leaf nodes are linked for fast sequential access
- Tree is always balanced (all leaves at same depth)
- Logarithmic time complexity: O(log n) for all operations
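The logarithmic height bound is what keeps lookups cheap. A back-of-the-envelope sketch (illustrative Python, assuming a hypothetical fanout of ~100 keys per page) shows why millions of records need only a handful of page reads:

```python
def btree_height(num_keys: int, fanout: int = 100) -> int:
    """Levels of pages to traverse: each level multiplies capacity by the fanout."""
    height, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        height += 1
    return height

print(btree_height(1_000_000))    # 3 — three page reads reach any of a million keys
print(btree_height(100_000_000))  # 4 — height grows logarithmically with data size
```

This matches the typical 3-4 level heights mentioned below: multiplying the data a hundredfold adds only one level.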
Page-Based Storage
The B+Tree engine organizes data into fixed-size pages (default 4KB):
| Page Type | Purpose |
|---|---|
| Header Page | Database metadata, root page pointer |
| Internal Pages | Navigation keys, child pointers |
| Leaf Pages | Actual key-value data |
| Overflow Pages | Large values that don't fit in leaf pages |
When a page fills up, it splits into two pages and the tree rebalances. This keeps the tree height minimal (typically 3-4 levels for millions of records).
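The split step can be pictured with a minimal sketch (hypothetical and ignoring the on-disk layout): the full page's sorted keys divide in half, and the first key of the new right page is copied up into the parent as a separator:

```python
def split_leaf(keys: list) -> tuple[list, list, object]:
    """Split a full, sorted leaf page in half.
    The separator key guides navigation in the parent node."""
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]  # B+Tree copies the separator up; the leaf keeps it too

left, right, sep = split_leaf([10, 20, 30, 40])
print(left, right, sep)  # [10, 20] [30, 40] 30
```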
Configuration
Configure the B+Tree engine via connection string or builder pattern:
// Via connection string
"Data Source=app.witdb;Store=btree;Page Size=4096;Cache Size=2000"
// Via builder pattern
var db = new WitDatabaseBuilder()
.WithFilePath("app.witdb")
.WithBTree()
.WithPageSize(4096) // 4KB pages (default)
.WithCacheSize(2000) // 2000 pages in cache
.Build();
Configuration Parameters
| Parameter | Default | Range | Description |
|---|---|---|---|
| Page Size | 4096 | 4096-65536 | Page size in bytes. Larger pages can improve sequential read performance but use more memory. |
| Cache Size | 1000 | 1-100000 | Number of pages to cache in memory. More cache = faster reads, more memory. |
| Cache | clock | clock, lru | Cache eviction algorithm. Clock is better for concurrent access. |
Page Size Selection
Choosing the right page size affects performance:
| Page Size | Best For | Trade-offs |
|---|---|---|
| 4 KB | General use, small values | Default, balanced performance |
| 8 KB | Medium-sized values | Good for documents, JSON |
| 16 KB | Large values, sequential access | Higher memory per page |
| 32-64 KB | Very large values, bulk reads | Significant memory overhead |
As a rule of thumb, choose a page size where most of your values fit within a single page. Values larger than the page size use overflow pages, which require additional I/O.
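To make the rule of thumb concrete, here is a rough sketch of the extra I/O cost (illustrative only; the real on-page layout differs, since headers and keys also consume page space):

```python
import math

def extra_reads_for_value(value_bytes: int, page_size: int = 4096) -> int:
    """Rough count of overflow pages a value spills into (ignores per-page headers)."""
    if value_bytes <= page_size:
        return 0  # fits in the leaf page: no extra I/O
    return math.ceil((value_bytes - page_size) / page_size)

print(extra_reads_for_value(2_000))          # 0 — fits in a 4 KB page
print(extra_reads_for_value(10_000))         # 2 — two overflow pages, two extra reads
print(extra_reads_for_value(10_000, 16384))  # 0 — a larger page size avoids overflow
```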
Performance Characteristics
The B+Tree engine excels at:
Point Queries (Get by key): The tree structure ensures O(log n) lookups regardless of database size. With a million records and a typical tree height of 4, finding any key requires at most 4 page reads.
Range Scans: Leaf nodes are linked together, so scanning a range of keys is very efficient after finding the start position. This makes the B+Tree ideal for queries like "all orders from last week" or "users with names starting with 'A'".
Random Writes: While not as fast as LSM-Tree for sequential writes, B+Tree handles random inserts well. Each insert requires O(log n) operations to find the correct position and potentially split pages.
Space Efficiency: B+Tree uses space efficiently because data is updated in place. There's no write amplification from compaction as in an LSM-Tree, and space freed by deletions is reused immediately.
When B+Tree Excels
The B+Tree engine is the best choice for:
OLTP Workloads: Applications with many small transactions that read and write individual records. Think e-commerce, CRM, or user management systems.
Read-Heavy Applications: When reads significantly outnumber writes. The balanced tree structure ensures consistent read performance.
Applications Requiring Predictable Performance: B+Tree provides consistent latency without the occasional spikes that can occur during LSM-Tree compaction.
Single-File Deployment: The entire database is contained in one file, simplifying deployment and backup.
Practical Example
// Create a B+Tree database optimized for a user management system
var db = new WitDatabaseBuilder()
.WithFilePath("users.witdb")
.WithBTree()
.WithPageSize(8192) // 8KB pages for JSON user profiles
.WithCacheSize(5000) // Cache hot user data
.WithTransactions()
.Build();
var engine = new WitSqlEngine(db, ownsStore: true);
// Create table and indexes
engine.Execute(@"
CREATE TABLE Users (
Id BIGINT PRIMARY KEY,
Email VARCHAR(255) UNIQUE,
Profile TEXT,
CreatedAt DATETIME DEFAULT NOW()
)");
engine.Execute("CREATE INDEX IX_Users_CreatedAt ON Users(CreatedAt)");
// Point query - O(log n), very fast
var user = engine.QueryFirstOrDefault(
"SELECT * FROM Users WHERE Id = @id",
new Dictionary<string, object?> { ["id"] = 12345 });
// Range scan - efficient with linked leaf nodes
var recentUsers = engine.Query(
"SELECT * FROM Users WHERE CreatedAt > @date ORDER BY CreatedAt",
new Dictionary<string, object?> { ["date"] = DateTime.Today.AddDays(-7) });
3. LSM-Tree Engine
The LSM-Tree (Log-Structured Merge-Tree) engine is optimized for write-heavy workloads. It achieves extremely fast writes by buffering data in memory and writing sequentially to disk, at the cost of slightly slower reads.
How LSM-Tree Works
LSM-Tree separates write and read paths to optimize for writes:
[[Svg Src="./witdatabase-lsm-write-path.svg" Alt="Witdatabase lsm write path"]]
Write process:
- Write is appended to WAL (for crash recovery)
- Write is inserted into MemTable (in-memory)
- When MemTable is full, it's flushed to disk as an SSTable
- Background compaction merges SSTables to maintain read performance
Read process:
- Check MemTable first (newest data)
- Check Bloom filters of SSTables (skip files that definitely don't have the key)
- Search SSTables from newest to oldest
- First match wins (newer data shadows older)
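The read steps above can be sketched as follows (illustrative Python, not the actual engine; `None` stands in for a deletion tombstone):

```python
def lsm_get(key, memtable: dict, sstables: list):
    """sstables is ordered oldest-to-newest; the first (newest) match wins."""
    if key in memtable:
        return memtable[key]  # may be None, i.e. a tombstone for a deleted key
    for sstable in reversed(sstables):  # newest first: newer data shadows older
        if key in sstable:
            return sstable[key]
    return None  # not found anywhere

sstables = [{"a": 1, "b": 2}, {"a": 9}]  # second file is newer
print(lsm_get("a", {}, sstables))        # 9 — the newer version shadows a=1
print(lsm_get("b", {}, sstables))        # 2 — found in the older SSTable
```

A real engine consults each SSTable's Bloom filter before searching it, so most files are skipped without any disk access.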
Key Components
MemTable: An in-memory sorted data structure (typically a skip list or red-black tree) that buffers recent writes. Writes to MemTable are O(log n) but very fast since they're in memory.
SSTable (Sorted String Table): An immutable, sorted file containing key-value pairs. Once written, SSTables are never modified — they're only created by flushing MemTable or merging during compaction.
Write-Ahead Log (WAL): A sequential log of all writes. If the system crashes, the WAL is replayed to recover data that was in MemTable but not yet flushed.
Bloom Filter: A probabilistic data structure that quickly tells you if a key is definitely NOT in an SSTable. This dramatically reduces unnecessary disk reads.
Compaction: A background process that merges multiple SSTables into larger ones, removing duplicate and deleted keys. This reclaims space and maintains read performance.
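The WAL idea can be illustrated with a toy sketch (a hypothetical JSON-lines format, not WitDatabase's actual on-disk layout): every write appends to a sequential log, and replaying that log after a crash rebuilds the MemTable contents that were never flushed:

```python
import json
import os

def wal_append(path: str, key: str, value: str) -> None:
    """Append one write to the log and force it to disk before acknowledging."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"k": key, "v": value}) + "\n")
        f.flush()
        os.fsync(f.fileno())  # the "Sync Writes" trade-off: durable, but slower per write

def wal_replay(path: str) -> dict:
    """Rebuild the MemTable by replaying the log in order; later entries win."""
    memtable = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            memtable[record["k"]] = record["v"]
    return memtable
```

Because appends are strictly sequential, the durability guarantee costs far less than the random I/O a B+Tree pays per write.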
Configuration
Configure the LSM-Tree engine for your workload:
// Via connection string
"Data Source=./logs;Store=lsm;MemTable Size=8388608;Block Cache Size=134217728"
// Via builder pattern
var db = new WitDatabaseBuilder()
.WithLsmTree("./events", options =>
{
options.MemTableSizeLimit = 8 * 1024 * 1024; // 8 MB MemTable
options.EnableWal = true; // Enable WAL
options.EnableBlockCache = true; // Enable read cache
options.BlockCacheSizeBytes = 128 * 1024 * 1024; // 128 MB cache
options.BackgroundCompaction = true; // Auto-compact
options.CompactionTrigger = 4; // Compact at 4 L0 files
})
.Build();
Configuration Parameters
| Parameter | Default | Description |
|---|---|---|
| MemTable Size | 4 MB | Size before flushing to disk. Larger = fewer flushes, more memory |
| Block Cache Size | 64 MB | Read cache for SSTable blocks. Larger = faster reads |
| Sync Writes | true | Sync WAL on each write. Disable for speed (risk data loss) |
| Background Compaction | true | Automatic compaction. Disable for manual control |
| LSM WAL | true | Write-ahead logging. Disable only if durability isn't needed |
| LSM Block Size | 4096 | SSTable block size. Affects read granularity |
| LSM Compaction Trigger | 4 | Number of L0 files before compaction starts |
Compaction
Compaction is the process of merging SSTables to maintain read performance and reclaim space:
[[Svg Src="./witdatabase-lsm-compaction.svg" Alt="Witdatabase lsm compaction"]]
What compaction does:
- Merges overlapping key ranges
- Removes deleted keys (tombstones)
- Removes older versions of updated keys
- Creates larger, more efficient files
Compaction trade-offs:
- Uses CPU and disk I/O in the background
- Temporarily uses extra disk space during merge
- Can cause latency spikes if it falls behind
For most applications, leave Background Compaction = true. The engine handles scheduling automatically.
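The merge itself can be pictured with a simplified two-file sketch (real compaction streams many sorted files block by block; `None` marks a tombstone):

```python
def compact(older: dict, newer: dict) -> dict:
    """Merge two sorted SSTables: newer versions shadow older ones, and
    tombstones (None) plus the keys they delete are dropped from the output."""
    merged = {**older, **newer}  # dict update: the newer table shadows the older
    return {k: v for k, v in sorted(merged.items()) if v is not None}

older = {"a": 1, "b": 2, "c": 3}
newer = {"b": None, "c": 30}   # b was deleted, c was updated
print(compact(older, newer))   # {'a': 1, 'c': 30}
```

Note how the output is smaller than its inputs: this is where LSM-Tree reclaims the space that deleted and overwritten keys were holding.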
Bloom Filters
Bloom filters are a key optimization that makes LSM-Tree reads much faster:
Query: Get("user:12345")
Without Bloom Filter:
SSTable-001: Read index, search... not found ✗
SSTable-002: Read index, search... not found ✗
SSTable-003: Read index, search... FOUND! ✓
(3 disk operations)
With Bloom Filter:
SSTable-001: Bloom says "definitely no" → skip
SSTable-002: Bloom says "definitely no" → skip
SSTable-003: Bloom says "maybe yes" → search... FOUND! ✓
(1 disk operation)
Bloom filters are built automatically when SSTables are created. They add minimal overhead (~1% of SSTable size) but can eliminate 99%+ of unnecessary disk reads.
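The principle can be demonstrated with a minimal Bloom filter (an illustrative sketch, not the engine's implementation): k hash functions set k bits per key, and if any of those bits is unset on lookup, the key is definitely absent:

```python
import hashlib

class BloomFilter:
    def __init__(self, m_bits: int = 1024, k_hashes: int = 3):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits // 8)

    def _positions(self, key: str):
        for i in range(self.k):  # derive k independent bit positions per key
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely not present"; True means "maybe, go search"
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

bf = BloomFilter()
bf.add("user:12345")
print(bf.might_contain("user:12345"))  # True — must search the SSTable
print(bf.might_contain("user:99999"))  # almost always False: skip the file entirely
```

False positives merely cause a wasted search; false negatives never happen, so skipping a file is always safe.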
Performance Characteristics
The LSM-Tree engine excels at:
Sequential Writes: Writes go to memory (MemTable) and are flushed sequentially. No random I/O for writes means extremely high write throughput.
Bulk Inserts: Loading large amounts of data is very fast. The sequential nature of writes saturates disk bandwidth efficiently.
Write-Heavy Workloads: Systems that write much more than they read (logging, events, metrics) benefit greatly from LSM-Tree's design.
Space Reclamation: Compaction efficiently removes deleted and old data, which is important for time-series data with retention policies.
When LSM-Tree Excels
The LSM-Tree engine is the best choice for:
Logging and Event Systems: Applications that continuously write events, logs, or metrics. The write path is optimized for this pattern.
Time-Series Data: IoT sensors, application metrics, financial tick data. LSM-Tree handles the continuous stream of writes efficiently.
Event Sourcing: Systems that store every change as an immutable event. LSM-Tree's append-only nature aligns well with this pattern.
Bulk Data Loading: ETL jobs, data migrations, or any scenario where you're loading large amounts of data quickly.
Practical Example
// Create an LSM-Tree database for an event logging system
var db = new WitDatabaseBuilder()
.WithLsmTree("./event-store", options =>
{
options.MemTableSizeLimit = 16 * 1024 * 1024; // 16 MB - larger for high throughput
options.BlockCacheSizeBytes = 256 * 1024 * 1024; // 256 MB cache
options.BackgroundCompaction = true;
options.EnableWal = true;
})
.Build();
var engine = new WitSqlEngine(db, ownsStore: true);
// Create event table
engine.Execute(@"
CREATE TABLE Events (
Id BIGINT PRIMARY KEY,
Timestamp DATETIME NOT NULL,
EventType VARCHAR(50),
Payload TEXT,
INDEX IX_Events_Timestamp (Timestamp)
)");
// High-throughput event insertion
engine.Execute("BEGIN TRANSACTION");
var stmt = engine.Prepare(
"INSERT INTO Events (Id, Timestamp, EventType, Payload) VALUES (@id, @ts, @type, @payload)");
for (int i = 0; i < 100000; i++)
{
stmt.SetParameters(new Dictionary<string, object?>
{
["id"] = i,
["ts"] = DateTime.UtcNow,
["type"] = "user.action",
["payload"] =
Loading...
quot;{{\"action\": \"click\", \"target\": \"button-{i}\"}}"
});
stmt.Execute();
}
engine.Execute("COMMIT");
// Time-range query (benefits from Bloom filters)
var recentEvents = engine.Query(
"SELECT * FROM Events WHERE Timestamp > @start ORDER BY Timestamp",
new Dictionary<string, object?> { ["start"] = DateTime.UtcNow.AddHours(-1) });
LSM-Tree vs B+Tree Trade-offs
| Aspect | LSM-Tree | B+Tree |
|---|---|---|
| Write latency | Lower (memory + sequential) | Higher (random I/O) |
| Read latency | Higher (multiple files) | Lower (single tree) |
| Write throughput | Much higher | Moderate |
| Read throughput | Good | Excellent |
| Space amplification | Higher (until compaction) | Lower |
| Write amplification | Lower | Higher |
| Predictability | Less (compaction spikes) | More consistent |
| Storage format | Directory with files | Single file |
4. In-Memory Engine
The In-Memory engine stores all data in RAM with no disk persistence. It provides the fastest possible performance for scenarios where durability isn't required.
How In-Memory Works
The In-Memory engine is straightforward: data lives in RAM using efficient data structures, with no disk I/O:
[[Svg Src="./witdatabase-inmemory-store.svg" Alt="Witdatabase Inmemory Store"]]
Key characteristics:
- All operations are in-memory — no disk latency
- Data is lost when the database is closed or the process exits
- Memory usage equals data size plus overhead
- Supports the same API as persistent engines
Configuration
The In-Memory engine requires minimal configuration:
// Via connection string
"Data Source=:memory:"
// Via builder pattern (simple)
var db = WitDatabase.CreateInMemory();
// Via builder pattern (with options)
var db = new WitDatabaseBuilder()
.WithMemoryStorage()
.WithBTree() // Use B+Tree structure in memory
.WithTransactions() // Transactions still work
.Build();
You can also combine In-Memory with encryption for secure temporary storage:
// Encrypted in-memory database
var db = WitDatabase.CreateInMemory("my-password");
// Or via builder
var db = new WitDatabaseBuilder()
.WithMemoryStorage()
.WithBTree()
.WithEncryption("my-password")
.Build();
Performance Characteristics
The In-Memory engine provides the fastest possible operations:
| Operation | Typical Latency | Notes |
|---|---|---|
| Point read (Get) | < 1 µs | Direct memory access |
| Point write (Put) | < 1 µs | No disk sync |
| Range scan | < 1 µs per key | Sequential memory access |
| Bulk insert (100K) | < 100 ms | Memory bandwidth limited |
Comparison with persistent engines (relative performance):
| Operation | In-Memory | B+Tree (File) | LSM-Tree |
|---|---|---|---|
| Random reads | 1x (baseline) | 10-100x slower | 20-200x slower |
| Random writes | 1x (baseline) | 50-500x slower | 5-50x slower |
| Sequential scan | 1x (baseline) | 5-20x slower | 10-50x slower |
Note: Actual performance depends on whether file-based engines have data cached.
Use Cases
The In-Memory engine is ideal for:
Unit Testing: Tests run faster without disk I/O, and each test gets a fresh database automatically.
[Test]
public void TestUserCreation()
{
// Fresh in-memory database for each test
using var db = WitDatabase.CreateInMemory();
using var engine = new WitSqlEngine(db, ownsStore: true);
engine.Execute("CREATE TABLE Users (Id INT PRIMARY KEY, Name VARCHAR(100))");
engine.Execute("INSERT INTO Users VALUES (1, 'Test User')");
var user = engine.QueryFirstOrDefault("SELECT * FROM Users WHERE Id = 1");
Assert.IsNotNull(user);
Assert.AreEqual("Test User", user["Name"].AsString());
}
// Database automatically disposed after test
Integration Testing: Test your data access layer with a real database engine but without filesystem dependencies.
public class TestFixture
{
private WitDatabase _db;
private WitSqlEngine _engine;
[SetUp]
public void Setup()
{
_db = WitDatabase.CreateInMemory();
_engine = new WitSqlEngine(_db, ownsStore: false);
// Apply schema
_engine.Execute(File.ReadAllText("schema.sql"));
// Load test data
_engine.Execute(File.ReadAllText("test-data.sql"));
}
[TearDown]
public void Teardown()
{
_engine.Dispose();
_db.Dispose();
}
}
Caching Layer: Use as a high-performance cache with SQL query capabilities.
public class SqlCache
{
private readonly WitSqlEngine _engine;
public SqlCache()
{
var db = WitDatabase.CreateInMemory();
_engine = new WitSqlEngine(db, ownsStore: true);
_engine.Execute(@"
CREATE TABLE Cache (
Key VARCHAR(255) PRIMARY KEY,
Value TEXT,
ExpiresAt DATETIME
)");
}
public void Set(string key, string value, TimeSpan ttl)
{
_engine.ExecuteNonQuery(@"
INSERT OR REPLACE INTO Cache (Key, Value, ExpiresAt)
VALUES (@key, @value, @expires)",
new Dictionary<string, object?>
{
["key"] = key,
["value"] = value,
["expires"] = DateTime.UtcNow.Add(ttl)
});
}
public string? Get(string key)
{
var row = _engine.QueryFirstOrDefault(@"
SELECT Value FROM Cache
WHERE Key = @key AND ExpiresAt > @now",
new Dictionary<string, object?>
{
["key"] = key,
["now"] = DateTime.UtcNow
});
return row?["Value"].AsStringOrNull();
}
}
Temporary Processing: Hold intermediate results during complex operations.
public class ReportGenerator
{
public Report GenerateReport(IEnumerable<RawData> data)
{
using var db = WitDatabase.CreateInMemory();
using var engine = new WitSqlEngine(db, ownsStore: true);
// Create temp tables
engine.Execute("CREATE TABLE TempData (...)");
engine.Execute("CREATE TABLE TempAggregates (...)");
// Load and transform data
var stmt = engine.Prepare("INSERT INTO TempData ...");
foreach (var item in data)
{
stmt.Execute(/* params */);
}
// Run complex aggregations with SQL
engine.Execute(@"
INSERT INTO TempAggregates
SELECT Category, SUM(Amount), AVG(Price)
FROM TempData
GROUP BY Category");
// Build report from aggregates
var results = engine.Query("SELECT * FROM TempAggregates");
return BuildReport(results);
}
// In-memory database disposed, memory freed
}
Memory Considerations
When using the In-Memory engine, keep in mind:
Memory Usage: Data plus overhead (indexes, structure). A database with 1 GB of raw data might use 1.2-1.5 GB of memory.
No Persistence: All data is lost on dispose. If you need to preserve data, export it first or use a persistent engine.
Garbage Collection: Large in-memory databases can impact .NET garbage collection. Consider using GC.Collect() after disposing if memory pressure is a concern.
Concurrency: In-memory operations are fast, but thread safety rules still apply. Use transactions for consistency.
Limitations
The In-Memory engine has some limitations compared to persistent engines:
| Feature | In-Memory | Persistent (B+Tree/LSM) |
|---|---|---|
| Data survives restart | ❌ No | ✅ Yes |
| Crash recovery | ❌ No | ✅ Yes |
| Data larger than RAM | ❌ No | ✅ Yes |
| Multiple processes | ❌ No | ✅ Yes (file locking) |
| Backup/Restore | ❌ Not directly | ✅ Yes |
5. Choosing the Right Engine
Selecting the right storage engine is one of the most important decisions for your application's performance. This section provides a decision framework to help you choose.
Decision Flowchart
[[Svg Src="./witdatabase-storage-engine-decision.svg" Alt="Witdatabase Storage Engine Decision"]]
Workload Patterns
Different workloads have different characteristics. Match your workload to the right engine:
| Workload Pattern | Description | Recommended Engine |
|---|---|---|
| OLTP | Many small transactions, mixed reads/writes | B+Tree |
| Logging | Continuous writes, occasional reads | LSM-Tree |
| Time-series | Timestamped data, range queries | LSM-Tree |
| Key-value cache | Fast lookups, moderate updates | B+Tree or In-Memory |
| Session storage | Short-lived data, fast access | In-Memory |
| Event sourcing | Append-only writes | LSM-Tree |
| Analytics | Large scans, complex queries | B+Tree |
| Testing | Isolated, fast, disposable | In-Memory |
Real-World Scenarios
Here are concrete recommendations for common application types:
Web Application (E-commerce, CRM, SaaS)
Choose: B+Tree
Rationale: Mixed read/write workload, need for consistent performance, transactional integrity. Users expect fast page loads (reads) and reliable checkout/updates (writes).
var db = new WitDatabaseBuilder()
.WithFilePath("app.witdb")
.WithBTree()
.WithCacheSize(5000) // Large cache for product catalog
.WithTransactions()
.WithMvcc() // Handle concurrent users
.Build();
Logging Service
Choose: LSM-Tree
Rationale: Write-heavy workload (logs arriving constantly), sequential writes, reads are less frequent (when investigating issues).
var db = new WitDatabaseBuilder()
.WithLsmTree("./logs", options =>
{
options.MemTableSizeLimit = 32 * 1024 * 1024; // 32 MB buffer
options.BackgroundCompaction = true;
})
.Build();
IoT Data Collection
Choose: LSM-Tree
Rationale: Continuous sensor data streams, very high write throughput, time-range queries for analysis.
var db = new WitDatabaseBuilder()
.WithLsmTree("./sensor-data", options =>
{
options.MemTableSizeLimit = 64 * 1024 * 1024; // Large buffer
options.BlockCacheSizeBytes = 256 * 1024 * 1024; // Cache for queries
})
.Build();
Desktop Application (Note-taking, Document Management)
Choose: B+Tree
Rationale: Single user, mixed operations, need for single-file deployment, reliable storage.
var appData = Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData);
var db = new WitDatabaseBuilder()
.WithFilePath(Path.Combine(appData, "MyApp", "data.witdb"))
.WithBTree()
.WithEncryption(userPassword)
.Build();
Test Suite
Choose: In-Memory
Rationale: Speed, isolation between tests, no cleanup needed, no filesystem dependencies.
[TestFixture]
public class RepositoryTests
{
[Test]
public async Task CreateUser_ShouldPersist()
{
await using var db = WitDatabase.CreateInMemory();
var repo = new UserRepository(db);
var user = await repo.CreateAsync("test@example.com");
Assert.IsNotNull(user.Id);
}
}
Real-time Analytics Dashboard
Choose: B+Tree with large cache
Rationale: Read-heavy (dashboard queries), periodic writes (data imports), complex aggregations benefit from consistent read performance.
var db = new WitDatabaseBuilder()
.WithFilePath("analytics.witdb")
.WithBTree()
.WithCacheSize(20000) // Very large cache
.WithPageSize(8192) // Larger pages for scans
.Build();
Migration Between Engines
You can change storage engines, but it requires data migration:
// Export from B+Tree
using var sourceDb = WitDatabase.Open("source-btree.witdb");
using var sourceEngine = new WitSqlEngine(sourceDb, ownsStore: false);
// Create LSM-Tree destination
using var destDb = new WitDatabaseBuilder()
.WithLsmTree("./dest-lsm")
.Build();
using var destEngine = new WitSqlEngine(destDb, ownsStore: false);
// Copy schema
var tables = sourceEngine.Query("SELECT name FROM sqlite_master WHERE type='table'");
foreach (var table in tables)
{
var tableName = table["name"].AsString();
var createSql = sourceEngine.ExecuteScalar(
$"SELECT sql FROM sqlite_master WHERE name = '{tableName}'").AsString();
destEngine.Execute(createSql);
}
// Copy data table by table
foreach (var table in tables)
{
var tableName = table["name"].AsString();
var rows = sourceEngine.Query(
$"SELECT * FROM {tableName}");
if (rows.Count > 0)
{
var columns = rows[0].ColumnNames;
var insertSql =
$"INSERT INTO {tableName} ({string.Join(", ", columns)}) VALUES ({string.Join(", ", columns.Select(c => $"@{c}"))})";
using var stmt = destEngine.Prepare(insertSql);
foreach (var row in rows)
{
var parameters = columns.ToDictionary(c => c, c => row[c].Value);
stmt.Execute(parameters);
}
}
}
Hybrid Approaches
Sometimes you need different engines for different data:
// Main application data (B+Tree)
var mainDb = new WitDatabaseBuilder()
.WithFilePath("app.witdb")
.WithBTree()
.Build();
// Audit log (LSM-Tree)
var auditDb = new WitDatabaseBuilder()
.WithLsmTree("./audit-logs")
.Build();
// Session cache (In-Memory)
var sessionDb = WitDatabase.CreateInMemory();
This hybrid approach gives you:
- Fast, consistent access to main data (B+Tree)
- High-throughput audit logging (LSM-Tree)
- Ultra-fast session lookups (In-Memory)
6. Performance Comparison
Understanding the performance characteristics of each engine helps you set realistic expectations and optimize your application.
Benchmark Overview
These benchmarks were run on typical hardware (Intel Core i7, NVMe SSD, 16GB RAM). Your results may vary based on hardware, data patterns, and configuration.
Write Performance
Writing 100,000 key-value pairs (each ~100 bytes):
| Engine | Time | Throughput | Notes |
|---|---|---|---|
| In-Memory | 45 ms | 2.2M ops/sec | Limited by CPU/memory bandwidth |
| LSM-Tree | 180 ms | 555K ops/sec | Sequential writes to MemTable |
| B+Tree | 850 ms | 117K ops/sec | Random I/O, page splits |
With WAL disabled (LSM-Tree) or transactions disabled (B+Tree), write performance improves but durability is lost:
| Engine | With Durability | Without Durability |
|---|---|---|
| LSM-Tree | 180 ms | 120 ms |
| B+Tree | 850 ms | 350 ms |
Read Performance
Reading 100,000 random keys (after data is loaded):
| Engine | Cold Cache | Warm Cache | Notes |
|---|---|---|---|
| In-Memory | N/A | 25 ms | Always "warm" |
| B+Tree | 450 ms | 80 ms | Single tree traversal |
| LSM-Tree | 650 ms | 150 ms | Multiple files, Bloom filters help |
The "cold cache" scenario is relevant when data doesn't fit in memory. With warm cache, B+Tree approaches In-Memory performance.
Range Scan Performance
Scanning 10,000 consecutive keys:
| Engine | Time | Notes |
|---|---|---|
| In-Memory | 5 ms | Sequential memory access |
| B+Tree | 15 ms | Linked leaf nodes |
| LSM-Tree | 35 ms | Merge across SSTables |
B+Tree excels at range scans because leaf nodes are linked. LSM-Tree must merge results from multiple files.
Bulk Insert Performance
Inserting 1 million records in a single transaction:
| Engine | Time | Throughput |
|---|---|---|
| In-Memory | 2.1 s | 476K ops/sec |
| LSM-Tree | 4.5 s | 222K ops/sec |
| B+Tree | 18 s | 55K ops/sec |
LSM-Tree's sequential write pattern shines for bulk loads. B+Tree's page splits become a bottleneck.
Memory Usage
Memory consumption for 1 million key-value pairs (~100 bytes each):
| Engine | Data on Disk | Memory Usage | Notes |
|---|---|---|---|
| In-Memory | N/A | ~150 MB | Data + structure overhead |
| B+Tree | ~120 MB | 8-40 MB | Cache size configurable |
| LSM-Tree | ~130 MB | 70-150 MB | MemTable + Block Cache |
Summary Table
| Metric | B+Tree | LSM-Tree | In-Memory |
|---|---|---|---|
| Random write | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Sequential write | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Random read | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Range scan | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Memory efficiency | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Durability | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ❌ |
| Predictability | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
7. Quick Reference
Engine Selection
| Scenario | Engine | Reason |
|---|---|---|
| General purpose | B+Tree | Balanced performance |
| Write-heavy (>80% writes) | LSM-Tree | Optimized for writes |
| Read-heavy (>80% reads) | B+Tree | Fast point queries |
| Logging / Events | LSM-Tree | Sequential writes |
| Testing | In-Memory | Speed, isolation |
| Caching | In-Memory or B+Tree | Depends on persistence needs |
| Time-series | LSM-Tree | Continuous writes |
| OLTP | B+Tree | Mixed workload, consistency |
Configuration Quick Reference
B+Tree Configuration:
// Connection string
"Data Source=app.witdb;Store=btree;Page Size=4096;Cache Size=2000"
// Builder pattern
var db = new WitDatabaseBuilder()
.WithFilePath("app.witdb")
.WithBTree()
.WithPageSize(4096)
.WithCacheSize(2000)
.Build();
LSM-Tree Configuration:
// Connection string
"Data Source=./data;Store=lsm;MemTable Size=8388608;Block Cache Size=134217728"
// Builder pattern
var db = new WitDatabaseBuilder()
.WithLsmTree("./data", options =>
{
options.MemTableSizeLimit = 8 * 1024 * 1024;
options.BlockCacheSizeBytes = 128 * 1024 * 1024;
options.BackgroundCompaction = true;
})
.Build();
In-Memory Configuration:
// Connection string
"Data Source=:memory:"
// Builder pattern
var db = WitDatabase.CreateInMemory();
// With encryption
var db = WitDatabase.CreateInMemory("password");
Parameter Reference
B+Tree Parameters:
| Parameter | Default | Description |
|---|---|---|
| Page Size | 4096 | Page size in bytes (4096-65536) |
| Cache Size | 1000 | Number of pages to cache |
| Cache | clock | Cache algorithm (clock, lru) |
LSM-Tree Parameters:
| Parameter | Default | Description |
|---|---|---|
| MemTable Size | 4 MB | MemTable size before flush |
| Block Cache Size | 64 MB | Read cache size |
| Sync Writes | true | Sync WAL on each write |
| Background Compaction | true | Auto-compact SSTables |
| LSM WAL | true | Enable write-ahead log |
| LSM Block Size | 4096 | SSTable block size |
| LSM Compaction Trigger | 4 | L0 files before compaction |
Storage Format
| Engine | Format | Files |
|---|---|---|
| B+Tree | Single file | app.witdb |
| LSM-Tree | Directory | ./data/ with .sst, .wal files |
| In-Memory | None | No files |
Feature Support
| Feature | B+Tree | LSM-Tree | In-Memory |
|---|---|---|---|
| Transactions | ✅ | ✅ | ✅ |
| MVCC | ✅ | ✅ | ✅ |
| Encryption | ✅ | ✅ | ✅ |
| Crash recovery | ✅ | ✅ | ❌ |
| File locking | ✅ | ✅ | N/A |
| Blazor WASM | ✅ | ❌ | ✅ |