Build your own SQL Database using TypeScript
This project will help you grasp low-level systems without requiring any prior knowledge of database engineering. I’ve included all the essential details to guide you through how everything works, and I’ve also used ChatGPT to strengthen my understanding of the concepts. Hope this helps you.
Note
This project is a work in progress, and the content of this blog may be updated over time. We’re also documenting the journey through a playlist on our YouTube channel, so be sure to subscribe!
Introduction
As a developer, I wanted to understand how databases work under the hood, including data structures like B-Trees and Red-Black Trees. Instead of just reading about them, I decided to build my own database from scratch in TypeScript. This blog walks you through how we designed and implemented a realistic SQL-like database engine in TypeScript using:
- A custom parser for simple SQL syntax
- A B-tree-based key-value store
- A file-based storage system for data persistence
- Binary disk persistence
- Write-ahead logging (WAL)
- Support for multiple tables
- Basic SQL operations like SELECT, INSERT, UPDATE, and DELETE
Technologies
Feel free to choose any other alternatives you are familiar with. These are my choices:
- TypeScript - We'll be using TypeScript as our core language but you can go along with GO lang.
- Bun - We'll be using Bun as our runtime environment, which is a fast JavaScript runtime like Node.js but with better performance and built-in TypeScript support.
Project Setup
- Language & Runtime: TypeScript + Bun
- Folder Structure:
LumeDB/
├── src/
│ ├── storage/
│ │ └── pager.ts
│ ├── wal/
│ │ └── wal.ts
│ ├── btree/
│ │ └── btree.ts
│ ├── engine/
│ │ ├── record.ts
│ │ └── query.ts
│ ├── parser/
│ │ └── sqlParser.ts
│ └── main.ts
├── db/
│ ├── data.db
│ └── wal.log
├── package.json
├── tsconfig.json
🧱 Step 1: The Basic CLI Shell
We start with a command-line interface using "readline-sync".
const db = new DB('db/data.db', 'db/wal.log');
while (true) {
const input = readlineSync.question('> ');
const result = db.execute(input.trim());
console.log(result);
}
This REPL-style loop allows us to input SQL commands and interact with the database engine in real time.
Benefits:
- Simple to implement
- Interactive testing
Improvements:
- Use a web API or REPL framework for remote or web-based interaction
🧠 Step 2: Core Database Logic
To handle SQL commands, we need a parser that can interpret basic SQL syntax. We built a custom parser to handle basic SQL commands:
INSERT INTO table VALUES (1, "Alice")
SELECT * FROM table WHERE id = 1
DELETE FROM table WHERE id = 1
This is handled by parseSQL() which returns a structured command object with:
{
command: 'INSERT' | 'SELECT' | 'DELETE',
table: 'tableName',
values: [1, "Alice"],
where: { id: 1 }
}
Improvements:
- Use a parser generator like PEG.js or ANTLR for extensibility
🌳 Step 3: B-Tree In-Memory Store
We use a simple in-memory B-tree implementation for fast key-value access:
this.btree.insert(key, value);
this.btree.search(key);
this.btree.delete(key);
Benefits:
- Logarithmic access time
- Balanced tree prevents degeneration
Real-World Upgrade:
- Use a B+ tree where leaf nodes are linked for fast range scans
- Store tree nodes in fixed-size pages with serialization
💾 Step 4: Persistent Storage with Pager
Instead of keeping data only in memory, we use a Pager that writes to disk:
const buffer = Buffer.from(`${key}:${value}\n`);
fs.writeFileSync(this.path, buffer);
This simulates pages by storing rows as key:value strings.
Limitations:
- Not real page-based binary format yet
- No offset/index-based updates
Future Work:
- Use fixed-size binary pages (e.g. 4096 bytes)
- Store records at offsets, reuse deleted slots
🔐 Step 5: Write-Ahead Log (WAL)
We implemented WAL to ensure durability:
this.wal.log(`INSERT users 1 Alice`);
// this.wal.flush();
Why WAL?
The Write-Ahead Log (WAL) is crucial for database reliability:
- Crash Safety: If the database crashes, we can replay the WAL to recover the last consistent state
- Atomicity: WAL ensures that transactions are either fully committed or fully rolled back
- Recovery: Provides a way to restore the database to a known good state
Production Suggestions:
For a production-ready WAL implementation:
- Log Compaction: Periodically compact the WAL by removing redundant entries
- Table-Specific WALs: Maintain separate WAL files per table for better concurrency
- Checkpoints: Regularly create checkpoints to reduce WAL replay time
📁 Step 6: Multi-Table Support
Now we support multiple tables by using a Map:
this.tables: Map<string, { btree, pager }>
this.tables.set('users', { btree: new BTree(), pager: new Pager('db/users.db') });
Each table has its own B-tree and Pager file like data.db.users.
Benefits:
- Simulates real RDBMS architecture
- Isolates tables
Improvements:
- Support CREATE TABLE command
- Normalize table schema definition
🔚 Conclusion
We've built a realistic SQL-like database from scratch using simple but scalable techniques:
- SQL command parser
- B-tree structure
- Persistent storage
- WAL logging
- Multi-table support
This is a powerful learning experience for any developer and a stepping stone to more advanced database design.
📌 GitHub Repository
If you'd like a downloadable template of this codebase or want to contribute, here you go
Let me know which part you’d like to explore next: pages, indexes, transactions, or query planner?
Happy hacking! 🛠️