-
Notifications
You must be signed in to change notification settings - Fork 1.3k
How LiteDB Works
LiteDB is a single file database. But databases has many different types of information, like indexes, collections, documents. To manage this, LiteDB implements database pages concepts. Page is a block of same information type and has 4096 bytes. Page is the minimum read/write operation on disk file. There are 6 page types:
-
Header Page: Contains database information, like file version, data file size and pointer to free list pages. Is the first page on database (
PageID
= 0). - Collection Page: Each collection use one page and hold all collection information, like name, indexes, pointers and options. All collections are connected each others by a double linked list.
- Index Page: Used to hold index nodes. LiteDB implement skip list indexes, so each node has one key and levels link pointers to others index nodes.
- Data Page: Data page contains data blocks. Each data block represent an document serialized in BSON format. If a document is bigger than one page, data block use a link pointer to an extended page.
- Extend Page: Big documents that need more than one page, are serialized in multiples extended pages. Extended pages are double linked to create a single data segment that to store documents. Each extend page contains only one document or a chunk of a document.
- Empty Page: When a page is excluded becomes a empty page. Empty pages will be use on next page request (for any king of page).
Each page has a own header and content. Header is used to manage common data structure like PageID, PageType, Free Space. Content are implement different on each page type.
Index pages and data pages contains a collection of elements (index nodes and data blocks). This pages can store data and keep with available space to more. To hold this free space on each page type, LiteDB implements free list pages.
Free list are a double linked list, sorted by available space. When database need a page to store data use this list to search first available page. After this, PagerService
fix page order or remove from free list if there is no more space on page.
To create near data related, each collection contains an data free list. So, in a data page, all data blocks are of same collection. The same occurs in indexes. Each index (on each collection) contains your own free list. This solution consume more disk space but are much faster to read/write operations because data are related and near one witch other. If you get all documents in a collection, database needs read less pages on disk.
You can use dump
command on shell to see all pages and basic information of each page. I used this to debug linked lists.
When you open a database file, you can just use file name as connection string or use more connection parameters like:
- Filename: Database to open or create (if no exits). Required,
String
- Journal: If this database connection will be use journal mode (See Journaling and Recovery). Optional,
Boolean
, default: true - Timeout: Set timeout to wait a locked transaction. Optional,
Timespan
, default: 00:01:00 - Password: Setting a password, your datafile will be encrypted using AES cryptography. Only header page (first 4096 bytes) are not encrypted.
- Initial Size: When datafile is new, create an empty disk with this size. Accepts common storage units (eg: "Initial Size: 10Mb").
- Limit Size: Limit datafile to a max size. Accepts common storage units (eg: "Limit Size: 20Mb"). Must be greater than initial size (if defined).
// Get MyData.db from current path
var db = new LiteDatabase("MyData.db");
// Hard code connection string
var db = new LiteDatabase("filename=C:\Path\mydb.db; journal=false; version=5");
- Collection Name:
- Pattern:
A-Z
,_-
,0-9
- Maxlength of 30 chars
- Case insensitive
- Pattern:
- Index Name:
- Pattern:
A-Z
,_-
,0-9
(.
for nested document) - Maxlength of 30 chars
- Case sensitive
- Pattern:
- BsonDocument Key Name:
-
A-Z
,_-
,0-9
- Case sensitive
-
- FileStorage FileId:
- Pattern: same used in file system path/file.
- Case sensitive
- Collections: 3000 bytes to all collections names
- Documents per collections:
UInt.MaxValue
- FileStorage Max File Length: 2Gb per file
- Index key size: 512 bytes after BSON serialization
- BSON document size: 1.044.224 bytes (~1Mb = 256 pages)
- Nested Depth for BSON Documents: 20
- Page Size: 4096 bytes
- DataPage Reserved Bytes: 2039 bytes (like PCT FREE)
- IndexPage Reserved Bytes: 100 bytes (like PCT FREE)
- Database Max Length: In theory,
UInt.MaxValue
* PageSize (4096) = ~ Too big!
Data Modeling
- Data Structure
- BsonDocument
- Object Mapping
- Relationships with Document References
- Collections
- FileStorage
Index
Query
Database
Version 4 changes
Shell