B+Tree implementation in Kotlin
This project is my learning project for B+Tree (also my first Kotlin experience). It is heavily inspired by InnoDB. So far the list below is implemented in total 963 lines of code:
I stopped here since I had learned a lot. The list below is lots of TODO:
This B+Tree implementation is mostly generic/schema free. You can use any byte array representation for key and value, but I use Apache Avro for demo purpose since it is sortable even in byte array form (i.e. without decoding)
First, initialize the db file:
$ ./runsample init foo.db
Then, you can get/put/delete/scan
the B+Tree (The default schema is under /sample
directory):
$ ./runsample open foo.db
COMMAND: put
KEY: {"name": "foo", "timestamp": 1}
VALUE: {"value": "bbb"}
COMMAND: get
KEY: {"name": "foo", "timestamp": 1}
{"name": "foo", "timestamp": 1} {"value": "bbb"}
COMMAND: put
KEY: {"name": "foo", "timestamp": 10}
VALUE: {"value": "bbb"}
COMMAND: scan
START KEY: {"name": "foo", "timestamp": 0}
END KEY: {"name": "foo", "timestamp": 100}
{"name": "foo", "timestamp": 1} {"value": "bbb"}
{"name": "foo", "timestamp": 10} {"value": "bbb"}
COMMAND: put
KEY: {"name": "bar", "timestamp": 10}
VALUE: {"value": "bbb"}
COMMAND: put
KEY: {"name": "bar", "timestamp": 1}
VALUE: {"value": "bbb"}
COMMAND: scan
START KEY: {"name": "foo", "timestamp": 0}
END KEY: {"name": "foo", "timestamp": 100}
{"name": "foo", "timestamp": 1} {"value": "bbb"}
{"name": "foo", "timestamp": 10} {"value": "bbb"}
COMMAND: scan
START KEY: {"name": "bar", "timestamp": 0}
END KEY: {"name": "foo", "timestamp": 100}
{"name": "bar", "timestamp": 1} {"value": "bbb"}
{"name": "bar", "timestamp": 10} {"value": "bbb"}
{"name": "foo", "timestamp": 1} {"value": "bbb"}
{"name": "foo", "timestamp": 10} {"value": "bbb"}
This is a class for B+Tree itself. It supports the operations below:
This is a (set of) class for Node of B+Tree.
This class stores all KeyValue on backend Page.
This class stores minimum key of each child and its Page id as KeyValue.
This class is for the RootNode. RootNode is LeafNode for the first time, then it becomes InternalNode after the first split. So, it inherits both.
This is a class for Page of B+Tree. Page is backend data structure for all Node classes. PageData is Avro record:
{
"name": "PageData",
"type": "record",
"fields": [
{"name": "id", "type": "PageId"},
{"name": "nodeType", "type": "NodeType"},
{"name": "previousId", "type": "PageId"},
{"name": "nextId", "type": "PageId"},
{"name": "records", "type": {"type": "array", "items": "KeyValue"} }
]
}
So, the Page is encoded/decoded as Avro’s binary format (single-object encoding).
It supports mutational operations below and calculates encoded byte size to check whether the Page is full:
This class is for managing file read and write. The file has fixed size (128 bytes) metadata at the beginning, then each Page (256 bytes) is stored ordered by Page id.
This class is for managing Page lifecycle. It has memory pool for pages (currently just a Map
).
Currently, there is only one split strategy: middle of size. This is the best approach for random insert since a new KeyValue comes random place and half split pages could be filled later.
However, if the insert is in order (both ascending and descending), it is not optimized. Because a new KeyValue is always inserted to one node, rest of nodes are left with half usage.
InnoDB has an optimization to solve this problem (insertion order). See this great post.
RootNode(id:0 size=93 records=14)
key=
LeafNode(id:1 size=163 keys=[00:01, 00:02, 00:03])
key=00:04
LeafNode(id:11 size=163 keys=[00:04, 00:05, 00:06])
key=00:07
LeafNode(id:7 size=209 keys=[00:07, 00:08, 00:09, 00:0a])
key=00:0b
LeafNode(id:9 size=163 keys=[00:0b, 00:0c, 00:0d])
key=00:0e
LeafNode(id:5 size=209 keys=[00:0e, 00:0f, 00:10, 00:11])
key=00:12
LeafNode(id:10 size=163 keys=[00:12, 00:13, 00:14])
key=00:15
LeafNode(id:3 size=163 keys=[00:15, 00:16, 00:17])
key=00:18
LeafNode(id:14 size=163 keys=[00:18, 00:19, 00:1a])
key=00:1b
LeafNode(id:2 size=255 keys=[00:1b, 00:1c, 00:1d, 00:1e, 00:1f])
key=00:20
LeafNode(id:12 size=209 keys=[00:20, 00:21, 00:22, 00:23])
key=00:24
LeafNode(id:4 size=163 keys=[00:24, 00:25, 00:26])
key=00:27
LeafNode(id:13 size=163 keys=[00:27, 00:28, 00:29])
key=00:2a
LeafNode(id:6 size=209 keys=[00:2a, 00:2b, 00:2c, 00:2d])
key=00:2e
LeafNode(id:8 size=255 keys=[00:2e, 00:2f, 00:30, 00:31, 00:32])
RootNode(id:0 size=103 records=16)
key=
LeafNode(id:1 size=163 keys=[00:01, 00:02, 00:03])
key=00:04
LeafNode(id:2 size=163 keys=[00:04, 00:05, 00:06])
key=00:07
LeafNode(id:3 size=163 keys=[00:07, 00:08, 00:09])
key=00:0a
LeafNode(id:4 size=163 keys=[00:0a, 00:0b, 00:0c])
key=00:0d
LeafNode(id:5 size=163 keys=[00:0d, 00:0e, 00:0f])
key=00:10
LeafNode(id:6 size=163 keys=[00:10, 00:11, 00:12])
key=00:13
LeafNode(id:7 size=163 keys=[00:13, 00:14, 00:15])
key=00:16
LeafNode(id:8 size=163 keys=[00:16, 00:17, 00:18])
key=00:19
LeafNode(id:9 size=163 keys=[00:19, 00:1a, 00:1b])
key=00:1c
LeafNode(id:10 size=163 keys=[00:1c, 00:1d, 00:1e])
key=00:1f
LeafNode(id:11 size=163 keys=[00:1f, 00:20, 00:21])
key=00:22
LeafNode(id:12 size=163 keys=[00:22, 00:23, 00:24])
key=00:25
LeafNode(id:13 size=163 keys=[00:25, 00:26, 00:27])
key=00:28
LeafNode(id:14 size=163 keys=[00:28, 00:29, 00:2a])
key=00:2b
LeafNode(id:15 size=163 keys=[00:2b, 00:2c, 00:2d])
key=00:2e
LeafNode(id:16 size=255 keys=[00:2e, 00:2f, 00:30, 00:31, 00:32])
RootNode(id:0 size=103 records=16)
key=
LeafNode(id:1 size=255 keys=[00:01, 00:02, 00:03, 00:04, 00:05])
key=00:06
LeafNode(id:16 size=163 keys=[00:06, 00:07, 00:08])
key=00:09
LeafNode(id:15 size=163 keys=[00:09, 00:0a, 00:0b])
key=00:0c
LeafNode(id:14 size=163 keys=[00:0c, 00:0d, 00:0e])
key=00:0f
LeafNode(id:13 size=163 keys=[00:0f, 00:10, 00:11])
key=00:12
LeafNode(id:12 size=163 keys=[00:12, 00:13, 00:14])
key=00:15
LeafNode(id:11 size=163 keys=[00:15, 00:16, 00:17])
key=00:18
LeafNode(id:10 size=163 keys=[00:18, 00:19, 00:1a])
key=00:1b
LeafNode(id:9 size=163 keys=[00:1b, 00:1c, 00:1d])
key=00:1e
LeafNode(id:8 size=163 keys=[00:1e, 00:1f, 00:20])
key=00:21
LeafNode(id:7 size=163 keys=[00:21, 00:22, 00:23])
key=00:24
LeafNode(id:6 size=163 keys=[00:24, 00:25, 00:26])
key=00:27
LeafNode(id:5 size=163 keys=[00:27, 00:28, 00:29])
key=00:2a
LeafNode(id:4 size=163 keys=[00:2a, 00:2b, 00:2c])
key=00:2d
LeafNode(id:3 size=163 keys=[00:2d, 00:2e, 00:2f])
key=00:30
LeafNode(id:2 size=163 keys=[00:30, 00:31, 00:32])