Commit a657f4b

Update the docs a bit
1 parent 2aafe9e commit a657f4b

4 files changed, with 118 additions and 101 deletions.

docs/BenchmarkLog.md

Lines changed: 0 additions & 97 deletions
This file was deleted.

docs/GettingStarted.md

Lines changed: 6 additions & 4 deletions
@@ -16,10 +16,12 @@ accidentally instantiated).
 ```csharp
 public static class File
 {
-    public static readonly Attribute<ulong> Hash = new("Test.Model.File/Hash", isIndexed: true);
-    public static readonly Attribute<ulong> Size = new("Test.Model.File/Size");
-    public static readonly Attribute<string> Name = new("Test.Model.File/Name", noHistory: true);
-    public static readonly Attribute<EntityId> ModId = new"Test.Model.File/ModId", cardinality: Cardinality.Many);
+    private const string Namespace = "Test.Model.File";
+
+    public static readonly ScalarAttribute<ulong> Hash = new(Namespace, nameof(Hash), isIndexed: true);
+    public static readonly ScalarAttribute<ulong> Size = new(Namespace, nameof(Size));
+    public static readonly ScalarAttribute<string> Name = new(Namespace, nameof(Name), noHistory: true);
+    public static readonly ReferenceAttribute<EntityId> ModId = new(Namespace, nameof(ModId), cardinality: Cardinality.Many);
 
     public class Model(ITransaction tx) : AEntity(tx)
     {

docs/ValueFormat.md

Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
+---
+hide:
+  - toc
+---
+
+## Value Format
+
+Originally, MnemonicDB was developed using out-of-band value type serialization. That is to say, each attribute had a C# name
+attached to it, which was used at read time to determine the format of the value. This method was simple, but had several
+side effects. For one, it was impossible to read the data without having access to that C# class. Since RocksDB performs
+some value comparisons at startup after a crash, this resulted in an unreadable database: RocksDB couldn't start because
+it needed a comparator, but the comparator couldn't run until RocksDB had properly started and the comparator could read
+all the possible value types. Thus the internals of MnemonicDB were rewritten to use a more standard format.
+
+### Value Format
+Every value in the system is prefixed by a one-byte type identifier. This identifier determines the size, value type, and
+serialization format. As of the time of this writing we are only using 18 of the possible 256 values, so there is
+plenty of room for expansion. These identifiers are defined in the `ValueTags` enum, which currently contains the following values:
+
+```csharp
+public enum ValueTags : byte
+{
+    /// <summary>
+    /// Null value, no data
+    /// </summary>
+    Null = 0,
+    /// <summary>
+    /// Unsigned 8-bit integer
+    /// </summary>
+    UInt8 = 1,
+    /// <summary>
+    /// Unsigned 16-bit integer
+    /// </summary>
+    UInt16 = 2,
+    /// <summary>
+    /// Unsigned 32-bit integer
+    /// </summary>
+    UInt32 = 3,
+    /// <summary>
+    /// Unsigned 64-bit integer
+    /// </summary>
+    UInt64 = 4,
+    /// <summary>
+    /// Unsigned 128-bit integer
+    /// </summary>
+    UInt128 = 5,
+    /// <summary>
+    /// Signed 16-bit integer
+    /// </summary>
+    Int16 = 6,
+    /// <summary>
+    /// Signed 32-bit integer
+    /// </summary>
+    Int32 = 7,
+    /// <summary>
+    /// Signed 64-bit integer
+    /// </summary>
+    Int64 = 8,
+    /// <summary>
+    /// Signed 128-bit integer
+    /// </summary>
+    Int128 = 9,
+    /// <summary>
+    /// 32-bit floating point number
+    /// </summary>
+    Float32 = 10,
+    /// <summary>
+    /// 64-bit floating point number (double)
+    /// </summary>
+    Float64 = 11,
+    /// <summary>
+    /// ASCII string, case-sensitive
+    /// </summary>
+    Ascii = 12,
+    /// <summary>
+    /// UTF-8 string, case-sensitive
+    /// </summary>
+    Utf8 = 13,
+    /// <summary>
+    /// UTF-8 string, case-insensitive
+    /// </summary>
+    Utf8Insensitive = 14,
+    /// <summary>
+    /// Inline binary data
+    /// </summary>
+    Blob = 15,
+
+    /// <summary>
+    /// A blob sorted by its xxHash64 hash, and where the data is possibly stored in a separate location
+    /// so as not to degrade the performance of the key storage
+    /// </summary>
+    HashedBlob = 16,
+
+    /// <summary>
+    /// A reference to another entity
+    /// </summary>
+    Reference = 17,
+}
+```
+
+Many of these values have a fixed size and are self-describing. Since the format is so simple, we can "serialize" data
+such as integers by doing a simple pointer dereference. For other, more complex values like strings, we must run them
+through a text encoder/decoder. None of the variable-sized values have an encoded length. This is because RocksDB tracks
+value sizes, so it can be assumed that every key is a 16-byte header, followed by a one-byte `ValueTags` tag, followed by the value, with
+the value taking up the remainder of the key.
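The key layout described in the paragraph above (16-byte header, one tag byte, then the value with no length prefix) can be sketched roughly as follows. This is an illustration, not MnemonicDB's actual code: `SketchValueTags`, `TaggedKeySketch`, `Write`, and `Read` are hypothetical names, only two tags are handled, and native byte order is assumed for the integer case because the text describes it as a pointer dereference.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical subset of the ValueTags enum, repeated so the sketch compiles on its own.
public enum SketchValueTags : byte
{
    UInt64 = 4,
    Utf8 = 13,
}

public static class TaggedKeySketch
{
    // Taken from the text: every key starts with a 16-byte header.
    public const int HeaderSize = 16;

    // Builds a key: header, then tag, then the raw value bytes (no length prefix).
    public static byte[] Write(ReadOnlySpan<byte> header, SketchValueTags tag, ReadOnlySpan<byte> value)
    {
        if (header.Length != HeaderSize)
            throw new ArgumentException("header must be 16 bytes", nameof(header));

        var key = new byte[HeaderSize + 1 + value.Length];
        header.CopyTo(key);
        key[HeaderSize] = (byte)tag;
        value.CopyTo(key.AsSpan(HeaderSize + 1));
        return key;
    }

    // Decodes the value; its length is whatever remains of the key after the tag.
    public static object Read(ReadOnlySpan<byte> key)
    {
        var tag = (SketchValueTags)key[HeaderSize];
        var value = key.Slice(HeaderSize + 1);

        switch (tag)
        {
            case SketchValueTags.UInt64:
                // Fixed-size value: reinterpret the bytes in place (native byte order).
                return MemoryMarshal.Read<ulong>(value);
            case SketchValueTags.Utf8:
                // Variable-size value: run the remainder through a text decoder.
                return Encoding.UTF8.GetString(value);
            default:
                throw new NotSupportedException($"Tag {tag} is not handled in this sketch");
        }
    }
}
```

Because the storage engine already knows how long each key is, the reader never needs a stored length: slicing past the header and tag yields the whole value.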
+
+### Comparator Simplicity
+Since this value/key format is so simple, it's possible to write a RocksDB comparator that sorts this data in a few
+hundred lines of code. In MnemonicDB the comparator is a completely static method without any virtual dispatch in the main-line
+code. In the future it would be fairly simple to move the comparator code into a C++ DLL to squeeze out some more performance,
+but this is considered a low priority at the moment.
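To make the "static method, no virtual dispatch" idea concrete, here is a minimal sketch of a comparator over tagged values. It is not the MnemonicDB comparator: `TaggedValueComparerSketch` is a hypothetical name, the 16-byte header is ignored, only a few tags get special handling, and the rule that values with different tags sort by tag number is an assumption made for illustration.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class TaggedValueComparerSketch
{
    // Tag numbers copied from the ValueTags enum.
    private const byte UInt64Tag = 4;
    private const byte Utf8InsensitiveTag = 14;

    // Compares two tagged values, each laid out as one tag byte followed by the payload.
    // Assumption: values with different tags are ordered by tag number.
    public static int Compare(ReadOnlySpan<byte> a, ReadOnlySpan<byte> b)
    {
        var tagOrder = a[0].CompareTo(b[0]);
        if (tagOrder != 0) return tagOrder;

        var x = a.Slice(1);
        var y = b.Slice(1);

        // A plain switch on the tag byte: no interfaces, no virtual calls.
        switch (a[0])
        {
            case UInt64Tag:
                // Numeric ordering; a bytewise compare would be wrong on little-endian machines.
                return MemoryMarshal.Read<ulong>(x).CompareTo(MemoryMarshal.Read<ulong>(y));
            case Utf8InsensitiveTag:
                return string.Compare(
                    Encoding.UTF8.GetString(x),
                    Encoding.UTF8.GetString(y),
                    StringComparison.OrdinalIgnoreCase);
            default:
                // Case-sensitive strings, blobs, and everything else: bytewise ordering.
                return x.SequenceCompareTo(y);
        }
    }
}
```

Since the tag travels with the value, the comparator needs no schema or attribute lookup, which is what lets it run before the rest of the database has started.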

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -49,5 +49,6 @@ theme:
 nav:
   - Home: index.md
   - Index Format: IndexFormat.md
+  - Value Format: ValueFormat.md
   - Schema Changes: SchemaChanges.md
 
