Core concepts

A while back ago I started to fiddle with Microsofts CTP edition of code first in Entity framework 4. The product is great but I wanted something else, I wanted something “more” schemaless. I turned to MongoDB and wrote together an open source driver targeting .Net 4. Still, there was things that I didn’t like so I built a document DB over Lucene. I relatively quickly discovered that I missed all the great infrastructure that SQL-server provides. Security, replication, scheduler etc, so I prototyped a solution that used JSON to create a document/structure provider over SQL-server, namely: Simple Structure Oriented Db (SisDb).

How is data stored?

All objects (structures) you pass to SisoDb is stored as JSON and as key-value indexes. SisoDb stores your POCO-graphs (plain old clr objects) using JSON which enables us to go from a POCO to persistable JSON. For each type of entity there will be three tables (tables are created on the fly). One table, the Structure-table, is to be seen as the master table and it holds the Id of the entity and the JSON-representation. The second table, Indexes-table, holds the value of each property in the object graph, with the purpose of providing a more performant and query friendly representation of the entity. The third table, Uniques-table, holds values of the scalar properties that you mark as unique. Unique constraints. To enforce data integrity, there's of course relationships between all tables.

The complete graph is serialized and deserialized so there's no magical lazy loading and proxies. Instead you have to think a bit before designing deep graphs and have to learn your self in thinking in documents. It isn't a relational data model you are working in. You can off course reference other documents and include them in the single query to reduce the amount of roundtrips and avoid n+1 scenarios. But when working with documents, duplication of data is more "ok" than with RDMS models.

JSON

The JSON representation of the structure is what's being deserialized back when you query for your documents. This gives you opportunities to store an structure as a class or interface and then return it as something completely else. You can read more about it here: Store as X, return as Y.

Key-value indexes

The primary concern of this table is for querying. All Sql-queries generated by SisoDb, unless you query by id, are translated and executed against this table. It gets created the first time you use execute a command against a UnitOfWork or QueryEngine for a certain type of structure (Person, Customer, Order, ...) a schema for that structure is constructed, and cached in the Database instance you created via your SisoDbFactory. This schema is essentially made up of property accessors, that via IL EMITs accesses the values of your structures and gives them a key. The key is the complete member path of the property.

public class Customer
{
    public Guid StructureId { get; set; }
    public int CustomerNo { get; set; }
    public Address Address { get; set; }
}

public class Address
{
    public string Street { get; set; }
    public string Zip { get; set; }
    public string City { get; set; }
}

This will create six different cached index accessors, with the member path and type info:

StructureId : Guid
CustomerNo : int
Address.Street : string
Address.Zip : string
Address.City : string

Now, everytime you insert or update a structure into the database, this cached accessors will be used to extract the values of the structure. Each value will be stored using the member path as the key and the value as the typed value in a certain indexes table. If you look at JSON and the indexes as data, this distinction of the JSON being stored in one set and the indexes in another set, you actually do have similarities with RavenDb. The store the JSON in Essent and the Indexes in Lucene. I'm looking at giving you an option to select whether you want to store the indexes in the SQL-server or in Lucene.

Unique constraints

There's one attribute you can use in SisoDb, and it's the UniqueAttribute. Using it you can mark a scalar property as being unique, either per instance or per type.

public class Customer
{
    public Guid StructureId { get; set; }

    [Unique(UniqueModes.PerType)]
    public int CustomerNo { get; set; }

    ...
}

When doing this, a checksum is generated for the CustomerNo and is inserted in the Unques-table, which has some unique constraints in it.

Model updates

Well if you keep your self to the open-closed principle and just add new properties nothing is needed, it all works. You also don't have to do anything if you drop properties. So, if you remove a member from the model, values for the member is deleted from the indexes-table and the uniques-table. The structure-table will of course contain the "truth", the JSON, which will not be updated until the structure is "touched" - fetched and re-saved.

More complex model updates?

There is the concept of using a structure set updater to handle more complex model updates. Read more about it here

Provide feedback

Saved searches

Use saved searches to filter your results more quickly