- Overview
- Type factory methods
- Standard type constants
- .tasl DSL
- Binary codec
- Advanced type utilities
A Schema is a runtime representation of a dataset schema. Abstractly, a schema is a set of classes, which are analogous to tables in SQL. Each class has a key and a type. Keys are absolute URIs, and types are terms in a grammar of algebraic data types generated by the primitive types and two kinds of composite types (sums and products).
The tasl JavaScript library represents schemas with a regular ES6 class, Schema, exported at the top level. To instantiate a schema using the class constructor, we need to pass in a runtime representation of each class's type. Types are represented as regular JavaScript objects in the types namespace, each discriminated by a .kind property.
declare class Schema {
constructor(readonly classes: Record<string, types.Type>)
count(): number
get(key: string): types.Type
has(key: string): boolean
keys(): Iterable<string>
values(): Iterable<types.Type>
entries(): Iterable<[string, types.Type, number]>
isEqualTo(schema: Schema): boolean
}
namespace types {
type Type = URI | Literal | Product | Coproduct | Reference
type URI = { kind: "uri" }
type Literal = { kind: "literal"; datatype: string }
type Product = { kind: "product"; components: Record<string, Type> }
type Coproduct = { kind: "coproduct"; options: Record<string, Type> }
type Reference = { kind: "reference"; key: string }
}
Notice that the Schema class and types namespace are two separate top-level exports. types contains TypeScript types and utility methods for working with the building blocks of schemas, while the Schema class is mostly treated as an opaque object once instantiated. This is a pattern that the other data structures follow as well: the Instance constructor takes values from the values namespace, and the Mapping constructor takes values from the expressions namespace.
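Because every type object carries a .kind discriminant, ordinary code can walk a type with a switch statement. The following is a minimal sketch of that idea. It redeclares the type shapes locally so that it runs on its own, and referencedClasses is a hypothetical helper for illustration, not part of the library.

```typescript
// Local copies of the type shapes declared above, so the sketch is self-contained.
type URI = { kind: "uri" }
type Literal = { kind: "literal"; datatype: string }
type Product = { kind: "product"; components: Record<string, Type> }
type Coproduct = { kind: "coproduct"; options: Record<string, Type> }
type Reference = { kind: "reference"; key: string }
type Type = URI | Literal | Product | Coproduct | Reference

// Hypothetical helper: collect the key of every class referenced inside a type.
function referencedClasses(t: Type, found: string[] = []): string[] {
  switch (t.kind) {
    case "uri":
    case "literal":
      return found
    case "reference":
      if (found.indexOf(t.key) === -1) found.push(t.key)
      return found
    case "product": {
      const components = t.components
      Object.keys(components).forEach((k) => referencedClasses(components[k], found))
      return found
    }
    case "coproduct": {
      const options = t.options
      Object.keys(options).forEach((k) => referencedClasses(options[k], found))
      return found
    }
  }
}

const bookType: Type = {
  kind: "product",
  components: {
    "http://schema.org/identifier": { kind: "uri" },
    "http://schema.org/author": { kind: "reference", key: "http://schema.org/Person" },
  },
}

console.log(referencedClasses(bookType)) // [ 'http://schema.org/Person' ]
```

Switching on .kind lets TypeScript narrow each branch to the matching shape, so t.key, t.components and t.options are only accessible where they exist.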
Here's an example schema.
import { Schema } from "tasl"
const schema = new Schema({
"http://schema.org/Person": {
kind: "product",
components: {
"http://schema.org/name": {
kind: "product",
components: {
"http://schema.org/givenName": {
kind: "literal",
datatype: "http://www.w3.org/2001/XMLSchema#string",
},
"http://schema.org/familyName": {
kind: "literal",
datatype: "http://www.w3.org/2001/XMLSchema#string",
},
},
},
"http://schema.org/email": { kind: "uri" },
},
},
"http://schema.org/Book": {
kind: "product",
components: {
"http://schema.org/name": {
kind: "literal",
datatype: "http://www.w3.org/2001/XMLSchema#string",
},
"http://schema.org/identifier": { kind: "uri" },
"http://schema.org/author": {
kind: "reference",
key: "http://schema.org/Person",
},
},
},
})
Our example is very structured but also very verbose. The types namespace has factory methods for each kind of type that can help us simplify this.
declare namespace types {
function uri(): URI
function literal(datatype: string): Literal
function product(components: Record<string, Type>): Product
function coproduct(options: Record<string, Type>): Coproduct
function reference(key: string): Reference
}
Here's the same example schema re-written using these factory methods.
import { Schema, types } from "tasl"
const schema = new Schema({
"http://schema.org/Person": types.product({
"http://schema.org/name": types.product({
"http://schema.org/givenName": types.literal(
"http://www.w3.org/2001/XMLSchema#string"
),
"http://schema.org/familyName": types.literal(
"http://www.w3.org/2001/XMLSchema#string"
),
}),
"http://schema.org/email": types.uri(),
}),
"http://schema.org/Book": types.product({
"http://schema.org/name": types.literal(
"http://www.w3.org/2001/XMLSchema#string"
),
"http://schema.org/identifier": types.uri(),
"http://schema.org/author": types.reference("http://schema.org/Person"),
}),
})
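The factory form and the object-literal form describe the same data. As a rough sketch of why, the stand-in factories below simply build the plain objects. They are assumptions written for illustration; the library's own factories may do more (validation, for example).

```typescript
// Local copies of the type shapes, so the sketch is self-contained.
type URI = { kind: "uri" }
type Literal = { kind: "literal"; datatype: string }
type Product = { kind: "product"; components: Record<string, Type> }
type Coproduct = { kind: "coproduct"; options: Record<string, Type> }
type Reference = { kind: "reference"; key: string }
type Type = URI | Literal | Product | Coproduct | Reference

// Hypothetical stand-ins for the factory methods: each returns a plain object.
const uri = (): URI => ({ kind: "uri" })
const literal = (datatype: string): Literal => ({ kind: "literal", datatype })
const product = (components: Record<string, Type>): Product => ({ kind: "product", components })
const coproduct = (options: Record<string, Type>): Coproduct => ({ kind: "coproduct", options })
const reference = (key: string): Reference => ({ kind: "reference", key })

// The same type written both ways.
const viaFactories: Type = product({
  "http://schema.org/name": literal("http://www.w3.org/2001/XMLSchema#string"),
  "http://schema.org/identifier": uri(),
  "http://schema.org/author": reference("http://schema.org/Person"),
})

const viaLiterals: Type = {
  kind: "product",
  components: {
    "http://schema.org/name": {
      kind: "literal",
      datatype: "http://www.w3.org/2001/XMLSchema#string",
    },
    "http://schema.org/identifier": { kind: "uri" },
    "http://schema.org/author": { kind: "reference", key: "http://schema.org/Person" },
  },
}

console.log(JSON.stringify(viaFactories) === JSON.stringify(viaLiterals)) // true
```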
Still, passing explicit datatype URIs into types.literal(...) for every literal type is a huge hassle. In addition to the five factory methods (one for each kind of type), the types namespace also defines constants for the unit type (the product type with no components), strings, booleans, 32- and 64-bit floats, 8-, 16-, 32- and 64-bit signed and unsigned integers, byte arrays, and JSON values.
This is essentially a standard library of common types that should cover the needs of most schemas.
declare namespace types {
const unit: Product
const string: Literal
const boolean: Literal
const f32: Literal
const f64: Literal
const i64: Literal
const i32: Literal
const i16: Literal
const i8: Literal
const u64: Literal
const u32: Literal
const u16: Literal
const u8: Literal
const bytes: Literal
const JSON: Literal
}
The datatypes that these literals use are from the XSD namespace, with the exception of JSON, which (confusingly) is defined in the JSON-LD spec as a term in the rdf namespace.
Here's the same example schema rewritten to use the types.string constant instead of the types.literal(...) factory.
import { Schema, types } from "tasl"
const schema = new Schema({
"http://schema.org/Person": types.product({
"http://schema.org/name": types.product({
"http://schema.org/givenName": types.string,
"http://schema.org/familyName": types.string,
}),
"http://schema.org/email": types.uri(),
}),
"http://schema.org/Book": types.product({
"http://schema.org/name": types.string,
"http://schema.org/identifier": types.uri(),
"http://schema.org/author": types.reference("http://schema.org/Person"),
}),
})
An even more concise way to instantiate schemas is to use the .tasl DSL with the parseSchema method. The DSL supports comments and URI namespaces, which dramatically improve readability.
The DSL is documented at https://tasl.io.
declare function parseSchema(input: string): Schema
import { parseSchema } from "tasl"
parseSchema(`
namespace s http://schema.org/
class s:Person {
s:name -> {
s:familyName -> string
s:givenName -> string
}
s:email -> uri
}
class s:Book {
s:name -> string
s:identifier -> uri
s:author -> * s:Person
}
`)
// Schema {
// classes: {
// 'http://schema.org/Person': { kind: 'product', components: [Object] },
// 'http://schema.org/Book': { kind: 'product', components: [Object] }
// }
// }
Schemas can be encoded to and decoded from Uint8Arrays with the top-level encodeSchema and decodeSchema methods.
declare function encodeSchema(schema: Schema): Uint8Array
declare function decodeSchema(data: Uint8Array): Schema
import { parseSchema, encodeSchema, decodeSchema } from "tasl"
const schema = parseSchema(`
namespace s http://schema.org/
class s:Person {
s:name -> string
s:email -> uri
}
`)
encodeSchema(schema)
// Uint8Array(124) [
// 1, 1, 24, 104, 116, 116, 112, 58, 47, 47, 115, 99,
// 104, 101, 109, 97, 46, 111, 114, 103, 47, 80, 101, 114,
// 115, 111, 110, 2, 0, 2, 23, 104, 116, 116, 112, 58,
// 47, 47, 115, 99, 104, 101, 109, 97, 46, 111, 114, 103,
// 47, 101, 109, 97, 105, 108, 0, 4, 22, 104, 116, 116,
// 112, 58, 47, 47, 115, 99, 104, 101, 109, 97, 46, 111,
// 114, 103, 47, 110, 97, 109, 101, 0, 1, 39, 104, 116,
// 116, 112, 58, 47, 47, 119, 119, 119, 46, 119, 51, 46,
// 111, 114, 103, 47,
// ... 24 more items
// ]
decodeSchema(encodeSchema(schema))
// Schema {
// classes: {
// 'http://schema.org/Person': { kind: 'product', components: [Object] }
// }
// }
schema.isEqualTo(decodeSchema(encodeSchema(schema)))
// true
The types namespace also has methods that implement the subtype relation over types and compute infima and suprema in the induced partial order.
We can compare types with types.isSubtypeOf and types.isEqualTo.
declare namespace types {
function isSubtypeOf(x: Type, y: Type): boolean
function isEqualTo(x: Type, y: Type): boolean
}
The subtype relation (denoted ≤ in writing) is defined by cases:
- The URI type is a subtype of itself
- A literal type X is a subtype of a literal type Y if and only if X and Y have the same datatype
- A product type X is a subtype of a product type Y if and only if, for every component key K in X, Y has a component with key K and the type X(K) is a subtype of the type Y(K)
- A coproduct type X is a subtype of a coproduct type Y if and only if, for every option key K in Y, X has an option with key K and the type X(K) is a subtype of the type Y(K)
- A reference type X is a subtype of a reference type Y if and only if X and Y reference the same class
- If two types X and Y are of different kinds, then neither X ≤ Y nor Y ≤ X
Intuitively, a type X is a subtype of a type Y if it structurally matches Y, except that X may be missing some product components and may have some extra coproduct options.
import { types } from "tasl"
types.isSubtypeOf(types.uri(), types.uri()) // true
types.isSubtypeOf(types.uri(), types.string) // false
types.isSubtypeOf(
types.product({}),
types.product({ "http://schema.org/name": types.string })
) // true
types.isSubtypeOf(
types.product({ "http://schema.org/name": types.string }),
types.product({})
) // false
types.isSubtypeOf(
types.product({ "http://schema.org/name": types.string }),
types.product({ "http://schema.org/name": types.boolean })
) // false
types.isSubtypeOf(
types.product({ "http://schema.org/name": types.string }),
types.product({
"http://schema.org/name": types.product({
"http://schema.org/givenName": types.string,
"http://schema.org/familyName": types.string,
}),
})
) // false
types.isSubtypeOf(
types.product({
"http://schema.org/gender": types.coproduct({
"http://schema.org/Male": types.unit,
"http://schema.org/Female": types.unit,
"http://schema.org/value": types.string,
}),
}),
types.product({
"http://schema.org/gender": types.coproduct({
"http://schema.org/Male": types.unit,
"http://schema.org/Female": types.unit,
}),
})
) // true
types.isSubtypeOf(
types.product({
"http://schema.org/gender": types.coproduct({
"http://schema.org/Male": types.unit,
"http://schema.org/Female": types.unit,
}),
}),
types.product({
"http://schema.org/gender": types.coproduct({
"http://schema.org/Male": types.unit,
"http://schema.org/Female": types.unit,
"http://schema.org/value": types.string,
}),
})
) // false
types.isSubtypeOf(
types.product({
"http://schema.org/author": types.reference("http://schema.org/Person"),
}),
types.product({
"http://schema.org/name": types.string,
"http://schema.org/author": types.reference("http://schema.org/Person"),
})
) // true
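The case analysis above translates almost line for line into a short recursive function. This is a minimal sketch written from the listed cases, not the library's implementation; the type shapes are redeclared locally so that it runs on its own.

```typescript
// Local copies of the type shapes, so the sketch is self-contained.
type URI = { kind: "uri" }
type Literal = { kind: "literal"; datatype: string }
type Product = { kind: "product"; components: Record<string, Type> }
type Coproduct = { kind: "coproduct"; options: Record<string, Type> }
type Reference = { kind: "reference"; key: string }
type Type = URI | Literal | Product | Coproduct | Reference

// Sketch of the subtype relation X ≤ Y, one branch per case in the definition.
function isSubtypeOf(x: Type, y: Type): boolean {
  if (x.kind === "uri" && y.kind === "uri") return true
  if (x.kind === "literal" && y.kind === "literal") return x.datatype === y.datatype
  if (x.kind === "reference" && y.kind === "reference") return x.key === y.key
  if (x.kind === "product" && y.kind === "product") {
    // Every component of X must appear in Y with a supertype.
    const xc = x.components
    const yc = y.components
    return Object.keys(xc).every((k) => k in yc && isSubtypeOf(xc[k], yc[k]))
  }
  if (x.kind === "coproduct" && y.kind === "coproduct") {
    // Every option of Y must appear in X with a subtype.
    const xo = x.options
    const yo = y.options
    return Object.keys(yo).every((k) => k in xo && isSubtypeOf(xo[k], yo[k]))
  }
  return false // types of different kinds are never related
}

const unit: Type = { kind: "product", components: {} }
const str: Type = { kind: "literal", datatype: "http://www.w3.org/2001/XMLSchema#string" }
const named: Type = { kind: "product", components: { "http://schema.org/name": str } }
const twoOptions: Type = {
  kind: "coproduct",
  options: { "http://schema.org/Male": unit, "http://schema.org/Female": unit },
}
const threeOptions: Type = {
  kind: "coproduct",
  options: {
    "http://schema.org/Male": unit,
    "http://schema.org/Female": unit,
    "http://schema.org/value": str,
  },
}

console.log(isSubtypeOf(unit, named)) // true
console.log(isSubtypeOf(named, unit)) // false
console.log(isSubtypeOf(threeOptions, twoOptions)) // true
console.log(isSubtypeOf(twoOptions, threeOptions)) // false
```

Note the asymmetry between the two composite cases: the product branch iterates over the keys of X, while the coproduct branch iterates over the keys of Y.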
The subtype relation is reflexive (X ≤ X), transitive (if X ≤ Y and Y ≤ Z then X ≤ Z), and antisymmetric (if X ≤ Y and Y ≤ X then X = Y), which means the subtype relation forms a partial order over types. Every two types X and Y are related in exactly one of four ways:
- X is a strict subtype of Y ((X ≤ Y) ∧ ¬(Y ≤ X))
- Y is a strict subtype of X ((Y ≤ X) ∧ ¬(X ≤ Y))
- X and Y are equal ((X ≤ Y) ∧ (Y ≤ X))
- X and Y are incomparable (¬(X ≤ Y) ∧ ¬(Y ≤ X))
types.isEqualTo(x, y) is equivalent to types.isSubtypeOf(x, y) && types.isSubtypeOf(y, x).
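The four cases only depend on the two answers X ≤ Y and Y ≤ X, so they can be computed from any ordering predicate. The sketch below is generic over the predicate; relate and keysLeq are hypothetical names. To stay short, the demo uses a stand-in order: product types whose components all have the URI type, represented by their lists of component keys, for which X ≤ Y reduces to key inclusion.

```typescript
type Relation = "strict-subtype" | "strict-supertype" | "equal" | "incomparable"

// Classify a pair of values using only an ordering predicate leq(a, b), read as a ≤ b.
function relate<T>(leq: (a: T, b: T) => boolean, x: T, y: T): Relation {
  const xy = leq(x, y)
  const yx = leq(y, x)
  if (xy && yx) return "equal" // mutual subtyping, as in types.isEqualTo
  if (xy) return "strict-subtype"
  if (yx) return "strict-supertype"
  return "incomparable"
}

// Stand-in order for the demo: key inclusion between lists of component keys.
function keysLeq(a: string[], b: string[]): boolean {
  return a.every((k) => b.indexOf(k) !== -1)
}

const nameOnly = ["http://schema.org/name"]
const emailOnly = ["http://schema.org/email"]
const nameAndEmail = ["http://schema.org/name", "http://schema.org/email"]

console.log(relate(keysLeq, nameOnly, nameAndEmail)) // strict-subtype
console.log(relate(keysLeq, nameAndEmail, nameOnly)) // strict-supertype
console.log(relate(keysLeq, nameOnly, nameOnly)) // equal
console.log(relate(keysLeq, nameOnly, emailOnly)) // incomparable
```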
Lastly, the types namespace also has methods for computing the greatest common subtype and least common supertype of types with respect to the subtype relation. These are more formally known as the infimum and supremum, respectively.
The greatest common subtype of types X and Y is a maximal type Z such that Z is a subtype of both X and Y. Conversely, the least common supertype of types X and Y is a minimal type Z such that X and Y are both subtypes of Z.
declare namespace types {
function hasCommonBounds(x: Type, y: Type): boolean
function greatestCommonSubtype(x: Type, y: Type): Type
function leastCommonSupertype(x: Type, y: Type): Type
}
In general, the infima and suprema of arbitrary types X and Y are not guaranteed to exist. The method types.hasCommonBounds checks whether two types have an infimum and supremum (if they have one then they also have the other). types.greatestCommonSubtype and types.leastCommonSupertype will throw an error if called with types that do not have common bounds.
Intuitively, types.greatestCommonSubtype and types.leastCommonSupertype are two complementary ways of "merging" two types: the former discards extra product components and keeps extra coproduct options, while the latter keeps extra product components and discards extra coproduct options.
import { types } from "tasl"
types.greatestCommonSubtype(types.uri(), types.uri()) // { kind: "uri" }
types.leastCommonSupertype(types.uri(), types.uri()) // { kind: "uri" }
types.greatestCommonSubtype(
types.product({ "http://schema.org/name": types.string }),
types.product({ "http://schema.org/email": types.uri() })
) // { kind: 'product', components: {} }
types.leastCommonSupertype(
types.product({ "http://schema.org/name": types.string }),
types.product({ "http://schema.org/email": types.uri() })
)
// {
// kind: 'product',
// components: {
// 'http://schema.org/email': { kind: 'uri' },
// 'http://schema.org/name': {
// kind: 'literal',
// datatype: 'http://www.w3.org/2001/XMLSchema#string'
// }
// }
// }
types.greatestCommonSubtype(
types.coproduct({
"http://example.com/foo": types.unit,
"http://example.com/bar": types.unit,
}),
types.coproduct({
"http://example.com/foo": types.unit,
"http://example.com/baz": types.unit,
})
)
// {
// kind: 'coproduct',
// options: {
// 'http://example.com/baz': { kind: 'product', components: {} },
// 'http://example.com/foo': { kind: 'product', components: {} },
// 'http://example.com/bar': { kind: 'product', components: {} }
// }
// }
types.leastCommonSupertype(
types.coproduct({
"http://example.com/foo": types.unit,
"http://example.com/bar": types.unit,
}),
types.coproduct({
"http://example.com/foo": types.unit,
"http://example.com/baz": types.unit,
})
)
// {
// kind: 'coproduct',
// options: { 'http://example.com/foo': { kind: 'product', components: {} } }
// }
types.greatestCommonSubtype(types.string, types.boolean)
// Uncaught Error: cannot unify unequal literal types
types.greatestCommonSubtype(
types.product({ "http://schema.org/name": types.string }),
types.product({
"http://schema.org/name": types.product({
"http://schema.org/givenName": types.string,
"http://schema.org/familyName": types.string,
}),
})
)
// Uncaught Error: cannot unify types of different kinds
The operations types.greatestCommonSubtype and types.leastCommonSupertype are both associative and commutative. The relation types.hasCommonBounds is reflexive and symmetric, but not necessarily transitive.
If X ≤ Y then their greatest common subtype is X and their least common supertype is Y. The converse does not hold: there are many situations where types that are incomparable (neither X ≤ Y nor Y ≤ X) do have common bounds, so types.hasCommonBounds(x, y) is not equivalent to types.isSubtypeOf(x, y) || types.isSubtypeOf(y, x).
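The merging behaviour described in this section can be sketched as a single recursive function with a flag that selects which bound to compute. This is an illustrative reconstruction from the rules and examples above, not the library's code; commonBound, mergeEntries and the error messages are assumptions.

```typescript
// Local copies of the type shapes, so the sketch is self-contained.
type URI = { kind: "uri" }
type Literal = { kind: "literal"; datatype: string }
type Product = { kind: "product"; components: Record<string, Type> }
type Coproduct = { kind: "coproduct"; options: Record<string, Type> }
type Reference = { kind: "reference"; key: string }
type Type = URI | Literal | Product | Coproduct | Reference
type Bound = "infimum" | "supremum"

// Shared keys are always merged recursively; keys found on only one side
// are kept when keepExtra is true and dropped otherwise.
function mergeEntries(
  a: Record<string, Type>,
  b: Record<string, Type>,
  keepExtra: boolean,
  bound: Bound
): Record<string, Type> {
  const result: Record<string, Type> = {}
  Object.keys(a).forEach((k) => {
    if (k in b) result[k] = commonBound(a[k], b[k], bound)
    else if (keepExtra) result[k] = a[k]
  })
  if (keepExtra) {
    Object.keys(b).forEach((k) => {
      if (!(k in a)) result[k] = b[k]
    })
  }
  return result
}

function commonBound(x: Type, y: Type, bound: Bound): Type {
  if (x.kind === "uri" && y.kind === "uri") return x
  if (x.kind === "literal" && y.kind === "literal") {
    if (x.datatype !== y.datatype) throw new Error("unequal literal types")
    return x
  }
  if (x.kind === "reference" && y.kind === "reference") {
    if (x.key !== y.key) throw new Error("unequal reference types")
    return x
  }
  if (x.kind === "product" && y.kind === "product") {
    // The infimum drops extra components; the supremum keeps them.
    const components = mergeEntries(x.components, y.components, bound === "supremum", bound)
    return { kind: "product", components }
  }
  if (x.kind === "coproduct" && y.kind === "coproduct") {
    // The infimum keeps extra options; the supremum drops them.
    const options = mergeEntries(x.options, y.options, bound === "infimum", bound)
    return { kind: "coproduct", options }
  }
  throw new Error("types of different kinds")
}

// Both bounds fail under the same conditions, so checking one is enough.
function hasCommonBounds(x: Type, y: Type): boolean {
  try {
    commonBound(x, y, "infimum")
    return true
  } catch (err) {
    return false
  }
}

const str: Type = { kind: "literal", datatype: "http://www.w3.org/2001/XMLSchema#string" }
const bool: Type = { kind: "literal", datatype: "http://www.w3.org/2001/XMLSchema#boolean" }
const unit: Type = { kind: "product", components: {} }
const nameIsString: Type = { kind: "product", components: { "http://schema.org/name": str } }
const nameIsBoolean: Type = { kind: "product", components: { "http://schema.org/name": bool } }

// hasCommonBounds is not transitive: unit is compatible with both products,
// but the two products disagree on the type of their shared component.
console.log(hasCommonBounds(nameIsString, unit)) // true
console.log(hasCommonBounds(unit, nameIsBoolean)) // true
console.log(hasCommonBounds(nameIsString, nameIsBoolean)) // false
```

Computing both bounds with one function makes the duality visible: the only difference between them is which composite kind keeps its unshared entries.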