This repository has been archived by the owner on Jun 21, 2022. It is now read-only.

Array types

Jim Pivarski edited this page Oct 11, 2018 · 28 revisions

The following are high-level types, i.e. distinguishable to users:

  • Shapes and dtypes: arrays are functions from fixed-size integer domains to dtypes (or to other such functions), as defined by Numpy.
  • Jaggedness: variable-size integer domains.
  • Tables: enumerated string domains corresponding to column names, which behave like Numpy's recarray (though Tables need not be contiguous in memory or laid out as an array of structs, as recarrays are). (This is a product type.)
  • Unions: enumerated types in a tagged union, for polymorphism. (This is a sum type.)
  • Options: tagged union of value or the singleton numpy.ma.masked (configurable). (This is an idempotent sum type.)
  • Cross-references: explicit call-outs when the contents of one array are a cousin or ancestor in the tree.
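
Two of the high-level types above can be illustrated with plain Numpy. This is a minimal sketch, not this library's implementation: the `content`/`offsets` layout and the helper `jagged_getitem` are assumptions chosen for illustration, while `numpy.ma.masked` is the singleton named in the list.

```python
import numpy as np

# Jaggedness (illustrative layout): flat content plus offsets, where
# subarray i is content[offsets[i]:offsets[i + 1]].
content = np.array([1.1, 2.2, 3.3, 4.4, 5.5])
offsets = np.array([0, 3, 3, 5])  # three subarrays of lengths 3, 0 and 2

def jagged_getitem(content, offsets, i):
    """Hypothetical helper: return the i-th variable-length subarray."""
    return content[offsets[i]:offsets[i + 1]]

lengths = np.diff(offsets)                   # length of each subarray
first = jagged_getitem(content, offsets, 0)  # three elements
empty = jagged_getitem(content, offsets, 1)  # zero elements

# Option: a value or the singleton numpy.ma.masked.
# In Numpy's masked arrays, True in the mask marks an element as N/A.
optional = np.ma.masked_array([10, 20, 30], mask=[False, True, False])
is_missing = optional[1] is np.ma.masked
```

The domain of the outer array is fixed (three entries), but each entry maps to a domain of a different size, which is what distinguishes jaggedness from a regular Numpy shape.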

Not all array types correspond to distinct high-level types. Some exist to provide low-level features.

Any of the above can have Methods mixed in from dataless classes (like Java interfaces) to give them domain-specific behaviors, such as physics calculations.
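
The mix-in idea can be sketched with ordinary Python multiple inheritance. Every class name here (`MomentumMethods`, `ColumnTable`, `MomentumTable`) is hypothetical and stands in for whatever the library actually provides; the point is only that the mix-in holds no data of its own.

```python
import numpy as np

class MomentumMethods:
    """Dataless mix-in (hypothetical): adds behavior, stores nothing."""

    def pt(self):
        # Transverse momentum computed from columns the host class provides.
        return np.sqrt(self["px"] ** 2 + self["py"] ** 2)

class ColumnTable:
    """Hypothetical minimal table: a dict of equal-length Numpy columns."""

    def __init__(self, **columns):
        self._columns = {name: np.asarray(col) for name, col in columns.items()}

    def __getitem__(self, name):
        return self._columns[name]

class MomentumTable(MomentumMethods, ColumnTable):
    """The table's data plus the mix-in's domain-specific methods."""

particles = MomentumTable(px=[3.0, 6.0], py=[4.0, 8.0])
pt = particles.pt()
```

Because `MomentumMethods` defines no `__init__` and no attributes, it could be combined with any container that supports `self["px"]` and `self["py"]`.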

Only two array types are mutable in the sense of changing data (not metadata): Tables, which can add/remove columns, and AppendableArray, which can append/extend data at the end of the array. All types have mutable metadata.
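
The two kinds of data mutation can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the table is modeled as a plain dict of columns, and `AppendableSketch` is a hypothetical class that grows by allocating fixed-size Numpy chunks.

```python
import numpy as np

# Table-style mutation: columns can be added or removed.
table = {"x": np.array([1, 2, 3]), "y": np.array([1.1, 2.2, 3.3])}
table["z"] = table["x"] * 10  # add a column
del table["y"]                # remove a column

class AppendableSketch:
    """Hypothetical appendable array backed by fixed-size Numpy chunks."""

    def __init__(self, chunksize, dtype):
        self.chunksize = chunksize
        self.dtype = dtype
        self.chunks = []
        self.length = 0

    def append(self, value):
        pos = self.length % self.chunksize
        if pos == 0:
            # The last chunk is full (or none exists yet): allocate a new one.
            self.chunks.append(np.empty(self.chunksize, dtype=self.dtype))
        self.chunks[-1][pos] = value
        self.length += 1

    def to_numpy(self):
        if not self.chunks:
            return np.empty(0, dtype=self.dtype)
        # Drop the unused tail of the last chunk.
        return np.concatenate(self.chunks)[:self.length]

growing = AppendableSketch(chunksize=2, dtype=np.int64)
for value in range(5):
    growing.append(value)
```

Appending never moves existing data; only the final, partially filled chunk is written to, which is why growth at the end of the array is the one mutation that is cheap to support.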

| Array type | Purpose | Type |
|---|---|---|
| JaggedArray | representing an array of different-length subarrays, indexed as elements | [0, inf) → X |
| ByteJaggedArray | representing an array of different-length subarrays, indexed as bytes | [0, inf) → X |
| Table | representing a table of different-typed columns; mutable: columns can be added or removed | Table(X, Y) |
| UnionArray | tagging and indexing elements in arrays to simulate a heterogeneous array | Union(X, Y) |
| MaskedArray | byte-masking elements of an array as N/A | Option(X) |
| BitMaskedArray | bit-masking elements of an array as N/A | Option(X) |
| IndexedMaskedArray | indexing elements of an array as N/A or as a pointer (content is sparse) | Option(X) |
| ObjectArray | generating objects in an array upon access | dtype=object |
| IndexedArray | indexing elements of an array as a pointer by index position | X |
| ByteIndexedArray | indexing elements of an array as a pointer by byte position | X |
| SparseArray | representing an array by storing only the elements whose value is not a default value (such as zero); in a sense, the opposite of IndexedArray | X |
| ChunkedArray | logically concatenating discontiguous chunks into one big array; chunk sizes might be known or unknown; appendable | X |
| AppendableArray | allocating array chunks to append or extend a Numpy array; mutable: number of rows can grow; restricted: Numpy content only | X |
| VirtualArray | generating an array upon first access, then caching it | X |
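
The tagging and pointer-indexing ideas in the table can be sketched with plain Numpy. The names `tags`, `index`, `contents` and `pointers`, and the use of a negative pointer to mean N/A, are assumptions made for this illustration rather than a statement of the library's conventions.

```python
import numpy as np

# Union(X, Y): a tag picks the content array, an index picks the position in it.
ints = np.array([1, 2, 3])
floats = np.array([1.5, 2.5])
contents = [ints, floats]
tags = np.array([0, 1, 0, 0, 1])
index = np.array([0, 0, 1, 2, 1])

def union_getitem(i):
    """Hypothetical helper: element i of the simulated heterogeneous array."""
    return contents[tags[i]][index[i]].item()

union_values = [union_getitem(i) for i in range(len(tags))]

# Pointer-by-index-position: the logical array is content[pointers].
content = np.array([100, 200, 300])
indexed = content[np.array([2, 2, 0])]

# Option(X) with sparse content: a negative pointer is treated as N/A here
# (an assumption for this sketch), otherwise it points into the content.
sparse_content = np.array([100, 200])
pointers = np.array([0, -1, -1, 1])

def indexed_masked_getitem(i):
    p = pointers[i]
    return None if p < 0 else sparse_content[p].item()

option_values = [indexed_masked_getitem(i) for i in range(len(pointers))]
```

In the last case only two values are stored for a logical array of length four, which is the sense in which the content is sparse.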