This repository has been archived by the owner on Jun 21, 2022. It is now read-only.
Array types
Jim Pivarski edited this page Oct 11, 2018
The following are high-level types, i.e. distinguishable to users:
- Shapes and dtypes: arrays are functions from fixed-size integer domains to dtypes (or to other such functions), as defined by Numpy.
- Jaggedness: variable-size integer domains.
- Tables: enumerated string domains corresponding to column names, which behave like Numpy's recarray (though Tables need not be contiguous in memory or laid out as an array-of-structs, as recarrays are). (This is a product type.)
- Unions: enumerated types in a tagged union, for polymorphism. (This is a sum type.)
- Options: tagged union of a value or the singleton `numpy.ma.masked` (configurable). (This is an idempotent sum type.)
- Cross-references: explicit call-outs when the content of one array is a cousin or ancestor in the tree.
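Two of these high-level types can be illustrated in plain Numpy. This is only a sketch of the concepts, not this library's API: modelling jaggedness as flat content plus offsets is an assumption made for illustration, and `content`, `offsets`, and `jagged_get` are hypothetical names.

```python
import numpy as np

# Jaggedness: a variable-size domain, modelled here (as an assumption)
# by one flat content array plus offsets marking subarray boundaries.
content = np.array([1.1, 2.2, 3.3, 4.4, 5.5])
offsets = np.array([0, 3, 3, 5])  # subarray i is content[offsets[i]:offsets[i + 1]]

def jagged_get(i):
    """Return the i-th variable-length subarray (hypothetical helper)."""
    return content[offsets[i]:offsets[i + 1]]

lengths = [len(jagged_get(i)) for i in range(len(offsets) - 1)]

# Options: each element is either a value or the singleton numpy.ma.masked.
optional = np.ma.array([10, 20, 30], mask=[False, True, False])
second_is_missing = optional[1] is np.ma.masked
```

Here the second subarray is empty, which a fixed-shape Numpy array cannot express, and indexing the masked element yields the `numpy.ma.masked` singleton rather than a value.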
Not all array types correspond to distinct high-level types. Some exist to provide low-level features.
Any of the above can have Methods mixed in from dataless classes (like Java interfaces) to give them domain-specific behaviors, such as physics calculations.
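The mix-in idea can be sketched with ordinary Python inheritance. All class names below (`TransverseMomentumMethods`, `Columns`, `MomentumColumns`) are hypothetical stand-ins, not classes from this library; the point is only that the methods class holds no data of its own.

```python
import numpy as np

class TransverseMomentumMethods:
    """Dataless mix-in (hypothetical): adds behavior, stores nothing itself."""

    def pt(self):
        # Relies on the data-holding class to provide px and py.
        return np.sqrt(self.px ** 2 + self.py ** 2)

class Columns:
    """Stand-in for an array type that holds its data as named columns."""

    def __init__(self, **columns):
        for name, values in columns.items():
            setattr(self, name, np.asarray(values))

class MomentumColumns(TransverseMomentumMethods, Columns):
    """Data from Columns, physics behavior from the mix-in."""

momenta = MomentumColumns(px=[3.0, 5.0], py=[4.0, 12.0])
pt = momenta.pt()
```

Because the mix-in carries no state, the same methods class could in principle be combined with any array type that exposes the attributes it expects.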
Only two array types are mutable in the sense of changing data (not metadata): Tables, which can add/remove columns, and AppendableArray, which can append/extend data at the end of the array. All types have mutable metadata.
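Growing at the end by allocating chunks can be sketched as follows. This is an illustration under assumptions, not the actual AppendableArray implementation: the class name `ChunkAppender`, its fixed `chunksize`, and its methods are all hypothetical.

```python
import numpy as np

class ChunkAppender:
    """Hypothetical sketch: grows by allocating fixed-size Numpy chunks."""

    def __init__(self, chunksize, dtype):
        self.chunksize = chunksize
        self.dtype = dtype
        self.chunks = []
        self.length = 0

    def append(self, value):
        slot = self.length % self.chunksize
        if slot == 0:
            # Every existing chunk is full (or there are none): allocate another.
            self.chunks.append(np.empty(self.chunksize, dtype=self.dtype))
        self.chunks[-1][slot] = value
        self.length += 1

    def to_numpy(self):
        """Concatenate the chunks and drop the unused tail of the last one."""
        if not self.chunks:
            return np.empty(0, dtype=self.dtype)
        return np.concatenate(self.chunks)[: self.length]

appender = ChunkAppender(chunksize=2, dtype=np.int64)
for value in range(5):
    appender.append(value)
result = appender.to_numpy()
```

Existing chunks are never reallocated or copied on append, so earlier data stays in place while the number of rows grows.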
Array type | Purpose | High-level type |
---|---|---|
JaggedArray | representing an array of different-length subarrays, indexed as elements | [0, inf) → X |
ByteJaggedArray | representing an array of different-length subarrays, indexed as bytes | [0, inf) → X |
Table | representing a table of different-typed columns; mutable: columns can be added or removed | Table(X, Y) |
UnionArray | tagging and indexing elements in arrays to simulate a heterogeneous array | Union(X, Y) |
MaskedArray | byte-masking elements of an array as N/A | Option(X) |
BitMaskedArray | bit-masking elements of an array as N/A | Option(X) |
IndexedMaskedArray | indexes elements of an array as N/A or as a pointer (content is sparse) | Option(X) |
ObjectArray | generates objects in an array upon access | dtype=object |
IndexedArray | indexes elements of an array as a pointer by index position | X |
ByteIndexedArray | indexes elements of an array as a pointer by byte position | X |
SparseArray | represents an array storing only elements whose value is not a default value (such as zero); in a sense, the opposite of IndexedArray | X |
ChunkedArray | logically concatenates discontiguous chunks into one big array; chunk sizes might be known or unknown; appendable | X |
AppendableArray | allocates array chunks to append or extend a Numpy array; mutable: number of rows can grow; restricted: Numpy content only | X |
VirtualArray | generates an array upon first access, then caches | X |
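
Three of the purposes in this table can be sketched in plain Numpy. These are conceptual illustrations only: the names `tags`, `index`, `pointers`, `union_get`, and `LazyArray` are hypothetical and do not describe the constructors or attributes of the array types listed above.

```python
import numpy as np

# UnionArray idea: `tags` selects a content array, `index` selects a position in it.
contents = [np.array([1.5, 2.5]), np.array([10, 20, 30])]
tags = np.array([0, 1, 1, 0, 1])
index = np.array([0, 0, 1, 1, 2])

def union_get(i):
    """Look up element i of the simulated heterogeneous array."""
    return contents[tags[i]][index[i]]

union_values = [union_get(i).item() for i in range(len(tags))]

# IndexedArray idea: each element is a pointer, by position, into some content.
content = np.array([100, 200, 300, 400])
pointers = np.array([3, 0, 0, 2])
indexed = content[pointers]

# VirtualArray idea: generate the array on first access, then reuse the cache.
class LazyArray:
    def __init__(self, generate):
        self._generate = generate
        self._cache = None
        self.calls = 0  # counts how often the generator actually ran

    def materialize(self):
        if self._cache is None:
            self.calls += 1
            self._cache = self._generate()
        return self._cache

    def __getitem__(self, where):
        return self.materialize()[where]

lazy = LazyArray(lambda: np.arange(5) ** 2)
first = lazy[2]
again = lazy[4]
```

The union mixes floats and integers in one logical sequence, the indexed view repeats and reorders content without copying the source, and the lazy array runs its generator once no matter how many elements are read.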