Commit 6cb37bf

rly and hrshdhgd authored
Add initial docs (#3)
Co-authored-by: Harshad Hegde <[email protected]>
1 parent 46b66a1 commit 6cb37bf

7 files changed: +714 -14 lines changed

Diff for: .github/workflows/deploy-docs.yml (+6 -2)

@@ -2,6 +2,10 @@ name: Auto-deployment of Documentation
 on:
   push:
     branches: [ main ]
+  # pull_request:
+  #   branches: [ main ]
+  workflow_dispatch:
+
 jobs:
   build-docs:
     runs-on: ubuntu-latest
@@ -31,7 +35,7 @@ jobs:
         cd docs/
         poetry run sphinx-apidoc -o . ../src/linkml_arrays/ --ext-autodoc -f
         poetry run sphinx-build -b html . _build
-        cp -r _build/html/* ../gh-pages/
+        cp -r _build/* ../gh-pages/

     - name: Deploy documentation.
       if: ${{ github.event_name == 'push' }}
@@ -40,4 +44,4 @@ jobs:
         branch: gh-pages
         force: true
         folder: gh-pages
-        token: ${{ secrets.GH_TOKEN }}
+        token: ${{ secrets.GH_TOKEN }}

Diff for: docs/conf.py (+2 -2)

@@ -23,8 +23,8 @@
     "sphinx.ext.githubpages",
     "sphinx_rtd_theme",
     "sphinx_click",
-    "sphinx_autodoc_typehints",
-    "myst_parser",
+    # "sphinx_autodoc_typehints",
+    "myst_parser"
 ]

 # generate autosummary pages

Diff for: docs/examples.rst (+251, new file)

.. _examples:

-------------
Example Usage
-------------

Given a LinkML schema such as the following:
https://github.com/linkml/linkml-arrays/blob/main/tests/input/temperature_dataset.yaml

We can generate Pydantic classes for the schema:
https://github.com/linkml/linkml-arrays/blob/main/tests/test_dumpers/array_classes.py

We can then create instances of these classes to represent data:

.. code:: python

    import numpy as np
    from tests.test_dumpers.array_classes import (
        LatitudeSeries, LongitudeSeries, DaySeries,
        TemperatureMatrix, TemperatureDataset
    )

    latitude_in_deg = LatitudeSeries(values=np.array([1, 2, 3]))
    longitude_in_deg = LongitudeSeries(values=np.array([4, 5, 6]))
    time_in_d = DaySeries(values=np.array([7, 8, 9]))
    temperatures_in_K = TemperatureMatrix(
        values=np.ones((3, 3, 3)),
    )
    temperature = TemperatureDataset(
        name="my_temperature",
        latitude_in_deg=latitude_in_deg,
        longitude_in_deg=longitude_in_deg,
        time_in_d=time_in_d,
        temperatures_in_K=temperatures_in_K,
    )
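Before serializing, it can be worth checking that the coordinate series and the temperature array agree on shape. The snippet below is an illustrative sketch using plain NumPy with the same example values as above; it is not part of the linkml-arrays API:

```python
import numpy as np

# Same example values as above; names here are local to this sketch.
latitude = np.array([1, 2, 3])
longitude = np.array([4, 5, 6])
days = np.array([7, 8, 9])
temperatures = np.ones((3, 3, 3))

# Each coordinate series must be as long as the corresponding axis
# of the 3-D temperature array.
assert temperatures.shape == (latitude.size, longitude.size, days.size)
print(temperatures.shape)  # (3, 3, 3)
```

A mismatch here would surface later as an invalid data object, so catching it before dumping saves a round trip.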
^^^^^^^^^^^^^
Serialization
^^^^^^^^^^^^^

We currently have four options for serializing (dumping) these arrays to disk:

1. a YAML file for the non-array data and a NumPy file for each of the arrays
2. a YAML file for the non-array data and an HDF5 file with a single dataset for each of the arrays
3. a single HDF5 file with a hierarchical structure that mirrors the structure of the data object and contains
   non-array data as attributes and array data as datasets
4. a single Zarr (v2) directory store with a hierarchical structure that mirrors the structure of the data object and
   contains non-array data as attributes and array data as arrays

For all dumpers, first get a ``SchemaView`` object for the LinkML schema:

.. code:: python

    from pathlib import Path

    from linkml_runtime import SchemaView

    schema_path = Path("temperature_dataset.yaml")
    schemaview = SchemaView(schema_path)

Then use a dumper to serialize the ``TemperatureDataset`` data object that we created above.

YAML + NumPy dumper:

.. code:: python

    from linkml_arrays.dumpers import YamlNumpyDumper

    YamlNumpyDumper().dumps(temperature, schemaview=schemaview)

Output YAML file with references to the NumPy files for each array:

.. code:: yaml

    latitude_in_deg:
      values: file:./my_temperature.LatitudeSeries.values.npy
    longitude_in_deg:
      values: file:./my_temperature.LongitudeSeries.values.npy
    name: my_temperature
    temperatures_in_K:
      values: file:./my_temperature.TemperatureMatrix.values.npy
    time_in_d:
      values: file:./my_temperature.DaySeries.values.npy
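Each ``file:./*.npy`` reference in the YAML above points to an ordinary NumPy binary file, so individual arrays can be written and read back independently of the linkml-arrays loaders. A minimal round-trip sketch (the file name mirrors the reference above; a temporary directory keeps it self-contained):

```python
import tempfile
from pathlib import Path

import numpy as np

# Save one array under the same naming pattern the YAML references use,
# then load it back and compare.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "my_temperature.LatitudeSeries.values.npy"
    np.save(path, np.array([1, 2, 3]))
    values = np.load(path)

print(values.tolist())  # [1, 2, 3]
```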
YAML + HDF5 dumper:

.. code:: python

    from linkml_arrays.dumpers import YamlHdf5Dumper

    YamlHdf5Dumper().dumps(temperature, schemaview=schemaview)

Output YAML file with references to the HDF5 files for each array:

.. code:: yaml

    latitude_in_deg:
      values: file:./my_temperature.LatitudeSeries.values.h5
    longitude_in_deg:
      values: file:./my_temperature.LongitudeSeries.values.h5
    name: my_temperature
    temperatures_in_K:
      values: file:./my_temperature.TemperatureMatrix.values.h5
    time_in_d:
      values: file:./my_temperature.DaySeries.values.h5

HDF5 dumper:

.. code:: python

    from linkml_arrays.dumpers import Hdf5Dumper

    Hdf5Dumper().dumps(temperature, schemaview=schemaview)

The ``h5dump`` output of the resulting HDF5 file:

.. code::

    HDF5 "my_temperature.h5" {
    GROUP "/" {
       ATTRIBUTE "name" {
          DATATYPE H5T_STRING {
             STRSIZE H5T_VARIABLE;
             STRPAD H5T_STR_NULLTERM;
             CSET H5T_CSET_UTF8;
             CTYPE H5T_C_S1;
          }
          DATASPACE SCALAR
          DATA {
          (0): "my_temperature"
          }
       }
       GROUP "latitude_in_deg" {
          DATASET "values" {
             DATATYPE H5T_STD_I64LE
             DATASPACE SIMPLE { ( 3 ) / ( 3 ) }
             DATA {
             (0): 1, 2, 3
             }
          }
       }
       GROUP "longitude_in_deg" {
          DATASET "values" {
             DATATYPE H5T_STD_I64LE
             DATASPACE SIMPLE { ( 3 ) / ( 3 ) }
             DATA {
             (0): 4, 5, 6
             }
          }
       }
       GROUP "temperatures_in_K" {
          DATASET "values" {
             DATATYPE H5T_IEEE_F64LE
             DATASPACE SIMPLE { ( 3, 3, 3 ) / ( 3, 3, 3 ) }
             DATA {
             (0,0,0): 1, 1, 1,
             (0,1,0): 1, 1, 1,
             (0,2,0): 1, 1, 1,
             (1,0,0): 1, 1, 1,
             (1,1,0): 1, 1, 1,
             (1,2,0): 1, 1, 1,
             (2,0,0): 1, 1, 1,
             (2,1,0): 1, 1, 1,
             (2,2,0): 1, 1, 1
             }
          }
       }
       GROUP "time_in_d" {
          DATASET "values" {
             DATATYPE H5T_STD_I64LE
             DATASPACE SIMPLE { ( 3 ) / ( 3 ) }
             DATA {
             (0): 7, 8, 9
             }
          }
       }
    }
    }
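The same hierarchy can be reproduced and inspected directly with ``h5py`` (a sketch assuming ``h5py`` and NumPy are installed; linkml-arrays writes the file for you, and this only illustrates the layout shown by ``h5dump``):

```python
import tempfile
from pathlib import Path

import h5py
import numpy as np

# Mirror the layout above: non-array data as attributes on the root group,
# each array as a "values" dataset inside its own group.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "my_temperature.h5"
    with h5py.File(path, "w") as f:
        f.attrs["name"] = "my_temperature"
        f.create_dataset("latitude_in_deg/values", data=np.array([1, 2, 3]))
        f.create_dataset("temperatures_in_K/values", data=np.ones((3, 3, 3)))
    with h5py.File(path, "r") as f:
        name = f.attrs["name"]
        lat = f["latitude_in_deg/values"][:]

print(name, lat.tolist())  # my_temperature [1, 2, 3]
```

Passing a nested path such as ``"latitude_in_deg/values"`` to ``create_dataset`` creates the intermediate group automatically, which is what produces the group-per-slot structure in the dump above.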
Zarr dumper:

.. code:: python

    from linkml_arrays.dumpers import ZarrDumper

    ZarrDumper().dumps(temperature, schemaview=schemaview)

The ``tree`` output of the resulting Zarr directory store:

.. code::

    my_temperature.zarr
    ├── .zattrs
    ├── .zgroup
    ├── latitude_in_deg
    │   ├── .zgroup
    │   └── values
    │       ├── .zarray
    │       └── 0
    ├── longitude_in_deg
    │   ├── .zgroup
    │   └── values
    │       ├── .zarray
    │       └── 0
    ├── temperatures_in_K
    │   ├── .zgroup
    │   └── values
    │       ├── .zarray
    │       └── 0.0.0
    └── time_in_d
        ├── .zgroup
        └── values
            ├── .zarray
            └── 0
^^^^^^^^^^^^^^^
Deserialization
^^^^^^^^^^^^^^^

For deserializing (loading) the data, we can use the corresponding loader for each dumper:

YAML + NumPy loader:

.. code:: python

    from hbreader import hbread
    from linkml_arrays.loaders import YamlNumpyLoader

    read_yaml = hbread("my_temperature_yaml_numpy.yaml")
    read_temperature = YamlNumpyLoader().loads(read_yaml, target_class=TemperatureDataset, schemaview=schemaview)

YAML + HDF5 loader:

.. code:: python

    from hbreader import hbread
    from linkml_arrays.loaders import YamlHdf5Loader

    read_yaml = hbread("my_temperature_yaml_hdf5.yaml")
    read_temperature = YamlHdf5Loader().loads(read_yaml, target_class=TemperatureDataset, schemaview=schemaview)

HDF5 loader:

.. code:: python

    from linkml_arrays.loaders import Hdf5Loader

    read_temperature = Hdf5Loader().loads("my_temperature.h5", target_class=TemperatureDataset, schemaview=schemaview)

Zarr loader:

.. code:: python

    from linkml_arrays.loaders import ZarrLoader

    read_temperature = ZarrLoader().loads("my_temperature.zarr", target_class=TemperatureDataset, schemaview=schemaview)

Diff for: docs/index.rst (+4 -1)

@@ -6,11 +6,14 @@
 Welcome to linkml-arrays's documentation!
 =========================================================

+linkml-arrays is a Python package for working with arrays in LinkML models.
+
 .. toctree::
    :maxdepth: 2
    :caption: Contents:

-   modules
+   examples

 Indices and tables
 ==================