* Add JSONMatcher class
* Embed JSONMatcher into the backends
* First attempt: Dataset-specific configuration
* Seems to work
* Adapt Coretest to new output of myPath()
* Better error messages and some documentation inside example
* Adapt constructors for installation without ADIOS2/HDF5
* CI fixes
* Basic implementation
* Update documentation and tests
* Add a JSON translation of the config for NVHPC compilers
might also be good for documentation purposes as JSON is more widely
known
* Use dataset-specific config in tests
* Fix: do_prune parameter for merge()
* Rename merge() -> merge_internal()
Having the same name as the public function provoked errors due to
conversion from nlohmann::json types.
* Don't compute the matchers for all backends
* Add default block to test configs
* Documentation
* Add TOML example
* Add Python binding for openPMD_path
* Fix doxygen
* Read dataset-specific configuration also in ADIOS2::openDataset
* Cleanup
* Fix initialization from Dummy IO Handler
* Fix Doxygen
* Documentation Update
* Fix NVCOMPILER macro in example
---------
Co-authored-by: Axel Huebl <[email protected]>
docs/source/details/backendconfig.rst
@@ -287,3 +287,78 @@ Explanation of the single keys:
287
287
In "template" mode, only the dataset metadata (type, extent and attributes) are stored and no chunks can be written or read (i.e. write/read operations will be skipped).
288
288
* ``json.attribute.mode`` / ``toml.attribute.mode``: One of ``"long"`` (default in openPMD 1.*) or ``"short"`` (default in openPMD 2.* and generally in TOML).
289
289
The long format explicitly encodes the attribute type in the dataset on disk; the short format only writes the actual attribute as a JSON/TOML value, requiring readers to recover the type.
290
+
291
+
Dataset-specific configuration
292
+
------------------------------
293
+
294
+
Sometimes it is beneficial to set configuration options for specific datasets.
295
+
Most dataset-specific configuration options supported by the openPMD-api are additionally backend-specific, being format-specific serialization instructions such as compression or chunking.
296
+
297
+
All dataset-specific and backend-specific configuration is specified under the key path ``<backend>.dataset``.
298
+
Without filtering by dataset name (see the ``select`` key below) this looks like:
299
+
300
+
.. code-block:: json

  {
    "adios2": {
      "dataset": {
        "operators": []
      }
    },
    "hdf5": {
      "dataset": {
        "chunking": "auto"
      }
    }
  }
314
+
315
+
Dataset-specific configuration options can be configured in multiple ways:
316
+
317
+
As part of the general JSON/TOML configuration
318
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
319
+
320
+
In the simplest case, the dataset configuration is given without any extra steps as part of the JSON/TOML configuration passed to the ``Series`` constructor. This sets the default configuration for all datasets and does not allow specifying different configurations per dataset.
321
+
322
+
As a separate JSON/TOML configuration during dataset initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Similarly to the ``Series`` constructor, the ``Dataset`` constructor optionally receives a JSON/TOML configuration that applies only to the datasets initialized with this ``Dataset`` specification. It overrides the default given in the ``Series`` constructor.
326
+
327
+
This is the preferred way for configuring dataset-specific options that are *not* backend-specific (currently only ``{"resizable": true}``).
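
As a sketch, a configuration passed to the ``Dataset`` constructor might combine the backend-independent ``resizable`` option with a backend-specific block. The keys are the ones named in this section; combining them in a single configuration is an illustrative assumption:

.. code-block:: json

  {
    "resizable": true,
    "hdf5": {
      "dataset": {
        "chunking": "auto"
      }
    }
  }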
328
+
329
+
By pattern-matching the dataset names
330
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
331
+
332
+
The above approach has the disadvantage that it must be supported explicitly by the downstream application, e.g. a simulation or data reader. As an alternative, the backend-specific dataset configuration under ``<backend>.dataset`` can also be given as a list of alternatives that are matched against the dataset name in sequence, e.g. ``hdf5.dataset = [<pattern_1>, <pattern_2>, ...]``.
333
+
334
+
Each such pattern ``<pattern_i>`` is a JSON object with key ``cfg`` and optional key ``select``: ``{"select": <regex>, "cfg": <cfg>}``.
335
+
336
+
Here, ``<regex>`` is a regex or a list of regexes using the egrep grammar as defined by the `C++ standard library <https://en.cppreference.com/w/cpp/regex/basic_regex/constants>`__.
337
+
``<cfg>`` is a configuration that will be forwarded as a "regular" dataset configuration to the backend.
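
A minimal sketch of such a list, assuming a single ADIOS2 pattern that applies an empty operator list to all meshes (the regex ``meshes/.*`` is chosen for illustration only):

.. code-block:: json

  {
    "adios2": {
      "dataset": [
        {
          "select": "meshes/.*",
          "cfg": {
            "operators": []
          }
        }
      ]
    }
  }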
338
+
339
+
.. note::
340
+
341
+
To match lists of regular expressions ``select = [REGEX_1, REGEX_2, ..., REGEX_n]``, the list is internally transformed into a single regular expression ``($^)|(REGEX_1)|(REGEX_2)|...|(REGEX_n)``.
342
+
343
+
In a configuration such as ``hdf5.dataset = [<pattern_1>, <pattern_2>, ...]``, the patterns are processed from top to bottom, and the first matching pattern in the list is selected.
344
+
The specified regexes will be matched against the openPMD dataset path either within the Iteration (e.g. ``meshes/E/x`` or ``particles/.*/position/.*``) or within the Series (e.g. ``/data/1/meshes/E/x`` or ``/data/.*/particles/.*/position/.*``), considering full matches only.
345
+
346
+
.. note::
347
+
348
+
The dataset name is determined by the result of ``attributable.myPath().openPMDPath()`` where ``attributable`` is an object in the openPMD hierarchy.
349
+
350
+
.. note::
351
+
352
+
To match against the path within the containing Iteration or within the containing Series, the specified regular expression is internally transformed into ``(/data/[0-9]+/)?(REGEX)`` where ``REGEX`` is the specified pattern, and then matched against the full dataset path.
353
+
354
+
The **default configuration** is specified by omitting the ``select`` key.
355
+
Specifying more than one default is an error.
356
+
If no pattern matches a dataset, the default configuration is chosen if specified, or an empty JSON object ``{}`` otherwise.
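
Putting these rules together, a hypothetical HDF5 configuration could look as follows. The dataset names and the value ``"none"`` for ``chunking`` are assumptions made for illustration:

.. code-block:: json

  {
    "hdf5": {
      "dataset": [
        {
          "select": ["meshes/E/x", "/data/.*/particles/.*/position/.*"],
          "cfg": {
            "chunking": "none"
          }
        },
        {
          "cfg": {
            "chunking": "auto"
          }
        }
      ]
    }
  }

Under the matching rules described above, ``/data/1/meshes/E/x`` would be matched by the first pattern, while ``/data/1/meshes/B/y`` matches no ``select`` entry and would receive the default configuration given by the second pattern.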