Auto File Management Part 1: Introducing a Datafile Management Tool #235

Open
wants to merge 17 commits into
base: main
24 changes: 11 additions & 13 deletions doc/source/Python.rst
@@ -151,6 +151,17 @@ To make sure everything is installed properly, you can try invoking pygrackle fr

 If this command executes without raising any errors, then you have successfully installed Pygrackle.

+Installing DataFiles
+++++++++++++++++++++
+
+To install the datafiles in a location where the Pygrackle examples (and tests) can find them automatically, we recommend invoking the following command (from any directory):
+
+.. code-block:: shell-session
+
+   $ python -m pygrackle fetch
+
+See :ref:`this section <manage-data-files>` for more details about customizing the location where data is stored and about managing datafiles in general.
+
 .. _pygrackle-dev:

 Installing Pygrackle Development Requirements
@@ -185,9 +196,6 @@ a parcel of gas at constant density or in a free-fall model. Each example
 will produce a figure as well as a dataset that can be loaded and analyzed
 with `yt <http://yt-project.org/>`__.

-Editable Install Requirement
-++++++++++++++++++++++++++++
-
 All of the example scripts discussed below use the following line to
 make a guess at where the Grackle input files are located.

@@ -200,16 +208,6 @@ make a guess at where the Grackle input files are located.

    from pygrackle.utilities.data_path import grackle_data_dir

-This currently **ONLY** works for an 'editable' Pygrackle installation
-(i.e., one installed with ``pip install -e .`` as directed
-above). In this case, it will be assumed that the data files can be
-found in a directory called ``input`` in the top level of the source
-repository.
-
-.. note::
-
-   `GitHub PR #235 <https://github.com/grackle-project/grackle/pull/235>`__ is a pending pull request that seeks to add functionality to make this work in a regular Pygrackle installation (i.e. a non-'editable' install).
-
 Cooling Rate Figure Example
 +++++++++++++++++++++++++++

75 changes: 75 additions & 0 deletions doc/source/Tools.rst
@@ -0,0 +1,75 @@

.. _manage-data-files:

Datafile Management
===================

We provide a command line tool to optionally manage Grackle's datafiles.

At a Quick Glance
-----------------

Currently, this command line tool is only accessible when :ref:`pygrackle is installed <install-pygrackle>`.
To run the tool, execute

.. code-block:: shell-session

   $ python -m pygrackle <args>...

where ``<args>...`` is replaced with one or more command-line arguments.
For example, ``fetch`` will invoke a subcommand that downloads all associated files (if they aren't already downloaded).
You can use the ``--help`` option to get a list of all subcommands.
You can also pass the ``--help`` option after the name of a subcommand (e.g. you can use ``fetch --help``) to get more details about subcommand-specific options.

.. note::

   At the moment, this functionality is most useful for pygrackle.
   In the near future [#df1]_\ , it will be possible to install pygrackle without manually downloading the grackle repository.
   At that time, this will be the most efficient way to retrieve the files.
   The pygrackle examples and some of the pygrackle tests rely upon this functionality.
   However, you are free to completely ignore this functionality for your own purposes.

There is ongoing work to implement functionality for the Grackle C library to directly access the datafiles managed by this tool.
When these efforts are finished, we plan to additionally provide this command-line tool as a standalone program that is always installed alongside Grackle (so that you can access this functionality without installing pygrackle).

Description
-----------

.. include:: ../../src/python/pygrackle/utilities/grdata.py
   :start-after: [[[BEGIN-SECTION:DESCRIPTION]]]
   :end-before: [[[END-SECTION:DESCRIPTION]]]



Motivation
----------

.. include:: ../../src/python/pygrackle/utilities/grdata.py
   :start-after: [[[BEGIN-SECTION:MOTIVATION]]]
   :end-before: [[[END-SECTION:MOTIVATION]]]


How it works
------------

.. include:: ../../src/python/pygrackle/utilities/grdata.py
   :start-after: [[[BEGIN-SECTION:INTERNALS-OVERVIEW]]]
   :end-before: [[[END-SECTION:INTERNALS-OVERVIEW]]]


Sample Directory Structure
++++++++++++++++++++++++++

Below, we sketch what the directory structure might look like:


.. literalinclude:: ../../src/python/pygrackle/utilities/grdata.py
   :language: none
   :start-after: [[[BEGIN:DIRECTORY-CARTOON]]]
   :end-before: [[[END:DIRECTORY-CARTOON]]]


.. rubric:: Footnotes

.. [#df1] Once `GH-#208 <https://github.com/grackle-project/grackle/pull/208>`__ is merged, you will be able to instruct pip to install pygrackle by just specifying the URL of the GitHub repository.
   We also have plans to upload pygrackle to PyPI.
1 change: 1 addition & 0 deletions doc/source/index.rst
@@ -87,6 +87,7 @@ To help you start using Grackle, we provide:
    Reference.rst
    Versioning.rst
    Python.rst
+   Tools.rst
    Conduct.rst
    Contributing.rst
    Help.rst
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -51,7 +51,8 @@ dependencies = [
     'h5py',
     'numpy',
     'matplotlib',
-    'yt>=4.0.2'
+    'yt>=4.0.2',
+    "importlib_resources;python_version<'3.9'"
 ]

 [project.license]
12 changes: 12 additions & 0 deletions src/python/pygrackle/__main__.py
@@ -0,0 +1,12 @@
import sys

from .utilities.grdata import main as grdata_main
from .utilities.data_path import _make_config


def main(args=None):
    return grdata_main(_make_config(), prog_name="python -m pygrackle", args=args)


if __name__ == "__main__":
    sys.exit(main())
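
The entry point above follows a common pattern: ``main`` accepts an optional argument list and returns an exit code, so it can be called from tests as well as from ``python -m``. The sketch below illustrates that pattern with a toy command-line tool. Everything in it (``toytool``, ``_fetch``, the stub behavior) is hypothetical and is not the implementation of ``grdata.main``; only the ``fetch`` subcommand name is borrowed from the documentation in this pull request.

```python
import argparse


def _fetch(namespace):
    """Stub subcommand: a real tool would download any missing files here."""
    print("pretending to fetch data files")
    return 0


def main(args=None, prog_name="python -m toytool"):
    """Parse ``args`` (argparse falls back to sys.argv[1:] when it is None)
    and return an integer exit code."""
    parser = argparse.ArgumentParser(prog=prog_name)
    subparsers = parser.add_subparsers(dest="command", required=True)
    fetch = subparsers.add_parser("fetch", help="download any missing files")
    fetch.set_defaults(func=_fetch)
    namespace = parser.parse_args(args)
    return namespace.func(namespace)


# Calling main() with an explicit list avoids touching sys.argv.
exit_code = main(["fetch"])
```

Because ``main`` returns the exit code rather than calling ``sys.exit`` itself, the ``if __name__ == "__main__"`` guard is the only place where the interpreter is asked to exit.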
Empty file.
15 changes: 15 additions & 0 deletions src/python/pygrackle/file_registry/file_registry.txt
@@ -0,0 +1,15 @@
// This is a file registry generated by the grackle data management tool
// To overwrite this file with an updated copy (assuming that pygrackle is
// installed), you might invoke:
// python -m pygrackle --hash_name sha1 --output <outpath> <dir>
// in this sample command, you would substitute:
// -> ``<outpath>`` with a path to the output file
// -> ``<dir>`` with a path to the directory containing all files that are
// to be included in the registry
{"CloudyData_UVB=FG2011.h5", "sha1:5b3423fb5cb96d6f8fae65655e204f1f82a276fa"},
{"CloudyData_UVB=FG2011_shielded.h5", "sha1:60d13b4632f074fcb295f7adea85843046c0d4ef"},
{"CloudyData_UVB=HM2012.h5", "sha1:3ae95f71926aa9543964fbd41c5e53a42345c19c"},
{"CloudyData_UVB=HM2012_high_density.h5", "sha1:6db93abf8cb818975e8d751776328c5dab44d4ee"},
{"CloudyData_UVB=HM2012_shielded.h5", "sha1:16cab5b5bd0bf5ef87db717dd5e8901be11812c2"},
{"CloudyData_noUVB.h5", "sha1:55fed7c4bfd10e35d60660ca1adc5ceb411befb2"},
{"cloudy_metals_2008_3D.h5", "sha1:ade563216d1102e8befab822cbb60c418b130aa1"}
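
To make the registry format concrete, here is a minimal sketch of how such a file could be parsed and how a downloaded file could be checked against its recorded checksum. The helper names (``parse_registry``, ``matches_checksum``) are hypothetical and this is not the parser used by ``grdata.py``; it only assumes the line layout visible above, with ``//`` comments and ``{"name", "alg:hexdigest"}`` entries.

```python
import hashlib
import re

# Hypothetical pattern for one registry entry, e.g.
#   {"CloudyData_noUVB.h5", "sha1:55fe..."},
_ENTRY = re.compile(
    r'^\{"(?P<name>[^"]+)",\s*"(?P<alg>[^":]+):(?P<digest>[0-9a-f]+)"\},?$'
)


def parse_registry(text):
    """Map each file name to an (algorithm, hex digest) pair."""
    registry = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and // comments
        match = _ENTRY.match(line)
        if match is None:
            raise ValueError(f"unrecognized registry line: {line!r}")
        registry[match["name"]] = (match["alg"], match["digest"])
    return registry


def matches_checksum(data, alg, digest):
    """Return True if the bytes in ``data`` hash to ``digest`` under ``alg``."""
    return hashlib.new(alg, data).hexdigest() == digest


sample = """\
// comment lines are ignored
{"CloudyData_noUVB.h5", "sha1:55fed7c4bfd10e35d60660ca1adc5ceb411befb2"},
{"cloudy_metals_2008_3D.h5", "sha1:ade563216d1102e8befab822cbb60c418b130aa1"}
"""
registry = parse_registry(sample)
```

Storing the algorithm name next to the digest, as the registry does, means a checker like this one does not need to hard-code ``sha1``.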
51 changes: 43 additions & 8 deletions src/python/pygrackle/utilities/data_path.py
@@ -11,17 +11,52 @@
 # software.
 ########################################################################

+import io
 import os
+import sys

 from pygrackle.__config__ import _is_editable_installation
+from pygrackle.grackle_wrapper import get_grackle_version
+from pygrackle.utilities.grdata import (
+    fetch_all, make_config_object, _datastoredir_and_versiondir
+)
 from pygrackle.utilities.misc import dirname

-if _is_editable_installation():
-    # Note, this only works with an editable install of pygrackle.
-    _install_dir = dirname(os.path.abspath(__file__), level=5)
-    grackle_data_dir = os.path.join(_install_dir, "input")
-else:
-    raise RuntimeError(
-        "in non-editable pygrackle installations, like this one, "
-        f"grackle_data_dir cannot be imported from {__file__}."
+
+def _get_file_registry_contents():
+    if _is_editable_installation():
+        fname = os.path.join(
+            dirname(os.path.abspath(__file__), 2), "file_registry", "file_registry.txt"
         )
+        if not os.path.isfile(fname):
+            raise RuntimeError(
+                "could not find the file_registry.txt in an editable install."
+            )
+        return fname
+
+    if (sys.version_info.major, sys.version_info.minor) < (3, 9):
+        import importlib_resources as resources
+    else:
+        from importlib import resources
+    ref = resources.files("pygrackle.file_registry") / "file_registry.txt"
+
+    contents = ref.read_text(encoding="utf-8")
+    return io.StringIO(contents)
+
+
+def _make_config(grackle_version=None):
+    if grackle_version is None:
+        grackle_version = get_grackle_version()["version"]
+    return make_config_object(
+        grackle_version=grackle_version,
+        file_registry_file=_get_file_registry_contents(),
+    )
+
+
+_CONFIG = _make_config()
+grackle_data_dir = _datastoredir_and_versiondir(_CONFIG)[1]
+
+
+def _download_all_datafiles():
+    """Download all datafiles if they haven't been downloaded already."""
+    return fetch_all(_CONFIG)
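
The non-editable branch of ``_get_file_registry_contents`` reads a file shipped inside the installed package through ``importlib.resources``, falling back to the ``importlib_resources`` backport on Python older than 3.9. The following sketch isolates that technique. The helper name ``open_packaged_text`` is hypothetical, and the demonstration reads from the standard-library ``json`` package purely because pygrackle may not be installed where this is run.

```python
import io
import sys

if sys.version_info < (3, 9):
    # Backport; assumed to be installed on older interpreters.
    import importlib_resources as resources
else:
    from importlib import resources


def open_packaged_text(package, filename):
    """Return a file-like object holding a text resource shipped in ``package``."""
    ref = resources.files(package) / filename
    return io.StringIO(ref.read_text(encoding="utf-8"))


# Demonstrated on a standard-library package instead of pygrackle.
stream = open_packaged_text("json", "__init__.py")
first_line = stream.readline()
```

Wrapping the text in ``io.StringIO`` gives callers a file-like object even when the package is not stored as ordinary files on disk, which is presumably why the pull request returns one in the non-editable case.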