Add a UAT registry constraint #649

Open. Wants to merge 4 commits into base: main. Changes from 2 commits.

4 changes: 4 additions & 0 deletions CHANGES.rst
Expand Up @@ -10,6 +10,10 @@ Enhancements and Fixes

- Change AsyncTAPJob.result to return None if no result is found explicitly [#644]

- Add a UAT constraint to the registry interface for constraining
subjects [#649]



Deprecations and Removals
-------------------------
Expand Down
8 changes: 4 additions & 4 deletions docs/dal/index.rst
Expand Up @@ -15,7 +15,7 @@ metadata.
.. doctest-remote-data::

>>> import pyvo as vo
>>> service = vo.dal.SIAService("http://dc.zah.uni-heidelberg.de/lswscans/res/positions/siap/siap.xml")
>>> service = vo.dal.SIAService("http://dc.g-vo.org/lswscans/res/positions/siap/siap.xml")
>>> print(service.description)
Scans of plates kept at Landessternwarte Heidelberg-Königstuhl. They
were obtained at location, at the German-Spanish Astronomical Center
Expand Down Expand Up @@ -473,7 +473,7 @@ Basic queries are done with the ``pos`` and ``size`` parameters described in

>>> pos = SkyCoord.from_name('Eta Carina')
>>> size = Quantity(0.5, unit="deg")
>>> sia_service = vo.dal.SIAService("http://dc.zah.uni-heidelberg.de/hppunion/q/im/siap.xml")
>>> sia_service = vo.dal.SIAService("http://dc.g-vo.org/hppunion/q/im/siap.xml")
>>> sia_results = sia_service.search(pos=pos, size=size)

The dataset format, 'all' by default, can be specified:
Expand Down Expand Up @@ -565,7 +565,7 @@ within a circular region on the sky defined by the parameters ``pos``

.. doctest-remote-data::

>>> scs_srv = vo.dal.SCSService('http://dc.zah.uni-heidelberg.de/arihip/q/cone/scs.xml')
>>> scs_srv = vo.dal.SCSService('http://dc.g-vo.org/arihip/q/cone/scs.xml')
>>> scs_results = scs_srv.search(pos=pos, radius=size)

This service exposes the :ref:`verbosity <pyvo-verbosity>` parameter.
Expand Down Expand Up @@ -754,7 +754,7 @@ If the row contains datasets, they are exposed by several retrieval methods:
.. doctest-skip::

>>> row.getdataurl()
'http://dc.zah.uni-heidelberg.de/getproduct/califa/datadr3/V500/NGC0551.V500.rscube.fits'
'http://dc.g-vo.org/getproduct/califa/datadr3/V500/NGC0551.V500.rscube.fits'
>>> type(row.getdataset())
<class 'urllib3.response.HTTPResponse'>

Expand Down
24 changes: 20 additions & 4 deletions docs/registry/index.rst
Expand Up @@ -54,6 +54,8 @@ keyword arguments. The following constraints are available:
* :py:class:`~pyvo.registry.UCD` (``ucd``): constrain by one or more UCD
patterns; resources match when they serve columns having a matching
UCD (e.g., ``phot.mag;em.ir.%`` for “any infrared magnitude”).
* :py:class:`~pyvo.registry.UAT` (``uat``): constrain by concepts
from the IVOA Unified Astronomy Thesaurus http://www.ivoa.net/rdf/uat.
* :py:class:`~pyvo.registry.Waveband` (``waveband``): one or more terms
from the vocabulary at http://www.ivoa.net/rdf/messenger giving the rough
spectral location of the resource.
Expand Down Expand Up @@ -97,9 +99,22 @@ or:
... registry.Waveband("UV"))

or a mixture between the two. Constructing using explicit
constraints is generally preferable with more complex queries. Where
the constraints accept multiple arguments, you can pass in sequences to
the keyword arguments; for instance:
constraints is generally preferable with more complex queries.
An advantage of using explicit constraints is that you can pass
additional parameters to the constraints. For instance, the UAT
constraint can optionally expand your keyword to narrower or wider
concepts. When looking for resources talking about Cepheids of all
kinds, you can thus say:

.. doctest-remote-data::

>>> resources = registry.search(
... registry.UAT("cepheid-variable-stars", expand_down=3))

Review comment (Member):

Somehow this throws an error, even though cepheid-variable-stars is in the list on the referenced website:

_____________________________ [doctest] index.rst ______________________________
102 constraints is generally preferable with more complex queries.
103 An advantage of using explicit constraints is that you can pass
104 additional parameters to the constraints.  For instance, the UAT
105 constraint can optionally expand your keyword to narrower or wider
106 concepts.  When looking for resources talking about Cepheids of all
107 kinds, you can thus say:
108 
109 .. doctest-remote-data::
110 
111   >>> resources = registry.search(
UNEXPECTED EXCEPTION: DALQueryError: cepheid-variable-stars does not identify an IVOA uat concept (see http://www.ivoa.net/rdf/uat).
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/doctest.py", line 1355, in __run
    exec(compile(example.source, filename, "single",
  File "<doctest index.rst[4]>", line 2, in <module>
  File "/home/runner/work/pyvo/pyvo/.tox/py311-test-alldeps-online/lib/python3.11/site-packages/pyvo/registry/rtcons.py", line 713, in __init__
    raise dalq.DALQueryError(
pyvo.dal.exceptions.DALQueryError: cepheid-variable-stars does not identify an IVOA uat concept (see http://www.ivoa.net/rdf/uat).
/home/runner/work/pyvo/pyvo/docs/registry/index.rst:111: UnexpectedException
=========================== short test summary info ============================
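
One way to narrow this down is to check the identifier against the vocabulary copy the test run actually sees, rather than against the website. This is only a sketch: it assumes the two may differ (which is a guess, not a diagnosis), and `suggest_uat_terms` and `toy_terms` are hypothetical names, not part of pyvo. In a real session the mapping would come from `pyvo.utils.vocabularies.get_vocabulary("uat")["terms"]`, the same lookup the constraint uses.

```python
import difflib


def suggest_uat_terms(terms, keyword, n=3):
    """Return up to n identifiers from ``terms`` that resemble ``keyword``.

    ``terms`` is assumed to be a mapping keyed by UAT identifiers, such as
    the "terms" member of the vocabulary the constraint consults.
    """
    if keyword in terms:
        # The identifier is known; nothing to suggest.
        return [keyword]
    # Otherwise offer the closest spellings, best match first.
    return difflib.get_close_matches(keyword, list(terms), n=n, cutoff=0.6)


# Toy stand-in for the vocabulary (illustrative content only).
toy_terms = {
    "cepheid-variable-stars": {},
    "variable-stars": {},
    "solar-physics": {},
}

exact = suggest_uat_terms(toy_terms, "cepheid-variable-stars")
near = suggest_uat_terms(toy_terms, "cepheid-variable-star")
```

If the identifier turns out to be missing from the loaded mapping while present on the website, the local copy of the vocabulary would be the next thing to look at.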


There is no way to express this using keyword arguments.

However, where the constraints accept multiple equivalent arguments, you
can pass in sequences to the keyword arguments; for instance:

.. doctest-remote-data::

Expand All @@ -113,6 +128,7 @@ is equivalent to:
>>> resources = registry.search(waveband=["Radio", "Millimeter"],
... author='%Miller%')


There is also :py:meth:`~pyvo.registry.get_RegTAP_query`, accepting the
same arguments as :py:meth:`pyvo.registry.search`. This function simply
returns the ADQL query that search would execute. This may be useful
Expand Down Expand Up @@ -154,7 +170,7 @@ interactive data discovery, however, it is usually preferable to use the

And to look for tap resources *in* a specific cone, you would do

.. doctest-remote-data::
.. doctest-remote-data:: # doctest: +IGNORE_OUTPUT

Review comment (Member):

This won't work: the `# doctest:` directives have to be added at the end of each affected line. But this example already has them.
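
To illustrate the placement, here is a minimal sketch using only the standard library's `doctest` module. `+ELLIPSIS` stands in for `+IGNORE_OUTPUT`, on the assumption that the latter is supplied by the pytest-doctestplus plugin rather than by the standard library; `SAMPLE` and the other names are made up for the illustration.

```python
import doctest

# The directive sits at the end of the example line it applies to.
# It affects that example only; the second example is compared exactly.
SAMPLE = '''
>>> print(list(range(20)))  # doctest: +ELLIPSIS
[0, 1, ..., 19]
>>> print("exact")
exact
'''

parser = doctest.DocTestParser()
parsed = parser.get_doctest(SAMPLE, {}, "sample", None, 0)

# Per-example options as the parser understood them.
first_options = parsed.examples[0].options
second_options = parsed.examples[1].options

runner = doctest.DocTestRunner(verbose=False)
results = runner.run(parsed)
```

The parser attaches the option to the first example only, which is why every affected line needs its own directive.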


>>> from astropy.coordinates import SkyCoord
>>> registry.search(registry.Freetext("Wolf-Rayet"),
Expand Down
5 changes: 3 additions & 2 deletions pyvo/registry/__init__.py
Expand Up @@ -12,11 +12,12 @@

from .rtcons import (Constraint, SubqueriedConstraint,
Freetext, Author, Servicetype, Waveband, Datamodel, Ivoid,
UCD, Spatial, Spectral, Temporal, RegTAPFeatureMissing)
UCD, UAT, Spatial, Spectral, Temporal,
RegTAPFeatureMissing)

__all__ = ["search", "get_RegTAP_query", "Constraint", "SubqueriedConstraint",
"Freetext", "Author",
"Servicetype", "Waveband", "Datamodel", "Ivoid", "UCD",
"Spatial", "Spectral", "Temporal",
"UAT", "Spatial", "Spectral", "Temporal",
"choose_RegTAP_service", "RegTAPFeatureMissing",
"RegistryResults", "RegistryResource",]
87 changes: 87 additions & 0 deletions pyvo/registry/rtcons.py
Expand Up @@ -637,6 +637,93 @@ def __init__(self, *patterns):
for index, pattern in enumerate(patterns)}


class UAT(SubqueriedConstraint):
    """
    A constraint selecting resources having UAT keywords as subjects.

    The UAT (Unified Astronomy Thesaurus) is a hierarchical system
    of concepts in astronomy. In the VO, its concept identifiers
    are dashed strings, something like ``x-ray-transient-sources``.
    The full list of identifiers is available from
    http://www.ivoa.net/rdf/uat.

    Note that not all data providers properly use UAT keywords in their
    subjects even in 2025 (they should, though), and their keyword
    assignments may not always be optimal. Consider doing free
    text searches if UAT-based results are disappointing, and then
    telling the respective data providers about missing keywords.
    """
    _keyword = "uat"
    _subquery_table = "rr.res_subject"
    _uat = None

    @classmethod
    def _expand(cls, term, level, attribute):
        """
        Recursively expand term in the uat.

        This returns a set of concepts that are ``level`` levels wider
        or narrower (depending on the value of ``attribute``) than term.

        This function assumes the _uat class attribute has been filled
        before; that is the case once a constraint has been constructed.

        Parameters
        ----------

        term: str
            the start term
        level: int
            expand this many levels
        attribute: str
            either ``wider`` to expand towards more general concepts
            or ``narrower`` to expand toward more specialised concepts.
        """
        result = {term}
        new_concepts = cls._uat[term][attribute]
        if level:
            for concept in new_concepts:
                result |= cls._expand(concept, level-1, attribute)
        return result

    def __init__(self, uat_keyword, *, expand_up=0, expand_down=0):
        """

        Parameters
        ----------

        uat_keyword: str
            An identifier from http://www.ivoa.net/rdf/uat, i.e., a
            string like type-ib-supernovae. Note that these are
            always all-lowercase.
        expand_up: int
            In addition to the concept itself, also include expand_up
            levels of parent concepts (this probably rarely makes
            sense beyond 1).
        expand_down: int
            In addition to the concept itself, also include expand_down
            levels of more specialised concepts (this is usually a good
            idea; having more than 10 here for now is equivalent to
            infinity).
        """
        if self.__class__._uat is None:
            self.__class__._uat = vocabularies.get_vocabulary("uat")["terms"]

        if uat_keyword not in self._uat:
            raise dalq.DALQueryError(
                f"{uat_keyword} does not identify an IVOA uat"
                " concept (see http://www.ivoa.net/rdf/uat).")

        query_terms = {uat_keyword}
        if expand_up:
            query_terms |= self._expand(uat_keyword, expand_up, "wider")
        if expand_down:
            query_terms |= self._expand(uat_keyword, expand_down, "narrower")

        self._condition = "res_subject in ({})".format(
            ", ".join(make_sql_literal(s) for s in sorted(query_terms)))

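To make the effect of `expand_up` and `expand_down` concrete, the sketch below replays the recursion on a made-up three-term hierarchy and then builds the condition string in the same shape as the constructor does. The names `expand`, `sql_literal` and `toy_uat` are illustrative and not part of pyvo. In particular, `sql_literal` is only a simplified, strings-only stand-in for `make_sql_literal`, whose definition is not part of this diff.

```python
def expand(terms, term, level, attribute):
    """Collect ``term`` plus the concepts up to ``level`` steps away.

    Follows the same recursion as ``UAT._expand``, but takes the
    vocabulary as an argument instead of reading a class attribute.
    """
    result = {term}
    if level:
        for concept in terms[term][attribute]:
            result |= expand(terms, concept, level - 1, attribute)
    return result


def sql_literal(value):
    """Hypothetical stand-in for make_sql_literal; handles strings only."""
    return "'{}'".format(value.replace("'", "''"))


# Made-up hierarchy; the identifiers are for illustration only.
toy_uat = {
    "stars": {"wider": [], "narrower": ["variable-stars"]},
    "variable-stars": {"wider": ["stars"], "narrower": ["cepheids"]},
    "cepheids": {"wider": ["variable-stars"], "narrower": []},
}

one_down = expand(toy_uat, "stars", 1, "narrower")
two_down = expand(toy_uat, "stars", 2, "narrower")
one_up = expand(toy_uat, "cepheids", 1, "wider")

# Sorting keeps the generated condition stable between runs.
condition = "res_subject in ({})".format(
    ", ".join(sql_literal(s) for s in sorted(two_down)))
```

Each extra level adds one more ring of related concepts, so a small `expand_down` already widens the match considerably in a deep branch of the thesaurus.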

class Spatial(SubqueriedConstraint):
"""
A RegTAP constraint selecting resources covering a geometry in
Expand Down
43 changes: 43 additions & 0 deletions pyvo/registry/tests/commonfixtures.py
Expand Up @@ -3,6 +3,8 @@
Common fixtures for pyVO registry tests
"""

import os

import pytest

from astropy.utils.data import (
Expand All @@ -26,6 +28,47 @@ def messenger_vocabulary(mocker):
package=__package__))


# The mock UAT was produced by this program:
# import json
# import pyvo
#
# def gather_children(voc, t):
#     result = {t}
#     voc[t].pop("description", None)
#     for c in voc[t]["narrower"]:
#         result |= gather_children(voc, c)
#     return result
#
# uat = pyvo.utils.vocabularies.get_vocabulary("uat")
# solphys = gather_children(uat["terms"], "solar-physics")
# uat["terms"] = {t: m for t, m in uat["terms"].items()
#     if t in solphys}
# with open("uat-selection.desise", "w", encoding="utf-8") as f:
#     json.dump(uat, f, indent=1)

@pytest.fixture()
def uat_vocabulary(mocker):
    """A small sample of the IVOA UAT vocabulary in astropy's cache.

    We need to clean up behind ourselves, because our version of the
    UAT is limited to the solar-physics branch in order to not waste
    too much space. The source code here contains a program to refresh
    this vocabulary selection.
    """
    voc_url = 'http://www.ivoa.net/rdf/uat'
    import_file_to_cache(
        voc_url,
        get_pkg_data_filename(
            'data/uat-selection.desise',
            package=__package__))
    yield
    # it would be nice if we only did that if we polluted the
    # cache before the yield, but we can't easily see if we did that.
    cache_dir = _get_download_cache_loc('astropy')
    cache_dirname = _url_to_dirname(voc_url)
    local_dirname = os.path.join(cache_dir, cache_dirname)
    os.unlink(os.path.join(local_dirname, "contents"))

# We need an object standing in for TAP services for query generation.
# It would perhaps be nice to pull up a real TAPService instance from
# capabilities and tables, but that's a non-trivial amount of XML.
Expand Down