Feature: EventPreprocessor #2928
Merged
19 commits
05fe242 add EventPreprocessor (kosack)
2567584 added altaz_to_fov helper (kosack)
4a553a1 added changelog (kosack)
e93db12 add alt_az_to_fov to init (kosack)
f1a15a3 add missing config=True tag (kosack)
65d1f01 fix some docstring/type annotation warnings (kosack)
6bea57c Don't use GADF FOV convention by default (kosack)
6e97c93 rename function in test too (kosack)
87b43dc fix test after GADF -> Nominal change (kosack)
5da834a fix links in changelog (kosack)
096fd64 pass parent to predefined QualityQuery (kosack)
3fd0513 remove old comment (kosack)
fa6de2a remove unnecessary conversion (kosack)
fdffce3 fix links in changelog (kosack)
88657f5 fix docstring typo and attribute (kosack)
eba8c25 fix wrong inputs for angular_separation (kosack)
1f7bbd8 use a FeatureSetRegistry for FeatureSets (kosack)
243ca62 update changelog (kosack)
9b32d86 show better example (kosack)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| Introduces the `~ctapipe.io.EventPreprocessor` class that can generically | ||
| transform an event table by applying the following steps: | ||
|
|
||
| * Generate new or rename existing columns with a `~ctapipe.core.FeatureGenerator` | ||
| * Select "good" event rows with a `~ctapipe.core.QualityQuery` | ||
| * Select which columns to output (by setting the ``features`` configuration | ||
| attribute of the `~ctapipe.io.EventPreprocessor`) | ||
|
|
||
| This is useful for doing the final steps of DL2 processing, and will eventually | ||
| replace what is in `~ctapipe.io.DL2EventPreprocessor` and `~ctapipe.io.DL2EventLoader`, which will be deprecated in a future release. | ||
|
|
||
| The `~ctapipe.io.EventPreprocessor` also includes the ability to pre-configure | ||
| itself for specific use cases by setting the ``feature_set`` option. Currently | ||
| only two are implemented: ``feature_set=dl2_irf``, which defines the transforms, | ||
| event selection, and output features for processing simulated DL2 events, and | ||
| ``feature_set=custom``, which has no pre-configuration and requires all | ||
| parameters to be set by the user in a config file. More can be added by | ||
| registering them with ``FeatureSetRegistry.register``. | ||
|
|
||
| The functionality of `~ctapipe.io.DL2EventLoader` can be mimicked with the following: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| from ctapipe.io import TableLoader, EventPreprocessor | ||
| from astropy.table import QTable, vstack | ||
|
|
||
| DL2FILE = "some_dl2_file.h5" | ||
| with TableLoader(DL2FILE, dl2=True, simulated=True, observation_info=True) as loader: | ||
| preprocess = EventPreprocessor(feature_set="dl2_irf") | ||
| events = vstack( | ||
| [ | ||
| preprocess(QTable(c.data)) | ||
| for c in loader.read_subarray_events_chunked(chunk_size=100_000) | ||
| ] | ||
| ) | ||
|
|
||
|
|
||
| This also introduces a helper function `~ctapipe.coordinates.altaz_to_nominal` | ||
| to convert columns of alt/az coordinates to FOV coordinates in the | ||
| `~ctapipe.coordinates.NominalFrame`, which works with the | ||
| `~ctapipe.core.FeatureGenerator`. |
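
The three steps listed in the changelog (generate features, select rows, select output columns) can be sketched without ctapipe as below. This is only an illustration under stated assumptions: `preprocess` and the plain-Python `angular_separation` are hypothetical stand-ins, rows are dictionaries rather than an astropy table, and features are given as callables instead of the expression strings that `FeatureGenerator` takes. The separation uses a Vincenty-style formula with the `(lon1, lat1, lon2, lat2)` argument order, which is the order used in the ``dl2_irf`` feature set.

```python
import math


def angular_separation(lon1, lat1, lon2, lat2):
    """Great-circle separation in radians; all inputs in radians.

    Hypothetical stand-in with the (lon1, lat1, lon2, lat2) argument order.
    """
    dlon = lon2 - lon1
    num1 = math.cos(lat2) * math.sin(dlon)
    num2 = (math.cos(lat1) * math.sin(lat2)
            - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    denom = (math.sin(lat1) * math.sin(lat2)
             + math.cos(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.atan2(math.hypot(num1, num2), denom)


def preprocess(rows, features_to_generate, quality_criteria, output_features):
    """Generate features, keep rows passing every criterion, keep chosen columns."""
    selected = []
    for row in rows:
        row = dict(row)  # copy so the input rows are not modified
        # Features are applied in order, so later ones may use earlier ones.
        for name, func in features_to_generate:
            row[name] = func(row)
        if all(check(row) for _name, check in quality_criteria):
            selected.append({key: row[key] for key in output_features})
    return selected


def theta_deg(row):
    """Separation between reconstructed and true direction, in degrees."""
    sep = angular_separation(
        math.radians(row["reco_az"]), math.radians(row["reco_alt"]),
        math.radians(row["true_az"]), math.radians(row["true_alt"]),
    )
    return math.degrees(sep)


# Made-up rows for illustration only; column names mimic the prefix convention.
rows = [
    {"event_id": 1, "HillasReconstructor_alt": 71.0, "HillasReconstructor_az": 180.0,
     "true_alt": 70.0, "true_az": 180.0, "n_telescopes": 5},
    {"event_id": 2, "HillasReconstructor_alt": 70.0, "HillasReconstructor_az": 180.0,
     "true_alt": 70.0, "true_az": 180.0, "n_telescopes": 2},
]

features_to_generate = [
    ("reco_alt", lambda r: r["HillasReconstructor_alt"]),  # a rename
    ("reco_az", lambda r: r["HillasReconstructor_az"]),    # a rename
    ("theta", theta_deg),                                  # uses the renamed columns
    ("multiplicity", lambda r: r["n_telescopes"]),
]
quality_criteria = [("sufficient multiplicity", lambda r: r["multiplicity"] >= 4)]
output_features = ["event_id", "theta", "multiplicity"]

events = preprocess(rows, features_to_generate, quality_criteria, output_features)
print(events)
```

Only the first row survives the multiplicity cut, and only the three requested columns are kept. Generating features before applying the selection mirrors the order in `EventPreprocessor.__call__`, which lets quality criteria refer to generated columns such as `multiplicity`.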
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,217 @@ | ||
| """Module containing classes related to event loading and preprocessing""" | ||
|
|
||
| from astropy.coordinates import angular_separation | ||
|
|
||
| from ..coordinates import altaz_to_nominal | ||
| from ..core import ( | ||
| Component, | ||
| FeatureGenerator, | ||
| QualityQuery, | ||
| ToolConfigurationError, | ||
| traits, | ||
| ) | ||
|
|
||
| __all__ = ["EventPreprocessor"] | ||
|
|
||
|
|
||
| from typing import Callable | ||
|
|
||
|
|
||
| class FeatureSetRegistry: | ||
| """Registry for custom feature set configurations.""" | ||
|
|
||
| _registry = {} | ||
|
|
||
| @classmethod | ||
| def register(cls, name: str): | ||
| """Register a feature set configuration. | ||
|
|
||
| Examples | ||
| -------- | ||
| >>> @FeatureSetRegistry.register("my_analysis") | ||
| ... def my_config(preprocessor): | ||
| ... return { | ||
| ... "features_to_generate": [("custom", "col_a / col_b")], | ||
| ... "quality_criteria": [("cut", "custom > 0.5")], | ||
| ... "output_features": ["event_id", "custom"] | ||
| ... } | ||
| """ | ||
|
|
||
| def decorator(func: Callable): | ||
| cls._registry[name] = func | ||
| return func | ||
|
|
||
| return decorator | ||
|
|
||
| @classmethod | ||
| def get(cls, name: str): | ||
| """Get a registered configuration function.""" | ||
| return cls._registry.get(name) | ||
|
|
||
| @classmethod | ||
| def list_available(cls): | ||
| """List all registered feature set names.""" | ||
| return list(cls._registry.keys()) | ||
|
|
||
|
|
||
| @FeatureSetRegistry.register("dl2_irf") | ||
| def _dl2_irf_config(preprocessor): | ||
| """Built-in configuration for DL2 IRF generation.""" | ||
| return { | ||
| "features_to_generate": [ | ||
| ("reco_energy", f"{preprocessor.energy_reconstructor}_energy"), | ||
| ("reco_alt", f"{preprocessor.geometry_reconstructor}_alt"), | ||
| ("reco_az", f"{preprocessor.geometry_reconstructor}_az"), | ||
| ("gh_score", f"{preprocessor.gammaness_reconstructor}_prediction"), | ||
| ("theta", "angular_separation(reco_az, reco_alt, true_az, true_alt)"), | ||
| ( | ||
| "reco_fov_coord", | ||
| "altaz_to_nominal(reco_az, reco_alt, subarray_pointing_lon, subarray_pointing_lat)", | ||
| ), | ||
| ( | ||
| "reco_fov_lon", | ||
| "reco_fov_coord[:,0]", | ||
| ), # note: GADF IRFs use the negative of this | ||
| ("reco_fov_lat", "reco_fov_coord[:,1]"), | ||
| ( | ||
| "true_fov_coord", | ||
| "altaz_to_nominal(true_az, true_alt, subarray_pointing_lon, subarray_pointing_lat)", | ||
| ), | ||
| ( | ||
| "true_fov_lon", | ||
| "true_fov_coord[:,0]", | ||
| ), # note: GADF IRFs use the negative of this | ||
| ("true_fov_lat", "true_fov_coord[:,1]"), | ||
| ( | ||
| "true_fov_offset", | ||
| "angular_separation(true_fov_lon, true_fov_lat, 0*u.deg, 0*u.deg)", | ||
| ), | ||
| ( | ||
| "reco_fov_offset", | ||
| "angular_separation(reco_fov_lon, reco_fov_lat, 0*u.deg, 0*u.deg)", | ||
| ), | ||
| ( | ||
| "multiplicity", | ||
| f"np.count_nonzero({preprocessor.gammaness_reconstructor}_telescopes,axis=1)", | ||
| ), | ||
| ], | ||
| "quality_criteria": [ | ||
| ("Valid geometry", f"{preprocessor.geometry_reconstructor}_is_valid"), | ||
| ("valid energy", f"{preprocessor.energy_reconstructor}_is_valid"), | ||
| ("valid gammaness", f"{preprocessor.gammaness_reconstructor}_is_valid"), | ||
| ("sufficient multiplicity", "multiplicity >= 4"), | ||
| ], | ||
| "output_features": [ | ||
| "event_id", | ||
| "obs_id", | ||
| "reco_energy", | ||
| "reco_alt", | ||
| "reco_az", | ||
| "gh_score", | ||
| "true_energy", | ||
| "true_alt", | ||
| "true_az", | ||
| "true_fov_offset", | ||
| "reco_fov_offset", | ||
| "theta", | ||
| "reco_fov_lat", | ||
| "true_fov_lat", | ||
| "reco_fov_lon", | ||
| "true_fov_lon", | ||
| "multiplicity", | ||
| ], | ||
| } | ||
|
|
||
|
|
||
| class EventPreprocessor(Component): | ||
| """ | ||
| Selects or generates features and filters tables of events. | ||
|
|
||
| In normal use, one only has to specify the ``feature_set`` option, which | ||
| will generate features that support standard use cases. For advanced usage, you | ||
| can set ``feature_set=custom`` and pass in a configured | ||
| `~ctapipe.core.FeatureGenerator` and set the ``features`` property of this | ||
| class with the columns you want to retain in the output table. | ||
|
|
||
| In the `~ctapipe.core.FeatureGenerator` used internally, you have access to | ||
| several additional functions useful for DL2 processing: | ||
|
|
||
| - `~astropy.coordinates.angular_separation` | ||
| - `~ctapipe.coordinates.altaz_to_nominal` | ||
| """ | ||
|
|
||
| energy_reconstructor = traits.Unicode( | ||
| default_value="RandomForestRegressor", | ||
| help="Prefix of the reco `_energy` column", | ||
| ).tag(config=True) | ||
|
|
||
| geometry_reconstructor = traits.Unicode( | ||
| default_value="HillasReconstructor", | ||
| help="Prefix of the `_alt` and `_az` reco geometry columns", | ||
| ).tag(config=True) | ||
|
|
||
| gammaness_reconstructor = traits.Unicode( | ||
| default_value="RandomForestClassifier", | ||
| help="Prefix of the classifier `_prediction` column", | ||
| ).tag(config=True) | ||
|
|
||
| feature_set = traits.CaselessStrEnum( | ||
| ["custom"] + FeatureSetRegistry.list_available(), | ||
| default_value="custom", | ||
| help=( | ||
| "Set up the FeatureGenerator.features, output features, and quality criteria " | ||
| "based on standard use cases. " | ||
| "Specify 'custom' if you want to set your own in your config file. If this is set to " | ||
| "any value other than 'custom', the feature properties of the configuration " | ||
| "file you pass in will be overridden." | ||
| ), | ||
| ).tag(config=True) | ||
|
|
||
| features = traits.List( | ||
| traits.Unicode(), | ||
| help=( | ||
| "Features (columns) to retain in the output. " | ||
| "These can include columns generated by the FeatureGenerator. " | ||
| "If you set these, make sure feature_set=custom." | ||
| ), | ||
| ).tag(config=True) | ||
|
|
||
| def __init__(self, config=None, parent=None, **kwargs): | ||
| super().__init__(config=config, parent=parent, **kwargs) | ||
| if self.feature_set == "custom": | ||
| self.feature_generator = FeatureGenerator(parent=self) | ||
| self.quality_query = QualityQuery(parent=self) | ||
| else: # use a pre-registered feature set | ||
| feature_set = FeatureSetRegistry.get(self.feature_set)(self) | ||
| self.feature_generator = FeatureGenerator( | ||
| parent=self, features=feature_set["features_to_generate"] | ||
| ) | ||
| self.quality_query = QualityQuery( | ||
| parent=self, quality_criteria=feature_set["quality_criteria"] | ||
| ) | ||
| self.features = feature_set["output_features"] | ||
| # sanity checks: | ||
| if len(self.features) == 0: | ||
| raise ToolConfigurationError( | ||
| "EventPreprocessor has no output features configured. " | ||
| "You have set `feature_set=custom`, but did not provide the list " | ||
| "of features in the configuration (EventPreprocessor.features)." | ||
| ) | ||
|
|
||
| def __call__(self, table): | ||
| """Return new table with only the columns in features.""" | ||
|
|
||
| # generate new features, which includes renaming columns: | ||
| generated = self.feature_generator( | ||
| table, | ||
| angular_separation=angular_separation, | ||
| altaz_to_nominal=altaz_to_nominal, | ||
| ) | ||
|
|
||
| # apply event selection on the resulting table | ||
|
|
||
| selected_mask = self.quality_query.get_table_mask(generated) | ||
|
|
||
| # return only the columns specified in `self.features`, and rows in | ||
| # `selected_mask` | ||
| return generated[self.features][selected_mask] | ||
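
The registry pattern in this diff can be exercised on its own with the sketch below. It copies the `FeatureSetRegistry` class from the diff and adds two things that are not in the PR and are purely illustrative: a `resolve` helper that raises a clear error for unknown names (in the diff, `get` returns `None`, which would then be called), and a module-level `ALLOWED_AT_DEFINITION` list that mimics how `["custom"] + FeatureSetRegistry.list_available()` is evaluated once when the class body runs. Whether the real trait accepts a feature set registered after that point depends on traitlets and is not modelled here.

```python
from typing import Callable


class FeatureSetRegistry:
    """Registry for feature set configurations (same logic as in the diff)."""

    _registry: dict = {}

    @classmethod
    def register(cls, name: str):
        def decorator(func: Callable):
            cls._registry[name] = func
            return func

        return decorator

    @classmethod
    def get(cls, name: str):
        return cls._registry.get(name)  # None if the name is unknown

    @classmethod
    def list_available(cls):
        return list(cls._registry.keys())


@FeatureSetRegistry.register("dl2_irf")
def _dl2_irf_config(preprocessor):
    # Trimmed-down placeholder for the built-in configuration.
    return {
        "features_to_generate": [("multiplicity", "n_tels")],
        "quality_criteria": [("sufficient multiplicity", "multiplicity >= 4")],
        "output_features": ["event_id", "multiplicity"],
    }


# Hypothetical snapshot: built once, like the list handed to the enum trait.
ALLOWED_AT_DEFINITION = ["custom"] + FeatureSetRegistry.list_available()


@FeatureSetRegistry.register("my_analysis")
def _my_analysis_config(preprocessor):
    return {
        "features_to_generate": [("custom", "col_a / col_b")],
        "quality_criteria": [("cut", "custom > 0.5")],
        "output_features": ["event_id", "custom"],
    }


def resolve(name, preprocessor=None):
    """Hypothetical helper: look up a feature set and fail loudly if missing."""
    factory = FeatureSetRegistry.get(name)
    if factory is None:
        raise KeyError(
            f"Unknown feature set {name!r}; "
            f"available: {FeatureSetRegistry.list_available()}"
        )
    return factory(preprocessor)


print(FeatureSetRegistry.list_available())
print(ALLOWED_AT_DEFINITION)
print(resolve("my_analysis")["output_features"])
```

The late registration appears in `list_available()` but not in the earlier snapshot, which suggests that custom feature sets may need to be registered before the module defining `EventPreprocessor` is imported if they are to be selectable by name.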