Add possibility to produce DL3 with ctapipe #2727
Conversation
I have a question on this point. Could a DL2 file have multiple targets, i.e. different pointings in the same file? Currently everything is designed along the lines of current IACTs: one DL2 file = one run on a specific target.
The DL2 data format allows multiple OBs to be merged, but for CTAO we can probably just assume that we don't mix observations in the DL2 files produced for observed data. Certainly right now, the GADF format assumes that we do not mix OBs. In fact, it will likely be the other way around: we will store multiple DL3 files for a single observation if there is more than one SOI, for example, and, right now, also for different event types. My point was just "OB != science target", but you can assume one OB is one pointing, though the pointing could be fixed in RA/Dec or alt/az ("drift mode"), since both are supported by ACADA. Again, anything that goes into ctapipe should be as generic as possible (it should at least work for any IACT) and should not assume exactly what CTAO will do; anything CTAO-specific should be developed outside ctapipe, in a package in the datapipe GitLab space.
The multiple pointing modes are handled. For multiple OBs in the same file, the current code should produce a DL3 file with correct GTI and pointing information, but some information, such as the obs id, will not represent everything. I also didn't handle at all the possibility of having different pointing modes in the same file.
The h max to x max conversion is now implemented but not properly tested, as the DL2 file I have on hand doesn't contain atmosphere profile information (or at least EventSource is not finding the atmosphere profile).
I have converted DL2 files of the performance paper from lstchain to ctapipe format. They can be found on the onsite cluster |
"""

# Setting preprocessing for DL3
EventPreprocessor.irf_pre_processing = False
Setting the class variables here is not how the config system is supposed to work.
You should either override the options, change the defaults by creating a new subclass or just pass required options as keyword arguments when creating the class.
class DL3EventsWriter(Component):
    """
    Base class for writing a DL3 file
This should probably be an abstract interface class
self._target_information = None
self._software_information = None

@abstractmethod
The definitions of all of these getters and setters look like overkill.
It also seems a bit weird to attach all of these attributes to a data writer. They should be set on the objects that this class writes out to files, not on the writer itself.
self._obs_id = obs_id

@property
def events(self) -> QTable:
I would expect a DL3 writer API to look something like this:
@abstractmethod
def __call__(self, path, events: DL3Events, irf: IRF, metadata: DL3Metadata, ...):
    ...
and it writes the DL3 file to path.
The writer shouldn't have attributes related to the data itself.
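To make the suggestion concrete, here is a minimal sketch of such a stateless writer interface, using only the standard library. `DL3Events` and `DL3Metadata` are the names used in the comment above, but their fields here are hypothetical stand-ins; the IRF argument is left out for brevity, and `JSONDL3Writer` is a toy implementation that writes JSON rather than FITS. None of this is the API of the PR or of ctapipe.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path


@dataclass
class DL3Events:
    """Hypothetical container for the event list (stand-in for a QTable)."""

    event_id: list
    energy_tev: list


@dataclass
class DL3Metadata:
    """Hypothetical container for observation-level metadata."""

    obs_id: int
    target_name: str = "unknown"


class DL3Writer(ABC):
    """Interface only: the writer holds no event data or metadata itself."""

    @abstractmethod
    def __call__(self, path, events: DL3Events, metadata: DL3Metadata) -> Path:
        """Write one DL3 file to ``path`` and return the path."""


class JSONDL3Writer(DL3Writer):
    """Toy implementation for illustration; a real one would write FITS."""

    def __call__(self, path, events, metadata):
        path = Path(path)
        payload = {
            "header": {"OBS_ID": metadata.obs_id, "OBJECT": metadata.target_name},
            "events": {"EVENT_ID": events.event_id, "ENERGY": events.energy_tev},
        }
        path.write_text(json.dumps(payload))
        return path


def roundtrip_example():
    """Write one file into a temporary directory and read it back."""
    events = DL3Events(event_id=[1, 2, 3], energy_tev=[0.5, 1.2, 3.4])
    metadata = DL3Metadata(obs_id=1, target_name="Crab")
    writer = JSONDL3Writer()
    with tempfile.TemporaryDirectory() as tmp:
        path = writer(Path(tmp) / "dl3_obs1.json", events, metadata)
        return json.loads(path.read_text())
```

With this shape, everything describing the observation lives on the data objects, so the same writer instance can be reused for any number of files and a new output format only needs a new subclass.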
The purpose of this PR is to add support for the creation of DL3 files in ctapipe. The current output format is the GADF format, as described in https://gamma-astro-data-formats.readthedocs.io/en/v0.3/
The modifications include several changes to parts of the code used for IRF production, in order to make them usable for DL3 production as well (loading events and applying cuts).
For now, this PR should be considered a draft, as several items are missing:
The objective of first submitting it as a draft is to be able to discuss several points:
Handling of time
It's not very clear to me what the current time format in the DL2 file is, and therefore whether all the conversions performed are in line with what should be done.
Also, what is the best time scale to use in our case: TAI or UTC?
What reference time should be used? It is currently set in the code to the UNIX epoch, but maybe we want a dedicated CTA one, as other experiments do.
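As a basis for that discussion, the sketch below shows how event times could be expressed as seconds since a reference epoch described by the `MJDREFI`, `MJDREFF`, `TIMEUNIT` and `TIMESYS` header keywords used by GADF. It uses only the standard library and makes two loud assumptions: the input is a UNIX timestamp in UTC, and the TAI-UTC offset is the constant 37 s that has applied since 2017-01-01. The function names are hypothetical and this is not what the PR code does; a real implementation would more likely rely on `astropy.time.Time`, which handles leap seconds properly.

```python
UNIX_EPOCH_MJD = 40587   # MJD of 1970-01-01T00:00:00 UTC
SECONDS_PER_DAY = 86400
TAI_MINUS_UTC_S = 37     # assumption: only valid for dates from 2017-01-01 onwards


def unix_to_reference_seconds(unix_s, mjdrefi, mjdreff=0.0, timesys="UTC"):
    """Convert a UNIX timestamp (UTC) to seconds since the reference epoch.

    The reference epoch is MJD = mjdrefi + mjdreff, expressed in ``timesys``.
    """
    ref_offset_s = (mjdrefi - UNIX_EPOCH_MJD) * SECONDS_PER_DAY + mjdreff * SECONDS_PER_DAY
    time_s = unix_s - ref_offset_s
    if timesys == "TAI":
        # TAI runs ahead of UTC by a whole number of seconds.
        time_s += TAI_MINUS_UTC_S
    elif timesys != "UTC":
        raise ValueError(f"unsupported time system in this sketch: {timesys}")
    return time_s


def time_header(mjdrefi, mjdreff, timesys):
    """Header keywords describing how the TIME column must be interpreted."""
    return {
        "MJDREFI": mjdrefi,
        "MJDREFF": mjdreff,
        "TIMEUNIT": "s",
        "TIMESYS": timesys,
    }
```

Choosing the UNIX epoch as reference corresponds to `MJDREFI = 40587`; a dedicated reference epoch only changes the constant offset, so the choice of time scale (TAI or UTC) is the part that actually affects how leap seconds are treated.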
Optional columns for events
There is currently support for most of the optional columns defined in the GADF format (https://gamma-astro-data-formats.readthedocs.io/en/v0.3/events/events.html). The two exceptions are x max and the Hillas parameters.
For x max, I currently export h max instead. Is there a simple library to convert h max into x max?
For the Hillas parameters, since the intended use is mainly stereo, it was not obvious which ones to add to the file, so I have currently skipped all of them.
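On the h max to x max question, the conversion is an integral of the atmospheric density along the shower axis above h max. The sketch below illustrates the idea with an isothermal (exponential) atmosphere and a flat-atmosphere zenith correction. The sea-level depth and scale height are illustrative assumptions, not values from the PR, and the function name is hypothetical; a proper conversion should use the atmosphere profile that belongs to the simulation or observation.

```python
import math

SEA_LEVEL_DEPTH_G_CM2 = 1030.0  # assumed vertical depth at sea level
SCALE_HEIGHT_M = 8000.0         # assumed scale height of the toy atmosphere


def slant_depth_from_height(h_max_m, zenith_rad):
    """Approximate x max (g/cm2) from h max (m above sea level).

    Toy model: vertical depth X_v(h) = X_0 * exp(-h / H), then divided by
    cos(zenith) to follow the inclined shower axis (flat atmosphere).
    """
    vertical_depth = SEA_LEVEL_DEPTH_G_CM2 * math.exp(-h_max_m / SCALE_HEIGHT_M)
    return vertical_depth / math.cos(zenith_rad)
```

The flat-atmosphere approximation degrades at large zenith angles, and a single scale height does not describe the real atmosphere over the full height range, so this is only useful for sanity checks against a profile-based conversion.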
Metadata
For numerous metadata items, I didn't find the corresponding information in the DL2 file, but this could be partly because I am currently using MC DL2 files:
Data quality metadata
Among the optional metadata of GADF, quite a few are linked to data quality (trigger rate, broken pixels, muon efficiency, humidity, NSB, ...). I guess that for CTA we would like to handle quality a bit differently. Should they be included anyway? If yes, how do I retrieve all this information?
Code organization and implementation
I'm not yet used to the specifics of ctapipe (tools and components). I would like to validate that my use of them corresponds to their intent. Also, I've currently put the code for DL3 production mainly in the irf folder, as a very large fraction is common. Should we rename it or move it?
Speed
Currently the code is extremely slow (it took close to 30 minutes on my laptop to process a single gamma MC DL2 file). I encountered some issues when I tried to profile it (any help here is welcome), but I guess most of the time is spent in coordinate conversions. How important is this for the first version?
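On the profiling question, one low-friction option is to wrap a single call with the standard library's `cProfile` and print the functions with the largest cumulative time. The sketch below shows the pattern; `profile_call` is a hypothetical helper and `slow_coordinate_loop` is only a stand-in workload, not code from this PR.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run ``func`` under cProfile and return (result, text report).

    The report lists the ``top`` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def slow_coordinate_loop(n):
    """Stand-in workload: a per-event Python loop instead of one array call."""
    total = 0.0
    for i in range(n):
        total += (i % 360) * 0.5
    return total
```

Calling `profile_call` on the function that processes one chunk of events should show whether the time really goes into coordinate conversions. If it does, the usual first step is to check that the conversion is applied once per table on whole arrays rather than once per event.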