Design and implement Conditions Data access in Falaise #90

Open
drbenmorgan opened this issue Mar 19, 2018 · 15 comments

Comments

@drbenmorgan
Member

Kicked off by @emchauve's mail to the Software list:

Let me open this important thread related to database access in Falaise.

On the calorimeter side, we are getting ready to test and prototype the use of the database in Falaise modules. Examples in hand are: using the energy resolution specific to each OM in simulation; measuring and storing calibration constants per OM; retrieving these constants and propagating them at the calibration stage. After discussion in the Analysis Board, there was a consensus that the calorimeter would be a good "guinea pig" for putting such a database service in place!

Has there already been any thought on the topic and on the approach we want? Or perhaps developments already done?

One important point to consider is if/when we need offline functionality in the service. By offline, I mean getting some information without internet access to the DB servers @ CCIN2P3. It will clearly depend on the usage and on the amount of data itself. Take the case of energy resolution per OM (520 parameters): it is needed for baseline simulation, and offline access can be justified as these parameters are going to be static. A simple properties.conf file of 520 lines read by the DB service is feasible (these parameters were measured once during production; in the worst case, we might have another measurement at LSM with 207Bi electrons, and therefore one upgrade to properties_v2.conf). This offline approach is going to be tough (but not impossible) for most time-dependent parameters, like PMT gain or tracker time, which will change every week or month for individual OMs or cells.

So.... what should we do?! :)

@drbenmorgan
Member Author

Some initial considerations:

  • Basic interface, i.e. how should modules access DB info? This is completely independent of the underlying connection/C++ implementation.
  • datatools::properties would not be a suitable technology because it won't scale.
  • Should investigate existing "cache database results for offline usage" technologies as these should be common (e.g. web browsers caching pages).
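
One way to read the first and third points together: modules talk to an abstract source, and a caching wrapper keeps previously fetched values so they remain available offline. The following is only a rough sketch under that reading. `ConditionsSource`, `CachingSource`, `FixedSource` and the integer OM identifier are hypothetical names invented for illustration, not existing Falaise or Bayeux types.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <optional>
#include <utility>

// Hypothetical interface: what a module sees, whatever the backend is.
class ConditionsSource {
 public:
  virtual ~ConditionsSource() = default;
  // Returns the energy resolution for one optical module, if known.
  virtual std::optional<double> energyResolution(int omId) = 0;
};

// Hypothetical wrapper that remembers every answer from the real backend,
// so repeated lookups (or lookups while offline) never reach the server.
class CachingSource : public ConditionsSource {
 public:
  explicit CachingSource(std::unique_ptr<ConditionsSource> backend)
      : backend_(std::move(backend)) {
    assert(backend_ != nullptr && "a backend is required at construction");
  }

  std::optional<double> energyResolution(int omId) override {
    auto hit = cache_.find(omId);
    if (hit != cache_.end()) return hit->second;  // served from the cache
    if (!backend_) return std::nullopt;           // offline and not cached
    auto value = backend_->energyResolution(omId);
    if (value) cache_[omId] = *value;             // cache only real answers
    return value;
  }

  // Simulate losing the connection: drop the backend, keep the cache.
  void goOffline() { backend_.reset(); }

 private:
  std::unique_ptr<ConditionsSource> backend_;
  std::map<int, double> cache_;
};

// Stand-in for a real database client, used here only to exercise the cache.
class FixedSource : public ConditionsSource {
 public:
  std::optional<double> energyResolution(int omId) override {
    ++calls;  // count how often the "server" is actually contacted
    if (omId == 1) return 0.08;
    return std::nullopt;
  }
  int calls = 0;
};
```

A module would hold only a `ConditionsSource&`, so swapping the in-memory stand-in for an SQL-backed or file-backed source would not touch module code. A real offline cache would also need to persist to disk and decide when entries expire, which this sketch leaves out.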

@drbenmorgan changed the title from "Design and implement database access in Falaise" to "Design and implement Conditions Data access in Falaise" on Jan 30, 2019
@drbenmorgan assigned robobre and pfranchini and unassigned fmauger on Aug 5, 2019
@drbenmorgan
Member Author

Adding additional people involved in the discussion.

@pfranchini
Contributor

Has any other discussion happened outside of this issue's thread?

@drbenmorgan
Member Author

I'll just mention PR #154 here to cross-reference it. A service likely forms the basis of accessing Conditions Data inside the pipeline.

@pfranchini, @robobre will be setting up a meeting next week I think for update/discussion. I'll forward/cc you the details when they're out.

@emchauve
Member

I am finally working on this issue (branch add-database-service on my Falaise fork). I have written the basis for the database manager and the database service, and am now working on the DB connection.

I am running into two issues due to my lack of expertise with brew and CMake. @drbenmorgan, you might be able to guide me:

-- I am using MySQL++, which provides a C++ wrapper for mysql-client, installed with brew, for which openssl@1.1 is required. Bayeux also relies on openssl, but on the default version (1.0), and I suspect I am getting conflicts at runtime because of that. How should this be handled? Should I switch all Bayeux dependencies to openssl@1.1? Or is there a way to work with both under brew?

-- How can I add the new dependencies for Falaise (MySQL++ and mysql-client) to CMakeLists.txt in a clean way, given that mysql-client is installed in $BREW/opt/mysql-client without a CMake find module? There is at least a pkgconfig script for mysql-client, but nothing for MySQL++!

Thanks for your help

@drbenmorgan
Member Author

-- I am using MySQL++, which provides a C++ wrapper for mysql-client, installed with brew, for which openssl@1.1 is required. Bayeux also relies on openssl, but on the default version (1.0), and I suspect I am getting conflicts at runtime because of that. How should this be handled? Should I switch all Bayeux dependencies to openssl@1.1? Or is there a way to work with both under brew?

Whilst it's a C-only API (but a good one), could mariadb-connector-c be used instead? It's a much lighter library than mysql plus another lib on top of that.

-- How can I add the new dependencies for Falaise (MySQL++ and mysql-client) to CMakeLists.txt in a clean way, given that mysql-client is installed in $BREW/opt/mysql-client without a CMake find module? There is at least a pkgconfig script for mysql-client, but nothing for MySQL++!

If they have pkgconfig (.pc) files then CMake's hook to pkgconfig can be used. There's an example of use here:

https://gitlab.cern.ch/lhcb/GitCondDB/blob/master/CMakeLists.txt#L46

and linking here:

https://gitlab.cern.ch/lhcb/GitCondDB/blob/master/CMakeLists.txt#L127

It should be a case of doing:

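# Fail early if pkg-config itself is missing
find_package(PkgConfig REQUIRED)
# Documented argument order: prefix, options, then the pkg-config module name(s)
pkg_check_modules(MYNAME REQUIRED IMPORTED_TARGET mysql-client)
...
target_link_libraries(DBService PRIVATE PkgConfig::MYNAME)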

@emchauve
Member

Thanks for the suggestion. I will investigate mariadb-connector-c; however, its formula still requires openssl@1.1!

@fmauger
Contributor

fmauger commented Jan 22, 2020

Why should we use MySQL or MariaDB ?
The AMI group @ LPSC is supposed to provide us with a C++ API to access the SN database system independently of the underlying technology.
With the approach you are initiating, you freeze the SN code to a given technology and immediately face implementation details, rather than considering the problem with some perspective:

  • What is the purpose of the DB access system ?
  • What are the use cases you want to consider first ?
  • What datamodels should be implemented to address calibration and characterization of many detection units ?
  • What data are stable, what data are updated regularly and how this patterns the user interface ?

This has nothing to do with DB tables and their internal details (and package management issues).
This topic has to be investigated with respect to what we want to do with these data (input for some calib/reconstruction algos...). As a first step, we could perfectly well design this CondDataAccess interface using plain ASCII files and explore the best way to design the API. Then, after we have considered a significant set of use cases, we could consider using a specific scalable technology (DB?) for production.
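
To make the "plain ASCII files first" idea concrete, here is a minimal sketch of a text-backed table of per-OM values. The file layout (one `<om-id> <value>` pair per line, `#` for comments) and the names `readOMTable` and `resolutionOf` are assumptions made up for this illustration; the thread does not define a format.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical plain-text format, one optical module per line:
//   <om-id> <energy resolution in percent>
// Blank lines and lines starting with '#' are ignored.
std::map<int, double> readOMTable(std::istream& in) {
  std::map<int, double> table;
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty() || line[0] == '#') continue;
    std::istringstream fields(line);
    int omId = 0;
    double value = 0.0;
    // Lines that do not parse as "<int> <double>" are silently skipped.
    if (fields >> omId >> value) table[omId] = value;
  }
  return table;
}

// Looks up one module; the caller is expected to ask only for known IDs.
double resolutionOf(const std::map<int, double>& table, int omId) {
  auto it = table.find(omId);
  assert(it != table.end() && "unknown optical module id");
  return it->second;
}
```

Because the reader takes a `std::istream&`, the same function works on a file (`std::ifstream`) or on an in-memory string in tests, and the text backend could later be replaced by a database without changing what callers see.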

@fmauger
Contributor

fmauger commented Jan 22, 2020

The title of this issue proposed by drbenmorgan is :
"Design and implement Conditions Data access in Falaise"
not:
"Implement a DB system in Falaise".

@emchauve
Member

The title was modified, but the initial topic was indeed implementing DB access in Falaise!
@drbenmorgan Would it be possible to add a formula for building the C++ AMI API library to our Homebrew? (git repo is: https://github.com/ami-team/cami)

@emchauve
Member

  • What are the use cases you want to consider first ?

Simplest case: energy calibration of calorimeter OMs with one parameter (Energy = a × Charge)
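
As an illustration of this one-parameter case, the sketch below applies Energy = a × Charge with one constant per OM. `LinearEnergyCalibration` and the integer OM identifier are hypothetical names, and no units are assumed for charge or energy.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <optional>

// One calibration constant per optical module: energy = a * charge.
// Units are whatever the calibration defines; none are assumed here.
class LinearEnergyCalibration {
 public:
  void setConstant(int omId, double a) {
    assert(std::isfinite(a) && "calibration constant must be finite");
    constants_[omId] = a;
  }

  // Returns nothing for a module with no constant rather than guessing.
  std::optional<double> energy(int omId, double charge) const {
    auto it = constants_.find(omId);
    if (it == constants_.end()) return std::nullopt;
    return it->second * charge;
  }

 private:
  std::map<int, double> constants_;
};
```

Returning `std::optional` makes the "no constant stored for this OM" case explicit, which is the situation a conditions service would have to report when the database holds no entry.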

  • What datamodels should be implemented to address calibration and characterization of many detection units ?

We are working on it in parallel. The idea, and the change from the current model, is to be able to handle different versions of charge and energy (e.g. in a vector) computed/calibrated with different methods. Such dynamic data models would not require modification of members, just the addition of an enum value for indexing the new version. I hope that makes sense?
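
A possible shape for such a model, sketched under assumptions: a vector of optional values indexed by an enum, so that a new calibration method only adds an enum label. The method names (`Raw`, `LinearFit`, `Bi207Source`) and the class name `CalibratedHitEnergies` are hypothetical and chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical labels for the methods that can produce an energy value.
// A new method is added by inserting a label before Count; the class
// below needs no new data members.
enum class EnergyMethod : std::size_t { Raw, LinearFit, Bi207Source, Count };

class CalibratedHitEnergies {
 public:
  // One empty slot per known method.
  CalibratedHitEnergies()
      : values_(static_cast<std::size_t>(EnergyMethod::Count)) {}

  void set(EnergyMethod m, double value) { values_[index(m)] = value; }

  // Empty if this hit has not been processed with method `m`.
  std::optional<double> get(EnergyMethod m) const { return values_[index(m)]; }

 private:
  static std::size_t index(EnergyMethod m) {
    auto i = static_cast<std::size_t>(m);
    assert(i < static_cast<std::size_t>(EnergyMethod::Count));
    return i;
  }

  std::vector<std::optional<double>> values_;
};
```

One consequence worth noting: if such objects are serialized, inserting a label anywhere but just before `Count` would shift the indices of existing entries.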

  • What data are stable, what data are updated regularly and how this patterns the user interface ?

This question will indeed come up for all data to be stored, but I am not sure I understand the point, because we need the interface anyway to get both stable data and updatable data?

@drbenmorgan
Member Author

The title was modified, but the initial topic was indeed implementing DB access in Falaise!
@drbenmorgan Would it be possible to add a formula for building the C++ AMI API library to our Homebrew? (git repo is: https://github.com/ami-team/cami)

As far as I am aware, AMI is not the conditions database! It isn't in ATLAS, see this paper, and this one, especially 2.2.

With the approach you are initiating, you freeze the SN code to a given technology and immediately face implementation details, rather than considering the problem with some perspective:

Yes and no. I agree that the fundamental issue is the client API, so that can and should be mocked in with what we know to date, e.g.

class CondDBService {
 public:
  // ... what member functions do users of the service need ...

  // This is probably one of them
  OMParameter getOMParameter(OMID x, IOV i) const;
};

What goes on in the implementation will always be technology dependent, but it is effectively defined for us as SQL (by CC-Lyon), though the LHCb GitCondDB remains an option (and likely will be used for geometry etc). I'm therefore not averse to the use of SQL libraries at this stage, provided that they are only used as an implementation detail.
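
To show how the client API could be mocked before any SQL is involved, here is an in-memory sketch of a lookup by interval of validity (IOV). It deliberately simplifies the signature above: the query takes a single timestamp rather than an `IOV` object and returns `std::optional`. `MockCondDBService`, `Timestamp`, `store` and the definitions of `OMID` and `OMParameter` are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>

using OMID = int;                // assumption: a plain integer identifier
using Timestamp = std::int64_t;  // assumption: e.g. seconds since some epoch
struct OMParameter { double value; };

// In-memory mock: each stored value is valid from its start time until the
// next stored start time for the same optical module.
class MockCondDBService {
 public:
  void store(OMID om, Timestamp validFrom, OMParameter p) {
    bool inserted = payloads_[om].emplace(validFrom, p).second;
    assert(inserted && "an IOV with this start time already exists");
    (void)inserted;  // silence unused-variable warnings when asserts are off
  }

  std::optional<OMParameter> getOMParameter(OMID om, Timestamp when) const {
    auto perOM = payloads_.find(om);
    if (perOM == payloads_.end()) return std::nullopt;
    const auto& timeline = perOM->second;
    // upper_bound gives the first entry starting strictly after `when`;
    // the entry just before it is the one whose interval contains `when`.
    auto next = timeline.upper_bound(when);
    if (next == timeline.begin()) return std::nullopt;  // before first IOV
    return std::prev(next)->second;
  }

 private:
  std::map<OMID, std::map<Timestamp, OMParameter>> payloads_;
};
```

Modules written against this kind of mock would not need to change if the storage behind `getOMParameter` later became SQL, SQLite or GitCondDB, which is the sense in which the technology stays an implementation detail.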

@emchauve one thought: could you use the SQLite library for prototyping? It's very simple, has a similar API to MySQL/MariaDB, and as it's file-based it can be used offline.

@emchauve
Member

In fact, the AMI client API is really ultra-light, a few hundred lines of code (https://github.com/ami-team/cami/), and the admin web interface is very convenient for handling a common set of users and privileges over different DBs.

There are a few different output formats provided by the server: text, CSV, JSON or XML. You can give it a try there with the GetSessionInfo command (the only command available to the guest user): https://ami-supernemo.in2p3.fr/app/?subapp=command

The most interesting output format provided by the server would be JSON, I guess (?), for which we will need a parser. Do you have feedback on it, or suggestions?

@drbenmorgan
Member Author

For a JSON parser, the easiest is probably nlohmann-json.

Nevertheless, why would we use AMI to access (from Falaise), the CondDB? Could we get confirmation from the AMI developers that this is how it's used in ATLAS to access (from Athena, their Falaise equivalent) actual conditions from the Oracle/COOL/SQLite DBs? It just feels awkward and inefficient to use a web API that will ultimately just query the DB at Lyon.

@drbenmorgan unpinned this issue on Apr 3, 2020
@fmauger
Contributor

fmauger commented Mar 25, 2022

TODO: specifications for:

  • condition DB: datamodel for Geiger cells' and optical modules' status during specific time periods
  • calibration: what calibration parameters are needed ? what software interface is implied by the calibration procedure ?
    Towards a service that links the information from the database and the reconstruction algos
