Modify unit test and validation #63
This will be reviewed after #61
for more information, see https://pre-commit.ci
d8e71a6 to 6146d5a
@jburel I have deployed this branch on Idr-testing (merged locally with the Rocky Linux branch). Could you please have a look? This one has been around for a while; it improves the tests and adds a test for indexing.
A heads up, @khaledk2, that the 1.7 GB of
omero_search_engine/api/v1/resources/schemas/filter_schema.json
Co-authored-by: jean-marie burel <[email protected]>
Thanks, @khaledk2. GH diff now renders! 🎉
.github/workflows/main.yml
wget https://downloads.openmicroscopy.org/images/omero_db_searchengine.zip -P app_data
unzip app_data/omero_db_searchengine.zip -d app_data/
#cat app_data/backup_parts/backupa? > app_data/omero.pgdump
#run restore omero database
Suggested change:
- #run restore omero database
+ # run restore omero database
unit_tests/test_app.py
value = case[1]
validator = Validator(deep_check)
validator.set_simple_query(resource, name, value)
validator.get_results_postgres("equals")
to be adjusted after method renaming
Co-authored-by: jean-marie burel <[email protected]>
…edk2/omero_search_engine into modify_unit_test_validation
manage.py
@@ -148,7 +160,8 @@ def get_index_data_from_database(resource="all"):
test_indexing_search_query(deep_check=False, check_studies=True)

# backup the index data
backup_elasticsearch_data()
if not nobackup:
The double negative is a bit strange; should the variable be called backup instead?
Is it possible to fix this double negative?
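One way to address the double negative is sketched below. It uses the standard-library argparse module rather than whatever command framework manage.py actually relies on (an assumption), and the flag name --no-backup and the function run_indexing are hypothetical. The switch stays negative on the command line, but the parsed value lands in a positive backup variable, so the code reads "if backup:".

```python
import argparse


def build_parser():
    """Build a parser whose negative switch fills a positive variable."""
    parser = argparse.ArgumentParser(description="Indexing command (illustrative sketch)")
    # "store_false" defaults to True, so backup is on unless --no-backup is given.
    parser.add_argument(
        "--no-backup",
        dest="backup",
        action="store_false",
        help="skip backing up the index data",
    )
    return parser


def run_indexing(backup=True):
    """Hypothetical stand-in for the indexing command; returns the steps it would run."""
    steps = ["index"]
    if backup:  # positive condition, no double negative
        steps.append("backup")
    return steps


args = build_parser().parse_args(["--no-backup"])
print(run_indexing(backup=args.backup))
```

With this shape the default behaviour (backup enabled) is unchanged, and only the variable name inside the function differs from the reviewed code.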
manage.py
It also checks the key-value pair duplication.
It can check all the projects and screens.
Also, it can run for a specific project or screen.
The output is a collection of CSV files; each check usually generates three files:
Suggested change:
- The output is a collection of CSV files; each check usually generates three files:
+ The output is a collection of CSV files; each check usually generates three files:
omero_search_engine/validation/omero_keyvalue_data_validator.py
df.groupby(["screen_name", "name", "value"])
.size()
.reset_index()
.rename(columns={0: "no of images"})
Suggested change:
- .rename(columns={0: "no of images"})
+ .rename(columns={0: "number of images"})
omero_search_engine/validation/omero_keyvalue_data_validator.py
manage.py
def restore_database():
    """
    restote the database from a database dump file
This typo suggestion was missed
)
tail_space_results = conn.execute_query(sql_statment)
if len(tail_space_results) == 0:
    search_omero_app.logger.info("No results is availlable for trailing space")
Suggested change:
- search_omero_app.logger.info("No results is availlable for trailing space")
+ search_omero_app.logger.info("No results available for trailing space")
    search_omero_app.logger.info("No results is availlable for trailing space")
    return
search_omero_app.logger.info("Generate for trailing space ...")
genrate_reports(tail_space_results, "tailing_space", screen_name, project_name)
Suggested change:
- genrate_reports(tail_space_results, "tailing_space", screen_name, project_name)
+ genrate_reports(tail_space_results, "trailing_space", screen_name, project_name)
def check_duplicated_keyvalue_pairs(screen_name, project_name):
    search_omero_app.logger.info("Checking for duplicated key-value pairs ...")
Suggested change:
- search_omero_app.logger.info("Checking for duplicated key-value pairs ...")
+ search_omero_app.logger.info("Checking for duplicated key-value pairs...")
if not os.path.isdir(base_folder):
    base_folder = os.path.expanduser("~")
if write_report:
    base_folder = "/etc/searchengine/"
The location of the base folder is hard-coded in a few places. I think we need to use a configuration variable.
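A minimal sketch of what a configuration variable for the base folder could look like. The key BASE_FOLDER, the environment variable SEARCHENGINE_BASE_FOLDER and the helper get_base_folder are all hypothetical names, not part of the project; only the "/etc/searchengine/" default and the home-directory fallback come from the reviewed snippet.

```python
import os

# Default taken from the reviewed snippet; everything else here is an assumption.
DEFAULT_BASE_FOLDER = "/etc/searchengine/"


def get_base_folder(config=None, env=None):
    """Resolve the report folder in one place instead of hard-coding it."""
    config = config or {}
    env = os.environ if env is None else env
    # Precedence: explicit config value, then environment variable, then default.
    folder = (
        config.get("BASE_FOLDER")
        or env.get("SEARCHENGINE_BASE_FOLDER")
        or DEFAULT_BASE_FOLDER
    )
    # Fall back to the home directory when the folder does not exist,
    # mirroring the fallback in the reviewed code.
    if not os.path.isdir(folder):
        folder = os.path.expanduser("~")
    return folder


print(get_base_folder({"BASE_FOLDER": os.getcwd()}, env={}))
```

Every call site would then ask get_base_folder for the path, so changing the location means changing one setting.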
unit_tests/test_app.py
query_in,
images_keys,
images_value_parts,
contains_not_contains_quries,
Suggested change:
- contains_not_contains_quries,
+ contains_not_contains_queries,
unit_tests/test_app.py
validator.searchengine_results.get("total_number_of_buckets"),
)

def test_contains_not_contains_quries(self):
Suggested change:
- def test_contains_not_contains_quries(self):
+ def test_contains_not_contains_queries(self):
unit_tests/test_app.py
)

def test_contains_not_contains_quries(self):
    for resource, cases in contains_not_contains_quries.items():
Suggested change:
- for resource, cases in contains_not_contains_quries.items():
+ for resource, cases in contains_not_contains_queries.items():
…edk2/omero_search_engine into modify_unit_test_validation
Running
leads to some syntax error
Syntax error in the bash script
Please use this image which contains the changes for this PR
Using
Discussed with @khaledk2 yesterday: the warnings in the bash script are not introduced by this PR.
This PR adds restoring the database from a file and indexing the database to the GitHub Actions workflow.
It will restore an OMERO database from a backup.
It will index the data.
It will run the unit tests; I have added some queries to validate the results.
This PR depended on PR #59.
This PR also includes a method to check key-value pairs for trailing and leading spaces. It also checks for key-value pair duplication, i.e. an image that has more than one identical key with the same value.
It can check all the projects and screens, or run for a specific project or screen.
The output is a set of CSV files; each check usually generates three files. The main file contains image details (e.g. image id), the screen or project name, the key and the value. The other two files are summaries, one for screens and one for projects; each contains the screen (or project) name, the key-value pair that has the issue and the total number of affected images for each row.
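The checks described above can be sketched in plain Python. This is an in-memory illustration only: the PR runs SQL queries against the database, and the row layout, the image_id field and all function names below are assumptions made for the example.

```python
import csv
import io
from collections import Counter


def find_space_issues(rows):
    """Return rows whose key or value starts or ends with whitespace."""
    return [
        r for r in rows
        if r["name"] != r["name"].strip() or r["value"] != r["value"].strip()
    ]


def find_duplicates(rows):
    """Return rows where one image carries the same key-value pair more than once."""
    counts = Counter((r["image_id"], r["name"], r["value"]) for r in rows)
    return [r for r in rows if counts[(r["image_id"], r["name"], r["value"])] > 1]


def summarise(rows, container_field):
    """Count distinct affected images per (container, key, value)."""
    images = {}
    for r in rows:
        key = (r[container_field], r["name"], r["value"])
        images.setdefault(key, set()).add(r["image_id"])
    return {key: len(ids) for key, ids in images.items()}


def to_csv(summary, container_field):
    """Render a summary as CSV text, one row per affected key-value pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([container_field, "name", "value", "number of images"])
    for (container, name, value), total in sorted(summary.items()):
        writer.writerow([container, name, value, total])
    return buffer.getvalue()


# Made-up rows: two images with a trailing space, one image with a duplicated pair.
rows = [
    {"image_id": 1, "screen_name": "screen A", "name": "Gene", "value": "abc "},
    {"image_id": 2, "screen_name": "screen A", "name": "Gene", "value": "abc "},
    {"image_id": 3, "screen_name": "screen A", "name": "Gene", "value": "xyz"},
    {"image_id": 3, "screen_name": "screen A", "name": "Gene", "value": "xyz"},
]
print(to_csv(summarise(find_space_issues(rows), "screen_name"), "screen_name"))
```

The per-image rows correspond to the main file, and summarise produces the per-screen (or per-project) counts that the two summary files hold.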
The search engine saves the output files in the /data/searchengine/searchengine/ folder.
Examples:
This command checks the database for projects whose names contain idr0140-ho-stressresponse or idr0118-keenan-flylightsheet (one -p option per project):
sudo docker run --rm -v /data/searchengine/searchengine/:/etc/searchengine/ -v /data/searchengine/searchengine/logs/:/opt/app-root/src/logs/ --network=searchengine-net khaledk2/searchengine:latest data_validator -p idr0140-ho-stressresponse -p idr0118-keenan-flylightsheet
The following command checks screens whose names contain idr0093-mueller-perturbation:
sudo docker run --rm -v /data/searchengine/searchengine/:/etc/searchengine/ -v /data/searchengine/searchengine/logs/:/opt/app-root/src/logs/ --network=searchengine-net khaledk2/searchengine:latest data_validator -s idr0093-mueller-perturbation
Another example checks the database for projects whose names contain idr0043-uhlen-humanproteinatlas:
sudo docker run --rm -v /data/searchengine/searchengine/:/etc/searchengine/ -v /data/searchengine/searchengine/logs/:/opt/app-root/src/logs/ --network=searchengine-net khaledk2/searchengine:latest data_validator -p idr0043-uhlen-humanproteinatlas
Finally, the following command will check all the projects and screens inside the database:
sudo docker run --rm -v /data/searchengine/searchengine/:/etc/searchengine/ -v /data/searchengine/searchengine/logs/:/opt/app-root/src/logs/ --network=searchengine-net khaledk2/searchengine:latest data_validator
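The four commands above differ only in their -p and -s options, so they can be generated from one helper. The function build_validator_command is hypothetical and not part of the project; the paths, network name, image name and options are copied from the commands above.

```python
import shlex


def build_validator_command(
    projects=(),
    screens=(),
    data_dir="/data/searchengine/searchengine/",
    image="khaledk2/searchengine:latest",
):
    """Assemble the docker invocation for data_validator as an argument list."""
    command = [
        "sudo", "docker", "run", "--rm",
        "-v", f"{data_dir}:/etc/searchengine/",
        "-v", f"{data_dir}logs/:/opt/app-root/src/logs/",
        "--network=searchengine-net",
        image,
        "data_validator",
    ]
    for project in projects:
        command += ["-p", project]  # one -p per project name
    for screen in screens:
        command += ["-s", screen]  # one -s per screen name
    return command


# With no projects or screens, the command checks everything.
print(shlex.join(build_validator_command()))
print(shlex.join(build_validator_command(screens=["idr0093-mueller-perturbation"])))
```

Keeping the command as a list makes it safe to pass to subprocess.run without shell quoting concerns.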