Merge pull request #92 from LuCEresearchlab/minor-patch
Added safety for dataset uploading
malags authored Apr 25, 2021
2 parents 5587663 + 36e6da8 commit 0fbb419
Showing 3 changed files with 14 additions and 3 deletions.
11 changes: 9 additions & 2 deletions README.md
@@ -35,8 +35,15 @@ In case there are issues due to dependencies try to rebuild the containers (will
docker-compose up --build # rebuild
```

-Note: The clustering might take a while, the clusters tend to finish close to each other don't panic if you see no
-progress for several minutes.
+Notes:
+
+- The clustering might take a while, the clusters tend to finish close to each other don't panic if you see no progress
+for several minutes.
+- If the tagging-service is interrupted before completing the clustering it'll be necessary to manually log into the
+database and delete the object under `dataset_db/dataset` with `dataset_id` equal to the one that was interrupted.
+
+If the entry is not deleted it'll be impossible to complete the clustering for it and the dataset will always result
+as loading.

Optimizations: It's possible to change the resources allocated to the python-service from `.env `, this will impact
clustering performance, to change the resources modify the environment variables:
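The manual cleanup described in the new README note could be scripted. The following is a minimal sketch, assuming the database is MongoDB accessed through pymongo (the commit does not say which database is used), with `dataset_db` as the database and `dataset` as the collection. `delete_interrupted_dataset` is a hypothetical helper and is not part of the repository.

```python
def delete_interrupted_dataset(collection, dataset_id):
    """Delete the dataset document whose clustering was interrupted.

    `collection` is assumed to behave like a pymongo Collection; only its
    `delete_one` method is used. Returns True if a document was deleted.
    """
    result = collection.delete_one({"dataset_id": dataset_id})
    return result.deleted_count == 1


# Hypothetical usage, assuming pymongo and a reachable MongoDB instance:
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")  # assumed address
#   collection = client["dataset_db"]["dataset"]
#   delete_interrupted_dataset(collection, "my-dataset-id")
```

Taking the collection as a parameter keeps the helper independent of how the connection is configured, which this commit does not show.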
2 changes: 1 addition & 1 deletion frontend/src/pages/tagging/DatasetSelection.tsx
@@ -59,7 +59,7 @@ function DatasetSelection() {
<TableBody>
{datasets.map((dataset: DatasetDesc) => {
const loading_cluster = dataset.clusters_computed != dataset.nr_questions
-const needed_time_s = 1000 * 60 * 2 * dataset.nr_questions
+const needed_time_s = 1000 * 90 * dataset.nr_questions
const started = new Date(dataset.creation_data)
const now = new Date()

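To see what the changed estimate means in practice, the two expressions can be compared directly. This is a small sketch in Python that mirrors the TypeScript arithmetic. The function names and the 40-question dataset are illustrative assumptions. Despite the `_s` suffix on `needed_time_s`, the factor of 1000 suggests the value is in milliseconds, which is how it is treated here.

```python
def old_estimate_ms(nr_questions):
    # Previous expression: 1000 * 60 * 2 per question (120 s, in ms).
    return 1000 * 60 * 2 * nr_questions


def new_estimate_ms(nr_questions):
    # New expression: 1000 * 90 per question (90 s, in ms).
    return 1000 * 90 * nr_questions


nr_questions = 40  # illustrative value, not taken from the repository
print(old_estimate_ms(nr_questions) / 60_000)  # 80.0 minutes
print(new_estimate_ms(nr_questions) / 60_000)  # 60.0 minutes
```

Under that reading, the commit lowers the per-question estimate from 120 seconds to 90 seconds.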
4 changes: 4 additions & 0 deletions tagging-service/flaskr/endpoints/upload_api.py
@@ -66,6 +66,10 @@ def thread_function(dataset):

json_dataset = json.loads(uploaded_file.read())

+dataset_from_db = get_dataset(dataset_id=json_dataset['dataset_id'])
+if dataset_from_db is not None and dataset_from_db['clusters_computed'] != len(dataset_from_db['questions']):
+    return f'rejected file: {uploaded_file.name}, dataset still uploading'
+
Thread(target=thread_function, args=(json_dataset,)).start()

return f'uploaded file: {uploaded_file.name} successfully'
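The guard added to `upload_api.py` can be read as a small predicate over the stored dataset. This is a minimal sketch, assuming `get_dataset` returns either `None` or a dict with `clusters_computed` and `questions` keys, as the diff implies. `is_still_clustering` is a hypothetical name and does not appear in the repository.

```python
def is_still_clustering(dataset_from_db):
    """Return True if a stored dataset exists and its clustering is incomplete.

    Mirrors the condition in the diff: a dataset counts as finished once
    `clusters_computed` equals the number of its questions.
    """
    if dataset_from_db is None:
        # No stored dataset with this id, so nothing blocks the upload.
        return False
    return dataset_from_db["clusters_computed"] != len(dataset_from_db["questions"])


# Illustrative records (made-up data):
in_progress = {"clusters_computed": 1, "questions": ["q1", "q2", "q3"]}
finished = {"clusters_computed": 3, "questions": ["q1", "q2", "q3"]}

print(is_still_clustering(None))         # False
print(is_still_clustering(in_progress))  # True
print(is_still_clustering(finished))     # False
```

An interrupted dataset would keep satisfying this condition, so every re-upload would be rejected. That is presumably why the README note asks for the stale entry to be deleted manually.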
