
Made inference modality agnostic in re-ranking and other parts of the repo #542

Merged
27 commits merged into oml_3.0 on Apr 28, 2024

Conversation

AlekseySh (Contributor) commented on Apr 19, 2024:

Changelog (all the functions and classes on the right side are modality agnostic; see the usage sketch after the lists below):

  • EmbeddingPairsDataset, ImagePairsDataset -> PairDataset
  • pairwise_inference_on_images, pairwise_inference_on_embeddings -> pairwise_inference
  • IDistancesPostprocessor -> IRetrievalPostprocessor (mostly just a rename)
  • PairwisePostprocessor, PairwiseEmbeddingsPostprocessor, PairwiseImagesPostprocessor -> PairwiseReranker
  • inference_on_images -> inference
  • inference_on_dataframe -> inference_cached

Also:

  • EmbeddingMetrics takes an optional dataset argument in order to perform postprocessing.
  • Made postprocessing tests a bit more informative by making the dummy models a bit less trivial (added a bias to their outputs).

Examples changed:

  • train + val and prediction for postprocessor
  • retrieval usage
    • added global_paths parameter to download_mock_dataset so it looks nicer
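
To illustrate the idea behind collapsing the per-modality helpers into single entry points, here is a minimal, self-contained sketch in plain PyTorch. It is not OML's actual implementation, and the names DummyPairDataset / pairwise_inference_sketch are made up for illustration: once the pairs come from a generic Dataset, one inference loop serves embeddings, images, or any other modality.

# Illustration only, not OML's code: a single pairwise-inference loop that just
# assumes the dataset yields (x1, x2) pairs, whatever modality they are.
import torch
from torch.utils.data import DataLoader, Dataset


class DummyPairDataset(Dataset):
    """Stands in for PairDataset: it could wrap embeddings, images, texts, etc."""

    def __init__(self, x1, x2):
        assert len(x1) == len(x2)
        self.x1, self.x2 = x1, x2

    def __len__(self):
        return len(self.x1)

    def __getitem__(self, i):
        return self.x1[i], self.x2[i]


@torch.no_grad()
def pairwise_inference_sketch(pairwise_model, dataset, batch_size=32):
    # the loop never looks inside the samples, so it is modality agnostic
    loader = DataLoader(dataset, batch_size=batch_size)
    return torch.cat([pairwise_model(x1, x2) for x1, x2 in loader])


ds = DummyPairDataset(torch.randn(8, 4), torch.randn(8, 4))
scores = pairwise_inference_sketch(lambda a, b: (a - b).norm(dim=1), ds, batch_size=4)
print(scores.shape)  # torch.Size([8])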

@AlekseySh self-assigned this on Apr 19, 2024
@AlekseySh changed the base branch from main to refactor_datasets on Apr 19, 2024 09:20
@AlekseySh linked an issue on Apr 19, 2024 that may be closed by this pull request
@AlekseySh changed the title from "Made inference modality agnostic in re-ranking and other parts of the repo" to "[WIP] Made inference modality agnostic in re-ranking and other parts of the repo" on Apr 19, 2024
@AlekseySh changed the base branch from refactor_datasets to main on Apr 20, 2024 23:54
return self.distances_to_return


@pytest.mark.long
AlekseySh (Contributor, Author) commented on Apr 21, 2024:

I removed this test because it was too tricky and hard to support when changing interfaces.

@AlekseySh requested a review from DaloroAT on Apr 21, 2024 00:10
@AlekseySh changed the title from "[WIP] Made inference modality agnostic in re-ranking and other parts of the repo" to "Made inference modality agnostic in re-ranking and other parts of the repo" on Apr 21, 2024
return distances_upd
distances_top = distances_top.view(distances.shape[0], top_n)

distances_upd, ii_rerank = distances_top.sort()
Collaborator:

This function is inconsistent for ii_rerank... 😁

AlekseySh (Contributor, Author):

What do you mean? :)

AlekseySh (Contributor, Author):

I've added
# todo 522: explain what's going on here
so once all the interfaces are settled, I will add more explanation there.

AlekseySh (Contributor, Author):

I've added examples; I hope that helps.

Collaborator:

Sorry, it's just that sort returns arbitrary indices for equal values, like in the metrics.

AlekseySh (Contributor, Author):

Ooh, got it, you are right. It's the same problem. I hope we will solve it at some point...
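
For reference, here is a small, self-contained illustration of the tie problem discussed above, assuming plain torch.sort is what produces ii_rerank (the diff calls .sort()). Whether a stable sort is the right fix for the re-ranking and metrics code is left open in the thread; this only shows the mechanism.

# Illustration only: with equal values, the indices returned by sort are unspecified
# unless a stable sort is requested.
import torch

distances = torch.tensor([[0.5, 0.2, 0.5, 0.2]])

_, idx_default = torch.sort(distances, dim=1)              # order among tied values is unspecified
_, idx_stable = torch.sort(distances, dim=1, stable=True)  # ties keep their original order: [[1, 3, 0, 2]]

print(idx_default)
print(idx_stable)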

@AlekseySh requested review from Fissium and DaloroAT on Apr 22, 2024 00:07
AlekseySh (Contributor, Author) commented on Apr 22, 2024:

REPRODUCING OUR POSTPROCESSING PAPER:

InShop validation with postprocessing:

     Validate metric           DataLoader 0
─────────────────────────────────────────────
      OVERALL/cmc/1         0.9480938911437988
     OVERALL/cmc/10         0.9850190281867981
     OVERALL/cmc/100        0.9967646598815918
     OVERALL/cmc/20          0.99057537317276
     OVERALL/cmc/30         0.9929666519165039
     OVERALL/map/10          0.910080075263977
      OVERALL/map/5         0.9475616216659546

SOP:

    Validate metric           DataLoader 0
─────────────────────────────────────────────
      OVERALL/cmc/1         0.8812931776046753
     OVERALL/cmc/10         0.9525800943374634
     OVERALL/cmc/100        0.9812898635864258
     OVERALL/cmc/20         0.9645466208457947
     OVERALL/cmc/30         0.9699679613113403
     OVERALL/map/10          0.865265429019928
      OVERALL/map/5         0.8939490914344788

I've used models we trained before:

import gdown

# checkpoints of the models we trained before (used to get the numbers above)
gdown.download(id='13Y6BWkj7Y9fwTcD3ON1_hdqzKAmUCl23', quiet=False)
gdown.download(id='13ixeiusxYOYNfQ1nslWbPRRO0YMX1Dv4', quiet=False)
gdown.download(id='1L8TmHToKZJiogmEZWNExU0dzI8jh_JIk', quiet=False)
gdown.download(id='1KBJwqIaa39foEqn271_GSkZOda1lE9Wp', quiet=False)

@@ -1,7 +1,7 @@
postfix: "postprocessing"

seed: 42
- precision: 16
+ precision: 32
AlekseySh (Contributor, Author):

I checked that precision 16 is not available on CPU, so that value was confusing.
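
A minimal sketch of the rationale, assuming the pipelines build a PyTorch Lightning Trainer from this config (the Trainer wiring below is illustrative, not the pipelines' actual code): fp16 mixed precision needs a GPU, so the default config should not request it.

# Illustration only: request fp16 only when CUDA is available, otherwise fall back to fp32.
import torch
from pytorch_lightning import Trainer

precision = 16 if torch.cuda.is_available() else 32
trainer = Trainer(accelerator="auto", devices=1, precision=precision)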

@@ -72,7 +72,7 @@ def cat_two_sorted_tensors_and_keep_it_sorted(x1: Tensor, x2: Tensor, eps: float
assert eps >= 0
assert x1.shape[0] == x2.shape[0]

- scale = (x2[:, 0] / x1[:, -1]).view(-1, 1)
+ scale = (x2[:, 0] / x1[:, -1]).view(-1, 1).type_as(x1)
AlekseySh (Contributor, Author):

An error appeared in half precision, so I added type_as.
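
The thread does not show the exact traceback, so the snippet below only illustrates what the added call does, with made-up tensors: when x1 and x2 end up with different dtypes under half precision, the division promotes to float32, and .type_as(x1) casts the scale back to x1's dtype so the rest of the function stays in one dtype.

# Illustration only, with made-up inputs: .type_as(x1) aligns the dtype of `scale` with x1.
import torch

x1 = torch.tensor([[0.1, 0.2, 0.3]], dtype=torch.float16)  # e.g. a sorted block of distances in fp16
x2 = torch.tensor([[0.4, 0.5, 0.6]], dtype=torch.float32)  # e.g. another block that stayed in fp32

scale = (x2[:, 0] / x1[:, -1]).view(-1, 1)
print(scale.dtype)               # torch.float32 (the division promoted to fp32)
print(scale.type_as(x1).dtype)   # torch.float16 (matches x1 again)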

@@ -11,7 +11,7 @@
def main_hydra(cfg: DictConfig) -> None:
cfg = dictconfig_to_dict(cfg)
download_mock_dataset(MOCK_DATASET_PATH)
cfg["data_dir"] = MOCK_DATASET_PATH
cfg["data_dir"] = str(MOCK_DATASET_PATH)
AlekseySh (Contributor, Author):

The idea here, and in the similar changes below, is to use the same types as in real runs of our pipelines.
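
A small sketch of that rationale, assuming MOCK_DATASET_PATH is a pathlib.Path (as the added str() suggests): in a real run, data_dir comes out of a parsed YAML/Hydra config as a plain string, so the test config should carry a string as well.

# Illustration only; the path below is made up.
from pathlib import Path

MOCK_DATASET_PATH = Path("/tmp/mock_dataset")

cfg = {"data_dir": str(MOCK_DATASET_PATH)}
assert isinstance(cfg["data_dir"], str)  # same type a parsed config would provide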

AlekseySh (Contributor, Author) commented on Apr 25, 2024:

OTHER CHECKS:

  • running predict in pipelines on InShop
  • running vanilla training on SOP
  • running postprocessor training on InShop, see below
[Screenshot from Apr 25, 2024: postprocessor training run on InShop]

@AlekseySh changed the base branch from main to oml_3.0 on Apr 28, 2024 07:08
@AlekseySh merged commit 8f7023a into oml_3.0 on Apr 28, 2024
8 checks passed

Successfully merging this pull request may close these issues.

[EPIC] Release OML 3.0