Skip to content

Commit

Permalink
Readme
Browse files Browse the repository at this point in the history
  • Loading branch information
jacobgil committed Aug 18, 2024
1 parent da83f1f commit 6b8c611
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -97,7 +97,7 @@ The aim is also to serve as a benchmark of algorithms and metrics for research o

----------
# Choosing the Target Layer
You need to choose the target layer to compute CAM for.
You need to choose the target layer to compute the CAM for.
Some common choices are:
- FasterRCNN: model.backbone
- Resnet18 and 50: model.layer4[-1]
Expand Down Expand Up @@ -143,7 +143,7 @@ with GradCAM(model=model, target_layers=target_layers) as cam:
model_outputs = cam.outputs
```

Cam.py has a more detailed usage example.
cam.py has a more detailed usage example.

----------

Expand Down Expand Up @@ -176,7 +176,7 @@ scores = cam_metric(input_tensor, grayscale_cams, targets, model)
----------


# Advanced use cases and tutorials:
# Adapting for new architectures and tasks

Methods like GradCAM were designed for and were originally mostly applied on classification models,
and specifically CNN classification models.
Expand All @@ -186,15 +186,15 @@ The be able to adapt to non standard cases, we have two concepts.
- The reshape transform - how do we convert activations to represent spatial images ?
- The model targets - What exactly should the explainability method try to explain ?

## The reshape transform
## The reshape_transform argument
In a CNN the intermediate activations in the model are a mult-channel image that have the dimensions channel x rows x cols,
and the various explainabiltiy methods work with these to produce a new image.

In case of another architecture, like the Vision Transformer, the shape might be different, like (rows x cols + 1) x channels, or something else.
The reshape transform converts the activations back into a multi-channel image, for example by removing the class token in a vision transformer.
For examples, check [here](https://github.com/jacobgil/pytorch-grad-cam/blob/master/pytorch_grad_cam/utils/reshape_transforms.py)

## Model Targets
## The model_target argument
The model target is just a callable that is able to get the model output, and filter it out for the specific scalar output we want to explain.

For classification tasks, the model target will typically be the output from a specific category.
Expand Down

0 comments on commit 6b8c611

Please sign in to comment.