@TropComplique I think you are right. Taking the 'grey_dn' task as an example, I added a print inside the `if mask is not None:` branch of `WindowAttention.forward()`. It printed 36 times, since there are 6 RSTB blocks and 6 STL layers in each RSTB. This means that every window attention in every STL is using the mask, i.e. the shifted variant.
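For reference, here is roughly how the check can be reproduced without editing the repo code, by wrapping `WindowAttention.forward` with a counter. This is a minimal sketch, assuming it is run from the SwinIR repo root; the constructor arguments below only mimic the grey_dn setting (6 RSTBs with 6 STLs each, `img_size=128`, `window_size=8`), with a smaller `embed_dim` for speed, so they are not the exact released config:

```python
import torch
from models import network_swinir as net

# Count how often WindowAttention.forward is called with / without a mask.
calls = {'masked': 0, 'unmasked': 0}
orig_forward = net.WindowAttention.forward

def counting_forward(self, x, mask=None):
    calls['masked' if mask is not None else 'unmasked'] += 1
    return orig_forward(self, x, mask=mask)

net.WindowAttention.forward = counting_forward

# 6 RSTBs x 6 STLs, training-time input resolution 128x128.
model = net.SwinIR(upscale=1, in_chans=1, img_size=128, window_size=8,
                   depths=[6] * 6, embed_dim=60, num_heads=[6] * 6,
                   mlp_ratio=2, upsampler='')
model.eval()

with torch.no_grad():
    model(torch.randn(1, 1, 120, 120))  # 120 != 128, so masks are recomputed

# Expected: {'masked': 36, 'unmasked': 0} for the 120x120 input;
# with a 128x128 input I would expect 18 / 18 instead.
print(calls)
```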
Here is my preliminary analysis. For 'grey_dn', SwinIR is initialized with `img_size` 128x128. At initialization time, even layers get `attn_mask = None` and odd layers get `attn_mask = self.calculate_mask(self.input_resolution)`.
However, at inference time the precomputed `attn_mask` is only used when the input matches that size. If the inference-time image size `x_size` is no longer 128, then the mask is re-calculated for every Swin Transformer layer, and every Swin Transformer layer uses masked attention. It seems the comment in the closed issue #13 is not correct.
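For reference, the dispatch I am referring to looks roughly like this (paraphrased from memory of `SwinTransformerBlock.forward` around the linked line, not a verbatim copy):

```python
# In __init__, only shifted blocks precompute a mask for the training resolution:
#   attn_mask = self.calculate_mask(self.input_resolution) if self.shift_size > 0 else None
# In forward, that precomputed mask is used only when x_size matches that resolution;
# otherwise a mask is recomputed and passed to every block, shifted or not:
if self.input_resolution == x_size:
    attn_windows = self.attn(x_windows, mask=self.attn_mask)
else:
    attn_windows = self.attn(x_windows, mask=self.calculate_mask(x_size).to(x.device))
```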
@JingyunLiang can you please help comment? I am concerned and curious about the difference in results between training time and inference time. If the training size is 128, then each RSTB alternates between non-shifted attention and shifted & masked attention at training time, but inference on an image of a different size will always use shifted & masked attention. How would the results match...?
Hi!
I believe that we must use a mask in window self-attention only when we shift the windows.
But here (SwinIR/models/network_swinir.py, line 262 at commit 6545850) we use the mask all the time, for both shifted and non-shifted windows.
This might introduce errors at the bottom edge or at the right edge of an image.
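Here is a minimal sketch of what I mean, gating the mask on `shift_size` (a hypothetical variant of that branch, not a patch proposed by the authors):

```python
# Only pass an attention mask when this block actually uses shifted windows.
if self.shift_size > 0:
    if self.input_resolution == x_size:
        attn_windows = self.attn(x_windows, mask=self.attn_mask)
    else:
        attn_windows = self.attn(x_windows, mask=self.calculate_mask(x_size).to(x.device))
else:
    attn_windows = self.attn(x_windows, mask=None)
```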