I was trying to use FlashAttention with `replace_with_xformers_attention()`, but with recent transformers versions I believe LLaMA can use FlashAttention directly by specifying `attn_implementation` when loading the pretrained model, so this line is no longer necessary.
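For reference, a minimal sketch of what I mean (assuming transformers >= 4.36 with the flash-attn package installed; the checkpoint name is just an example):

```python
# Minimal sketch: load LLaMA with FlashAttention 2 directly via from_pretrained,
# instead of monkey-patching attention with replace_with_xformers_attention().
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # example checkpoint, swap in your own

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,              # flash attention needs fp16/bf16
    attn_implementation="flash_attention_2",  # supported in recent transformers
)
```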
Hi @MXueguang,
I wonder what the purpose of having `replace_with_xformers_attention()` defined in utils.py is, because I am getting the following error:
Is the `self.num_key_value_heads` value used in `replace_with_xformers_attention()` defined somewhere else?