The bug in getting attention weight #302

SjokerLily · 2024-06-20T20:33:09Z

I am trying to get the model's attention weights like this:
outputs = self.model(
    vision_x=vision_x,
    lang_x=lang_x,
    attention_mask=attention_mask,
    clear_conditioned_layers=clear_conditioned_layers,
    past_key_values=past_key_values,
    use_cache=(past_key_values is not None),
    output_attentions=True,
)
However, the attentions tuple it returns contains only None.
I stepped into the code and found what looks like a bug in the MPT code under "huggingface/modules/transformers_modules/": the output_attentions parameter is not passed in the call to MPTBlock.forward() in blocks.py.
I tried to fix this, but when I run the code, the file reverts to its original version.
Is there a way to work around this?
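
To make the failure mode above concrete, here is a minimal toy sketch in plain Python. ToyBlock, ToyModel and force_attentions are hypothetical names made up for illustration; they are not the real MPT classes, and the real MPTBlock.forward() signature may differ. The sketch assumes the block itself can return attention weights when asked, and that only the parent's call drops the flag. It also shows one possible workaround under that assumption: wrapping each block's forward at runtime instead of editing the cached blocks.py, which would sidestep the file being overwritten when the model is loaded.

```python
from typing import List, Optional, Tuple


class ToyBlock:
    """Hypothetical stand-in for a transformer block (not the real MPTBlock)."""

    def forward(
        self, x: List[float], output_attentions: bool = False
    ) -> Tuple[List[float], Optional[List[float]]]:
        # Uniform "attention" over positions, returned only when requested.
        weights = [1.0 / len(x)] * len(x)
        return x, (weights if output_attentions else None)


class ToyModel:
    """Hypothetical parent model whose layer loop forgets to pass the flag on."""

    def __init__(self, n_layers: int = 2) -> None:
        self.blocks = [ToyBlock() for _ in range(n_layers)]

    def forward(self, x: List[float], output_attentions: bool = False):
        all_attn = []
        for block in self.blocks:
            # The bug being illustrated: output_attentions is not forwarded,
            # so every block falls back to its default (False).
            x, attn = block.forward(x)
            all_attn.append(attn)
        return x, tuple(all_attn)


def force_attentions(model: ToyModel) -> None:
    """Wrap each block's forward at runtime so it always returns weights."""
    for block in model.blocks:
        original = block.forward  # bound method, captured before patching

        def patched(x, output_attentions=False, _orig=original):
            # Ignore the caller's flag and always request the weights.
            return _orig(x, output_attentions=True)

        # Instance-level override; the class and any file on disk are untouched.
        block.forward = patched


model = ToyModel()
_, before = model.forward([1.0, 2.0], output_attentions=True)
force_attentions(model)
_, after = model.forward([1.0, 2.0], output_attentions=True)
print(before)
print(after)
```

In the toy model, the tuple is all None before the patch and holds per-layer weights afterwards. Whether the same wrapper works on the actual model depends on whether the real block's forward accepts such a flag, which this sketch only assumes.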
