self_attn_mask_1 = attention_mask.view(batch_size, 1, 1, seq_len).expand(-1, -1, seq_len, -1)
0b24ac6 verified
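As a rough illustration of what the line in the commit title does, the sketch below builds a small padding mask and broadcasts it to a 4-D attention-mask shape. The tensor values, `batch_size = 2`, and `seq_len = 4` are assumptions chosen for the example and are not taken from the repository; only the `view(...).expand(...)` expression comes from the commit title.

```python
import torch

# Hypothetical sizes and mask values, chosen only for illustration.
batch_size, seq_len = 2, 4
attention_mask = torch.tensor([[1, 1, 1, 0],
                               [1, 1, 0, 0]])  # 1 = real token, 0 = padding

# Expression from the commit title: reshape to (batch, 1, 1, seq_len),
# then repeat (without copying memory) along the query dimension.
self_attn_mask_1 = attention_mask.view(batch_size, 1, 1, seq_len).expand(-1, -1, seq_len, -1)

print(self_attn_mask_1.shape)  # torch.Size([2, 1, 4, 4])
```

Each of the `seq_len` rows along dimension 2 is the same key-side padding mask, so every query position sees the same set of valid key positions. Because `expand` returns a view rather than a copy, the result should be treated as read-only unless it is cloned first.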