This is shown in Figure 2d of the paper, see below for a sample attention mask. Using those attention matrices with fewer parameters then allows the model to handle inputs with a longer sequence length.
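As an illustration only, here is a minimal sketch of how a sparse attention mask of this kind could be built, using a simple sliding-window pattern; the `sliding_window_mask` helper and the `window_size` value are assumptions made for this example, not part of the model's actual implementation.

```python
import torch

def sliding_window_mask(seq_len: int, window_size: int) -> torch.Tensor:
    """Boolean mask where each token may only attend to tokens within
    `window_size` positions of itself (a sliding-window pattern)."""
    positions = torch.arange(seq_len)
    # Token i attends to token j only if |i - j| <= window_size.
    return (positions[None, :] - positions[:, None]).abs() <= window_size

# Example: 8 tokens, each attending to a local window of +/- 2 positions.
mask = sliding_window_mask(seq_len=8, window_size=2)
print(mask.int())
```

Because each token attends to a fixed number of neighbors, the attention cost grows linearly with the sequence length instead of quadratically, which is what makes much longer inputs practical.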