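Longformer and Reformer are Transformer models designed to be more efficient
on long sequences: each uses a sparse version of the attention matrix, so that
every token attends to only a subset of the other tokens, to speed up training.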
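
As a rough illustration of what a sparse attention matrix looks like, the sketch below builds a local sliding-window mask, one of the sparsity patterns Longformer uses (Reformer instead groups tokens with locality-sensitive hashing, which is not shown here). The function names and the `one_sided_window` parameter are hypothetical, chosen only for this example, and the code is not taken from either model's implementation.

```python
def sliding_window_mask(seq_len, one_sided_window):
    """Build a seq_len x seq_len boolean mask.

    mask[i][j] is True when token i may attend to token j, i.e. when
    j lies within `one_sided_window` positions of i on either side.
    (Hypothetical helper for illustration only.)
    """
    return [
        [abs(i - j) <= one_sided_window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def count_allowed(mask):
    """Count the attention scores that would actually be computed."""
    return sum(sum(row) for row in mask)


seq_len = 8
mask = sliding_window_mask(seq_len, one_sided_window=1)

dense_entries = seq_len * seq_len     # full attention scores every pair
sparse_entries = count_allowed(mask)  # local attention scores a narrow band

# Print the band structure: 'x' marks an allowed (query, key) pair.
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
print(f"dense: {dense_entries} entries, sparse: {sparse_entries} entries")
```

With a fixed window, the number of allowed entries grows linearly with the sequence length rather than quadratically, which is where the savings come from.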