The foundation of this model is the RoBERTa-style model deepset/gbert-large.
Following Gururangan et al. (2020), we gathered a collection of narrative fiction and continued the model's pre-training on this corpus.
Training was performed for 10 epochs on 2.3 GB of text with a learning rate of 0.0001 (linear decay) and a batch size of 512.
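
The following is a minimal sketch of this kind of continued (domain-adaptive) pre-training with the Hugging Face `transformers` and `datasets` libraries, using the hyperparameters listed above. It is not the original training script: the corpus file name, the maximum sequence length, and the split of the effective batch size of 512 into per-device batch size and gradient accumulation steps are assumptions.

```python
# Sketch of continued masked-language-model pre-training of deepset/gbert-large
# on a narrative-fiction corpus (hyperparameters as stated in the model card).
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "deepset/gbert-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Hypothetical plain-text fiction corpus, one passage per line.
dataset = load_dataset("text", data_files={"train": "fiction_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Standard masked-language-modelling objective, as in the original pre-training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

# Effective batch size of 512 reached via 64 x 8 gradient accumulation (an assumption;
# adjust per_device_train_batch_size to the available hardware).
args = TrainingArguments(
    output_dir="gbert-large-fiction",
    num_train_epochs=10,
    learning_rate=1e-4,
    lr_scheduler_type="linear",
    per_device_train_batch_size=64,
    gradient_accumulation_steps=8,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```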