Training with the Transformers API
Hi,
In my experiments I was able to train ModernBERT using the Transformers library (specifically, a modified run_mlm.py script from the Transformers GitHub repo: https://github.com/huggingface/transformers/blob/v4.49.0/examples/pytorch/language-modeling/run_mlm.py).
But if I understand correctly, this approach does not make use of the local/global attention that ModernBERT implements?
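
For context, here is a minimal sketch of the kind of MLM run the script performs. It assumes the public answerdotai/ModernBERT-base checkpoint, a recent transformers release with ModernBERT support, and a toy dataset; it simplifies run_mlm.py considerably rather than reproducing it:

```python
# Minimal MLM fine-tuning sketch, assuming the public
# "answerdotai/ModernBERT-base" checkpoint and a transformers version
# with ModernBERT support (~v4.48+). A simplification of run_mlm.py,
# not an exact copy of that script.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Any text dataset works; wikitext-2 is just an illustrative choice.
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# 30% is the masking rate reported for ModernBERT's pretraining;
# run_mlm.py defaults to the classic BERT rate of 15%.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.3)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="modernbert-mlm", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```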
Hey! Could you share your Colab?
> But if I understand correctly, this approach does not make use of the local/global attention that ModernBERT implements?
It uses RoPE.
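
For what it's worth, the alternating local/global attention lives inside the model's layers, so a script that simply loads the model through transformers (as run_mlm.py does) should pick it up automatically. A quick way to inspect this, assuming the ModernBertConfig field names from recent transformers releases:

```python
# Sanity check: the local/global attention pattern is part of the model
# architecture, not the training script. Field names below assume
# ModernBertConfig as shipped in recent transformers releases (~v4.48+).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("answerdotai/ModernBERT-base")

print(config.global_attn_every_n_layers)  # e.g. 3: every 3rd layer is global
print(config.local_attention)             # sliding-window size for local layers
print(config.global_rope_theta)           # RoPE theta used by global layers
print(config.local_rope_theta)            # RoPE theta used by local layers
```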