Ablation of whether `beta2=0.95` is better than `beta2=0.999`. The answer is yes: `beta2=0.95` is more stable and leads to slightly lower loss, as seen by comparing the TensorBoard of this model with that of the normal lm1-8b7-12b.
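
For readers unfamiliar with the parameter, here is a minimal sketch of what `beta2` controls in Adam-style optimizers: the decay rate of the running average of squared gradients. It is not the training code for this model, and the card does not say why the lower value is more stable. The function names and the gradient stream are hypothetical and serve only to illustrate how quickly each setting forgets a single large gradient.

```python
def effective_window(beta2: float) -> float:
    """Approximate number of recent steps an EMA with decay beta2 averages over."""
    return 1.0 / (1.0 - beta2)


def second_moment_trace(grads, beta2):
    """Bias-corrected Adam second-moment estimate after each step."""
    v = 0.0
    trace = []
    for t, g in enumerate(grads, start=1):
        # Exponential moving average of the squared gradient.
        v = beta2 * v + (1.0 - beta2) * g * g
        # Standard Adam bias correction for the zero initialization of v.
        trace.append(v / (1.0 - beta2 ** t))
    return trace


# Hypothetical scalar gradient stream (an assumption for illustration):
# small gradients, one spike, then small gradients again.
grads = [0.1] * 50 + [10.0] + [0.1] * 200

fast = second_moment_trace(grads, beta2=0.95)
slow = second_moment_trace(grads, beta2=0.999)

print("window:", effective_window(0.95), effective_window(0.999))
print("estimate 200 steps after the spike:", fast[-1], slow[-1])
```

With `beta2=0.95` the estimate averages over roughly 20 steps and has almost returned to the baseline 200 steps after the spike, whereas with `beta2=0.999` it averages over roughly 1000 steps and the spike still dominates the estimate.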