PhilipQuirke/QuantaMaths_mix_d12_l3_h4_t85K_s572091

PhilipQuirke committed on Jan 8

Commit e52d835 (verified) · 1 Parent(s): c832bfb

Update model card with revised readme

Files changed (1)

README.md CHANGED

@@ -9,6 +9,7 @@ This repository contains a transformer model that can predict both addition and
 - **Attention Heads**: 4
 - **Training steps**: 85,000
 - **Random seed**: 572091
 **Contents**:
 - `model.pth`: The trained transformer model.
 - `training_loss.json`: Data gathered during model training (used to plot "loss over training batches").
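
The hyperparameters listed above also appear in the repository name (`h4`, `t85K`, `s572091`). As a minimal sketch, assuming the name follows a `<letter><number>[K]` token convention, the fields can be recovered programmatically. The helper `parse_model_name` is hypothetical and not part of this repository; the meaning of the `d` and `l` tokens is not stated in this hunk, so the code returns them without interpreting them.

```python
import re


def parse_model_name(name: str) -> dict:
    """Extract <letter><number>[K] tokens from a repository name.

    Hypothetical helper: the naming convention is inferred from the
    values listed in the model card, not documented by the author.
    """
    fields = {}
    for token in name.split("_"):
        match = re.fullmatch(r"([a-z])(\d+)(K?)", token)
        if match is None:
            continue  # skip tokens such as "QuantaMaths" and "mix"
        key, digits, kilo = match.groups()
        # A trailing "K" is assumed to mean thousands (t85K -> 85000).
        fields[key] = int(digits) * (1000 if kilo else 1)
    return fields


config = parse_model_name("QuantaMaths_mix_d12_l3_h4_t85K_s572091")
print(config["h"], config["t"], config["s"])
```

Under these assumptions, `config["h"]`, `config["t"]`, and `config["s"]` correspond to the attention heads, training steps, and random seed given in the list.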
