PursuitOfDataScience committed on
Commit 5e191d2 · 1 Parent(s): 349443b

modified model card

Files changed (1)
  1. README.md +2 -3
README.md CHANGED
@@ -25,15 +25,14 @@ Training was conducted on a single NVIDIA DGX node:
 
  - Hardware: **8× NVIDIA A100 GPUs (80GB HBM each)**
  - Training duration: **1440 GPU hours** (~180 hours wall-clock on 8 GPUs)
- - Total epochs: **5 epochs**
+ - Total steps: **160,000 global steps**
 
  ### 📈 Training Loss Curve
 
  Here's the training loss progression:
 
- ```markdown
  ![Training Loss Curve](../plots/pretrain_loss_20250303.png)
- ```
+
 
  ### 📝 Example Outputs
  Below are generated examples illustrating Argonne-1.0's style and capability when prompted: