PursuitOfDataScience committed on
Commit
349443b
·
1 Parent(s): b66e34a

Add model card (README.md)

Files changed (1)
  1. README.md +54 -0
README.md ADDED
@@ -0,0 +1,54 @@
+ # Argonne-1.0 🌌
+
+ **Argonne-1.0** is a language model pretrained from scratch on the **Fineweb-Edu dataset**, designed for research, education, and exploration of language modeling.
+
+ ## 📚 Dataset
+
+ The model is pretrained on [**Fineweb-Edu (CC-MAIN-2024-10)**](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu), an open and comprehensive dataset consisting of educational web content.
+
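One way to inspect this snapshot is to stream it with the Hugging Face `datasets` library. The sketch below is not the author's preprocessing code. It assumes that `datasets` is installed, that the dump is exposed as a config named `CC-MAIN-2024-10`, and that each record stores the page text under a `text` field. The helper name `stream_fineweb_edu` is hypothetical.

```python
from itertools import islice

DATASET_REPO = "HuggingFaceFW/fineweb-edu"  # repository linked in the Dataset section
DUMP_NAME = "CC-MAIN-2024-10"               # assumed config name for this dump


def stream_fineweb_edu(limit=3):
    """Yield the first `limit` documents without downloading the full dump."""
    # Imported lazily so this module loads even where `datasets` is missing.
    from datasets import load_dataset

    ds = load_dataset(DATASET_REPO, name=DUMP_NAME, split="train", streaming=True)
    yield from islice(ds, limit)


# Example usage (requires network access):
# for doc in stream_fineweb_edu():
#     print(doc["text"][:200])  # "text" is the assumed field name
```

Streaming avoids materializing the whole dump on disk, which is usually enough for a quick look at the data.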
+ ## 🚀 Model Architecture
+
+ Argonne-1.0 is a GPT-like autoregressive transformer model with approximately **276M parameters**:
+
+ | Hyperparameter | Value |
+ |----------------|-------|
+ | `n_layer` | 12 |
+ | `n_head` | 12 |
+ | `n_embd` | 1296 |
+ | `block_size` | 2048 |
+ | `dropout` | 0.1 |
+ | **Parameters** | **275,827,680** |
+
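The parameter total can be sanity-checked from the hyperparameters in the table. The sketch below assumes a GPT-2-style layout: biases on all linear layers and LayerNorms, a fused QKV projection, a 4x MLP, and learned positional embeddings. The card does not state the vocabulary size or whether the output head shares weights with the token embedding, so both vocabulary sizes used here are hypothetical values chosen for illustration.

```python
def gpt_param_count(n_layer, n_embd, block_size, vocab_size, tie_weights=True):
    """Count parameters of a GPT-2-style decoder (assumed layout, not confirmed)."""
    d = n_embd
    # Per block: two LayerNorms (2d each), fused QKV (3d^2 + 3d),
    # attention output projection (d^2 + d), and a 4x MLP (8d^2 + 5d).
    per_block = 12 * d * d + 13 * d
    token_emb = vocab_size * d
    pos_emb = block_size * d  # learned positional embeddings
    final_ln = 2 * d
    # An untied output head adds a second vocab_size x d matrix (no bias).
    lm_head = 0 if tie_weights else vocab_size * d
    return n_layer * per_block + token_emb + pos_emb + final_ln + lm_head


# Hyperparameters from the table; both vocabulary sizes are hypothetical.
tied = gpt_param_count(12, 1296, 2048, vocab_size=24_000, tie_weights=True)
untied = gpt_param_count(12, 1296, 2048, vocab_size=12_000, tie_weights=False)
print(f"{tied:,} {untied:,}")
```

Under these assumptions both configurations come to 275,827,680, the figure in the table. That agreement is consistent with a GPT-2-style layout but does not confirm the actual vocabulary size or weight tying.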
+ ## 🛠️ Training Details
+
+ Training was conducted on a single NVIDIA DGX node:
+
+ - Hardware: **8× NVIDIA A100 GPUs (80 GB HBM each)**
+ - Training duration: **1440 GPU hours** (~180 hours wall-clock on 8 GPUs)
+ - Total epochs: **5 epochs**
+
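The wall-clock figure follows directly from the GPU-hour total. This is a minimal check, assuming all 8 GPUs were in use for the whole run:

```python
gpu_hours = 1440  # total reported in the training details
num_gpus = 8      # one DGX node with 8 A100s

wall_clock_hours = gpu_hours / num_gpus
wall_clock_days = wall_clock_hours / 24
print(f"{wall_clock_hours:.0f} h wall-clock, about {wall_clock_days:.1f} days")
# prints "180 h wall-clock, about 7.5 days"
```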
+ ### 📈 Training Loss Curve
+
+ Here's the training loss progression:
+
+ ```markdown
+ ![Training Loss Curve](../plots/pretrain_loss_20250303.png)
+ ```
+
+ ### 📝 Example Outputs
+ Below are generated examples illustrating Argonne-1.0's style and capability when prompted:
+
+ Prompt: "The meaning of life is..."
+
+ ```
+ The meaning of life is tantamount to an inescapable reality. It can be seen as an inescapable reality where life is lived in a vacuum, or a mere absence of life. Life can be considered as the ultimate reality, where life is no more, where life has no purpose, and life has no meaning.
+ Life is a form of art, rather than a mere collection or an endless expanse. It is a realm where art, music, philosophy, philosophy, and science come together to create something new, beautiful, and meaningful. It is the boundlessness of existence that creates the essence of art, music, philosophy and science.
+ So, what does a life mean? It means something
+ ```
+
+ Prompt: "In the future, artificial intelligence will..."
+ ```
+ In the future, artificial intelligence will tame the need for new ways to understand and control our lives. AI is already being used to do tasks that previously took human intelligence. But is it possible to predict what will come in the future, what will happen in the future, and how much will we be willing to pay for AI?
+ Evolutionary scientists have been developing new technologies that can be used to create artificial intelligence. For example, AI algorithms can be used to detect objects in a scene. These algorithms have been used in the design and manufacturing of many different products.
+ Similarly, AI algorithms can be used to predict the future by analyzing historical data and patterns in it. This information can be used to predict the future and make predictions accordingly.
+ ```