Update README.md
README.md
CHANGED
@@ -71,8 +71,9 @@ curl http://127.0.0.1:9000/loadmodel -X POST -H "Content-Type: application/json"
 ## Training Infrastructure
 - Hardware: Single AMD MI300X GPU
 - Training Duration: 3 days(Stage1,2)
-- Stage1 180MToken
-- Stage2 160MToken (
+- Stage1 180MToken (LR1e-4)
+- Stage2 160MToken (Temperature 1.0KD LR5e-6)
+- Stage2.5 TBD (Temperature 2.0KD LR3e-5)
 - Stage3 1GToken(TBD)
 
 ## Acknowledgements