willtensora committed on
Commit 595ba88 · verified · 1 Parent(s): 08424fe

End of training

Files changed (2):
  1. README.md +45 -45
  2. pytorch_model.bin +1 -1
README.md CHANGED
@@ -76,7 +76,7 @@ xformers_attention: true
 
 This model is a fine-tuned version of [fxmarty/small-llama-testing](https://huggingface.co/fxmarty/small-llama-testing) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 6.0837
+- Loss: 6.0841
 
 ## Model description
 
@@ -113,50 +113,50 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | No log | 0.0112 | 1 | 10.4228 |
-| 10.127 | 0.2247 | 20 | 9.8631 |
-| 9.0392 | 0.4494 | 40 | 8.7402 |
-| 8.1127 | 0.6742 | 60 | 7.9189 |
-| 7.5512 | 0.8989 | 80 | 7.4579 |
-| 7.2769 | 1.1236 | 100 | 7.2771 |
-| 7.1384 | 1.3483 | 120 | 7.1768 |
-| 7.0578 | 1.5730 | 140 | 7.0576 |
-| 6.9566 | 1.7978 | 160 | 6.9379 |
-| 6.8784 | 2.0225 | 180 | 6.8207 |
-| 6.7025 | 2.2472 | 200 | 6.7209 |
-| 6.5909 | 2.4719 | 220 | 6.6360 |
-| 6.4977 | 2.6966 | 240 | 6.5571 |
-| 6.445 | 2.9213 | 260 | 6.4721 |
-| 6.2635 | 3.1461 | 280 | 6.4128 |
-| 6.2356 | 3.3708 | 300 | 6.3661 |
-| 6.2742 | 3.5955 | 320 | 6.3160 |
-| 6.2487 | 3.8202 | 340 | 6.2879 |
-| 6.175 | 4.0449 | 360 | 6.2412 |
-| 6.051 | 4.2697 | 380 | 6.2180 |
-| 6.0159 | 4.4944 | 400 | 6.1959 |
-| 6.004 | 4.7191 | 420 | 6.1758 |
-| 6.0235 | 4.9438 | 440 | 6.1537 |
-| 5.9599 | 5.1685 | 460 | 6.1438 |
-| 6.0208 | 5.3933 | 480 | 6.1282 |
-| 5.9372 | 5.6180 | 500 | 6.1229 |
-| 5.8251 | 5.8427 | 520 | 6.1083 |
-| 5.875 | 6.0674 | 540 | 6.1066 |
-| 5.8374 | 6.2921 | 560 | 6.1013 |
-| 5.8515 | 6.5169 | 580 | 6.0970 |
-| 5.9295 | 6.7416 | 600 | 6.1001 |
-| 5.8929 | 6.9663 | 620 | 6.0978 |
-| 5.8917 | 7.1910 | 640 | 6.0958 |
-| 5.8256 | 7.4157 | 660 | 6.0939 |
-| 5.8201 | 7.6404 | 680 | 6.0887 |
-| 5.8619 | 7.8652 | 700 | 6.0871 |
-| 5.8934 | 8.0899 | 720 | 6.0946 |
-| 5.868 | 8.3146 | 740 | 6.0894 |
-| 5.9062 | 8.5393 | 760 | 6.0878 |
-| 5.8131 | 8.7640 | 780 | 6.0883 |
-| 5.7868 | 8.9888 | 800 | 6.0904 |
-| 5.8077 | 9.2135 | 820 | 6.0926 |
-| 5.8301 | 9.4382 | 840 | 6.0845 |
-| 5.8489 | 9.6629 | 860 | 6.0890 |
-| 5.8501 | 9.8876 | 880 | 6.0837 |
+| 10.127 | 0.2247 | 20 | 9.8630 |
+| 9.039 | 0.4494 | 40 | 8.7400 |
+| 8.1126 | 0.6742 | 60 | 7.9190 |
+| 7.5515 | 0.8989 | 80 | 7.4582 |
+| 7.2771 | 1.1236 | 100 | 7.2773 |
+| 7.1388 | 1.3483 | 120 | 7.1772 |
+| 7.0583 | 1.5730 | 140 | 7.0581 |
+| 6.957 | 1.7978 | 160 | 6.9383 |
+| 6.8789 | 2.0225 | 180 | 6.8210 |
+| 6.7029 | 2.2472 | 200 | 6.7213 |
+| 6.5913 | 2.4719 | 220 | 6.6364 |
+| 6.4981 | 2.6966 | 240 | 6.5572 |
+| 6.4453 | 2.9213 | 260 | 6.4724 |
+| 6.2642 | 3.1461 | 280 | 6.4135 |
+| 6.2365 | 3.3708 | 300 | 6.3668 |
+| 6.2746 | 3.5955 | 320 | 6.3166 |
+| 6.2488 | 3.8202 | 340 | 6.2882 |
+| 6.1749 | 4.0449 | 360 | 6.2413 |
+| 6.0514 | 4.2697 | 380 | 6.2183 |
+| 6.0162 | 4.4944 | 400 | 6.1961 |
+| 6.0043 | 4.7191 | 420 | 6.1763 |
+| 6.0239 | 4.9438 | 440 | 6.1544 |
+| 5.9605 | 5.1685 | 460 | 6.1444 |
+| 6.0214 | 5.3933 | 480 | 6.1288 |
+| 5.9376 | 5.6180 | 500 | 6.1234 |
+| 5.8258 | 5.8427 | 520 | 6.1087 |
+| 5.8755 | 6.0674 | 540 | 6.1070 |
+| 5.838 | 6.2921 | 560 | 6.1017 |
+| 5.852 | 6.5169 | 580 | 6.0974 |
+| 5.9298 | 6.7416 | 600 | 6.1005 |
+| 5.8933 | 6.9663 | 620 | 6.0981 |
+| 5.8921 | 7.1910 | 640 | 6.0962 |
+| 5.8261 | 7.4157 | 660 | 6.0942 |
+| 5.8207 | 7.6404 | 680 | 6.0891 |
+| 5.8623 | 7.8652 | 700 | 6.0875 |
+| 5.8938 | 8.0899 | 720 | 6.0950 |
+| 5.8684 | 8.3146 | 740 | 6.0897 |
+| 5.9065 | 8.5393 | 760 | 6.0881 |
+| 5.8135 | 8.7640 | 780 | 6.0886 |
+| 5.7873 | 8.9888 | 800 | 6.0908 |
+| 5.8081 | 9.2135 | 820 | 6.0929 |
+| 5.8307 | 9.4382 | 840 | 6.0849 |
+| 5.8492 | 9.6629 | 860 | 6.0893 |
+| 5.8504 | 9.8876 | 880 | 6.0841 |
 
 
 ### Framework versions
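The Epoch column in the training table is consistent with roughly 89 optimizer steps per epoch (e.g. step 20 → epoch 0.2247, step 880 → epoch 9.8876). A quick sanity check, assuming that step count (which is an inference from the logged values, not stated in the commit):

```python
# Reproduce the Epoch column of the training log, assuming 89 optimizer
# steps per epoch (inferred from the logged step/epoch pairs).
STEPS_PER_EPOCH = 89

def epoch_at(step: int) -> float:
    """Fractional epoch at a given optimizer step, rounded like the log."""
    return round(step / STEPS_PER_EPOCH, 4)

# Spot-check against (step, logged epoch) pairs taken from the table.
for step, logged in [(1, 0.0112), (20, 0.2247), (440, 4.9438), (880, 9.8876)]:
    assert epoch_at(step) == logged, (step, epoch_at(step), logged)
```

All 41 logged rows match under this assumption, which also implies the run covered ten epochs of a ~89-step dataloader.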
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:792a629f10a6767c348ead6e69da42089e762a50be74f5c86c6ca4960d976065
+oid sha256:e0ca7b52c986706bf756645d07c30afd308c503a6107eb9b2a833f383cecd660
 size 34219693
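The `pytorch_model.bin` diff does not change the weights file in-place; it rewrites a git-lfs pointer file, whose three `key value` lines identify the real blob by SHA-256 and byte size. A minimal sketch of parsing such a pointer, using the new pointer contents from this commit (the parser itself is an illustration, not part of the git-lfs tooling):

```python
# Minimal parser for a git-lfs pointer file (spec: git-lfs.github.com/spec/v1).
# POINTER holds the post-commit pointer contents shown in the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e0ca7b52c986706bf756645d07c30afd308c503a6107eb9b2a833f383cecd660
size 34219693
"""

def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line, then split the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "oid_algo": algo,          # hash algorithm, sha256 here
        "oid": digest,             # content address of the weights blob
        "size": int(fields["size"]),  # blob size in bytes
    }

ptr = parse_lfs_pointer(POINTER)
assert ptr["oid_algo"] == "sha256" and ptr["size"] == 34219693
```

Since the `size` line is identical before and after (34219693 bytes), only the oid changed: the checkpoint was overwritten with new weights of the same serialized size, as expected for an end-of-training save of the same architecture.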