tfa_output_2025_m02_d07_t07h_45m_38s

This model is a fine-tuned version of Qwen/Qwen2.5-0.5B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1965
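
For context, a cross-entropy loss of 1.1965 corresponds to a perplexity of roughly exp(1.1965) ≈ 3.31; the quick check below (not part of the original card) makes the conversion explicit.

```python
import math

eval_loss = 1.1965                        # final validation loss reported above
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")   # ≈ 3.31
```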

Model description

More information needed

Intended uses & limitations

More information needed
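
Pending that documentation, a minimal inference sketch is given below. It assumes the checkpoint is published under the repo id brando/tfa_output_2025_m02_d07_t07h_45m_38s (taken from this card's title) and that the standard transformers causal-LM APIs apply; treat it as an illustration rather than an endorsed usage pattern.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from this card's title; adjust if the checkpoint lives elsewhere.
model_id = "brando/tfa_output_2025_m02_d07_t07h_45m_38s"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```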

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-06
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: PAGED_ADAMW with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
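
As referenced above, the sketch below shows one way these settings map onto a transformers TrainingArguments object. The output directory is a placeholder, the exact paged-AdamW variant and the per-step evaluation/logging cadence are inferred rather than stated on the card (the latter from the results table below), and bf16 is assumed from the checkpoint's reported tensor type.

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments configuration matching the hyperparameters above.
# Effective train batch size: 2 (per device) x 4 (gradient accumulation) = 8.
training_args = TrainingArguments(
    output_dir="tfa_output",           # placeholder
    learning_rate=1e-6,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    seed=42,
    optim="paged_adamw_32bit",         # PAGED_ADAMW; 32-bit variant assumed
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    eval_strategy="steps",             # per-step evaluation, inferred from the results table
    eval_steps=1,
    logging_steps=1,
    bf16=True,                         # assumed from the checkpoint's BF16 tensor type
)
```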

Training results

Training Loss Epoch Step Validation Loss
No log 0 0 1.3072
4.0374 0.0030 1 1.3072
3.7206 0.0060 2 1.3073
3.6304 0.0090 3 1.3075
3.9205 0.0119 4 1.3072
3.8914 0.0149 5 1.3074
3.7914 0.0179 6 1.3071
3.6888 0.0209 7 1.3073
3.9573 0.0239 8 1.3068
3.9216 0.0269 9 1.3068
3.8295 0.0299 10 1.3065
3.7775 0.0328 11 1.3061
4.1608 0.0358 12 1.3056
3.8315 0.0388 13 1.3047
3.8961 0.0418 14 1.3041
3.5064 0.0448 15 1.3034
3.8623 0.0478 16 1.3029
3.7882 0.0507 17 1.3022
3.9906 0.0537 18 1.3004
3.8721 0.0567 19 1.2990
3.8715 0.0597 20 1.2982
3.7604 0.0627 21 1.2970
3.7259 0.0657 22 1.2961
3.769 0.0687 23 1.2948
3.9388 0.0716 24 1.2935
3.8075 0.0746 25 1.2921
3.9561 0.0776 26 1.2913
3.8021 0.0806 27 1.2904
3.658 0.0836 28 1.2892
3.7469 0.0866 29 1.2883
3.9257 0.0896 30 1.2873
3.7349 0.0925 31 1.2864
3.7425 0.0955 32 1.2854
3.3611 0.0985 33 1.2843
3.5493 0.1015 34 1.2833
3.596 0.1045 35 1.2828
3.8972 0.1075 36 1.2816
3.9669 0.1104 37 1.2808
4.0005 0.1134 38 1.2800
3.6185 0.1164 39 1.2785
4.0852 0.1194 40 1.2776
3.452 0.1224 41 1.2769
4.017 0.1254 42 1.2761
3.8406 0.1284 43 1.2753
3.5601 0.1313 44 1.2746
3.7764 0.1343 45 1.2731
3.8586 0.1373 46 1.2722
3.2432 0.1403 47 1.2715
3.5002 0.1433 48 1.2706
3.4933 0.1463 49 1.2701
3.5798 0.1493 50 1.2688
3.6943 0.1522 51 1.2681
3.3713 0.1552 52 1.2675
3.6274 0.1582 53 1.2666
3.5537 0.1612 54 1.2653
3.5242 0.1642 55 1.2646
3.6243 0.1672 56 1.2637
3.4449 0.1701 57 1.2630
3.6649 0.1731 58 1.2621
3.643 0.1761 59 1.2615
3.5039 0.1791 60 1.2607
3.7135 0.1821 61 1.2595
3.8718 0.1851 62 1.2587
3.4509 0.1881 63 1.2581
3.7153 0.1910 64 1.2569
3.6279 0.1940 65 1.2564
3.1831 0.1970 66 1.2557
3.6647 0.2 67 1.2548
3.8362 0.2030 68 1.2543
3.7985 0.2060 69 1.2533
3.6422 0.2090 70 1.2525
3.4649 0.2119 71 1.2522
3.645 0.2149 72 1.2518
3.6387 0.2179 73 1.2508
3.6069 0.2209 74 1.2500
3.297 0.2239 75 1.2492
3.2928 0.2269 76 1.2487
3.4727 0.2299 77 1.2482
3.2704 0.2328 78 1.2473
3.4458 0.2358 79 1.2467
3.492 0.2388 80 1.2458
3.5288 0.2418 81 1.2453
3.6266 0.2448 82 1.2447
3.6181 0.2478 83 1.2442
3.4847 0.2507 84 1.2434
3.7349 0.2537 85 1.2431
3.7247 0.2567 86 1.2424
3.3359 0.2597 87 1.2419
3.3628 0.2627 88 1.2408
3.6579 0.2657 89 1.2408
3.601 0.2687 90 1.2406
3.1941 0.2716 91 1.2399
3.5671 0.2746 92 1.2395
3.7115 0.2776 93 1.2385
3.532 0.2806 94 1.2381
3.5191 0.2836 95 1.2374
3.6731 0.2866 96 1.2371
3.7962 0.2896 97 1.2367
3.7644 0.2925 98 1.2361
3.4904 0.2955 99 1.2356
3.5935 0.2985 100 1.2354
3.577 0.3015 101 1.2346
3.693 0.3045 102 1.2344
3.6223 0.3075 103 1.2340
3.4485 0.3104 104 1.2334
3.4748 0.3134 105 1.2329
3.4188 0.3164 106 1.2324
3.5004 0.3194 107 1.2321
3.6142 0.3224 108 1.2317
3.3817 0.3254 109 1.2313
3.5531 0.3284 110 1.2311
3.1087 0.3313 111 1.2306
3.3683 0.3343 112 1.2304
3.8721 0.3373 113 1.2298
3.7024 0.3403 114 1.2293
3.5345 0.3433 115 1.2289
3.573 0.3463 116 1.2287
3.5846 0.3493 117 1.2284
3.5404 0.3522 118 1.2282
3.5606 0.3552 119 1.2280
3.5055 0.3582 120 1.2274
3.4956 0.3612 121 1.2270
3.608 0.3642 122 1.2268
3.2361 0.3672 123 1.2265
3.488 0.3701 124 1.2265
3.2155 0.3731 125 1.2260
3.3639 0.3761 126 1.2256
3.3634 0.3791 127 1.2256
3.4846 0.3821 128 1.2254
3.5508 0.3851 129 1.2252
3.5719 0.3881 130 1.2246
3.3178 0.3910 131 1.2247
3.4262 0.3940 132 1.2243
3.4895 0.3970 133 1.2239
3.5872 0.4 134 1.2234
3.4766 0.4030 135 1.2233
3.3568 0.4060 136 1.2230
3.5747 0.4090 137 1.2230
2.9403 0.4119 138 1.2226
3.3463 0.4149 139 1.2223
3.3048 0.4179 140 1.2217
3.2987 0.4209 141 1.2218
3.4811 0.4239 142 1.2216
3.64 0.4269 143 1.2211
3.1703 0.4299 144 1.2206
3.2906 0.4328 145 1.2205
3.6134 0.4358 146 1.2199
3.348 0.4388 147 1.2197
3.2428 0.4418 148 1.2198
3.573 0.4448 149 1.2197
3.6921 0.4478 150 1.2191
3.5082 0.4507 151 1.2191
3.3445 0.4537 152 1.2189
3.3521 0.4567 153 1.2187
3.4538 0.4597 154 1.2183
3.3225 0.4627 155 1.2179
3.6838 0.4657 156 1.2179
3.3113 0.4687 157 1.2177
3.4141 0.4716 158 1.2175
2.8407 0.4746 159 1.2173
3.3262 0.4776 160 1.2168
3.6585 0.4806 161 1.2167
3.4489 0.4836 162 1.2166
3.6269 0.4866 163 1.2162
3.3826 0.4896 164 1.2160
3.4506 0.4925 165 1.2159
3.2749 0.4955 166 1.2151
3.7262 0.4985 167 1.2151
3.4615 0.5015 168 1.2147
3.5982 0.5045 169 1.2149
3.5301 0.5075 170 1.2142
3.1629 0.5104 171 1.2145
3.3415 0.5134 172 1.2140
3.2653 0.5164 173 1.2139
3.2757 0.5194 174 1.2137
3.3495 0.5224 175 1.2136
3.4542 0.5254 176 1.2135
3.5153 0.5284 177 1.2130
3.2836 0.5313 178 1.2126
3.2877 0.5343 179 1.2123
3.4662 0.5373 180 1.2125
3.0825 0.5403 181 1.2123
3.381 0.5433 182 1.2121
3.3843 0.5463 183 1.2118
3.0211 0.5493 184 1.2116
3.2045 0.5522 185 1.2113
3.515 0.5552 186 1.2111
3.3176 0.5582 187 1.2113
3.5145 0.5612 188 1.2109
3.135 0.5642 189 1.2106
3.5442 0.5672 190 1.2107
3.3991 0.5701 191 1.2106
3.2577 0.5731 192 1.2103
3.286 0.5761 193 1.2102
3.4492 0.5791 194 1.2097
3.7012 0.5821 195 1.2098
3.2023 0.5851 196 1.2097
3.207 0.5881 197 1.2097
3.3281 0.5910 198 1.2091
3.3071 0.5940 199 1.2091
3.39 0.5970 200 1.2091
3.2437 0.6 201 1.2090
3.2771 0.6030 202 1.2089
3.4758 0.6060 203 1.2085
3.2785 0.6090 204 1.2082
3.524 0.6119 205 1.2084
3.3163 0.6149 206 1.2082
3.3725 0.6179 207 1.2078
3.4803 0.6209 208 1.2078
3.1456 0.6239 209 1.2080
3.2719 0.6269 210 1.2079
3.4017 0.6299 211 1.2074
3.3843 0.6328 212 1.2073
3.5353 0.6358 213 1.2070
3.2673 0.6388 214 1.2072
3.2824 0.6418 215 1.2070
3.3562 0.6448 216 1.2069
3.1972 0.6478 217 1.2069
3.3425 0.6507 218 1.2067
3.1529 0.6537 219 1.2066
3.0392 0.6567 220 1.2062
2.9891 0.6597 221 1.2063
3.451 0.6627 222 1.2059
3.19 0.6657 223 1.2061
3.2682 0.6687 224 1.2057
3.4182 0.6716 225 1.2055
3.4638 0.6746 226 1.2054
3.2297 0.6776 227 1.2053
3.3207 0.6806 228 1.2051
3.2218 0.6836 229 1.2051
3.2048 0.6866 230 1.2049
3.1883 0.6896 231 1.2050
3.5168 0.6925 232 1.2048
3.4838 0.6955 233 1.2047
3.148 0.6985 234 1.2050
3.35 0.7015 235 1.2045
3.156 0.7045 236 1.2043
3.2269 0.7075 237 1.2045
3.1877 0.7104 238 1.2039
3.4361 0.7134 239 1.2041
3.2583 0.7164 240 1.2039
3.3384 0.7194 241 1.2037
3.235 0.7224 242 1.2038
3.2664 0.7254 243 1.2037
3.1289 0.7284 244 1.2033
3.4714 0.7313 245 1.2035
3.4815 0.7343 246 1.2035
3.5291 0.7373 247 1.2033
3.2485 0.7403 248 1.2032
3.3338 0.7433 249 1.2028
3.0806 0.7463 250 1.2030
3.3287 0.7493 251 1.2027
3.4036 0.7522 252 1.2025
3.2587 0.7552 253 1.2024
2.8553 0.7582 254 1.2025
3.3085 0.7612 255 1.2022
3.0679 0.7642 256 1.2023
3.2612 0.7672 257 1.2022
3.5312 0.7701 258 1.2021
3.2655 0.7731 259 1.2018
3.054 0.7761 260 1.2018
3.306 0.7791 261 1.2017
3.2836 0.7821 262 1.2015
2.992 0.7851 263 1.2014
3.2031 0.7881 264 1.2013
3.4771 0.7910 265 1.2013
3.2433 0.7940 266 1.2011
3.0587 0.7970 267 1.2012
3.0495 0.8 268 1.2012
3.3093 0.8030 269 1.2008
3.3915 0.8060 270 1.2010
3.345 0.8090 271 1.2008
3.1808 0.8119 272 1.2006
3.209 0.8149 273 1.2007
3.0482 0.8179 274 1.2004
2.9668 0.8209 275 1.2006
3.0532 0.8239 276 1.2008
3.3684 0.8269 277 1.2004
3.2661 0.8299 278 1.2003
3.0364 0.8328 279 1.2003
3.4541 0.8358 280 1.2002
3.3404 0.8388 281 1.2001
3.4284 0.8418 282 1.1999
3.4654 0.8448 283 1.1997
3.0229 0.8478 284 1.2000
3.2988 0.8507 285 1.1999
3.3894 0.8537 286 1.1995
3.2594 0.8567 287 1.1995
3.2245 0.8597 288 1.1995
3.0186 0.8627 289 1.1991
3.0315 0.8657 290 1.1990
2.8311 0.8687 291 1.1990
3.1816 0.8716 292 1.1988
3.3245 0.8746 293 1.1987
3.434 0.8776 294 1.1988
3.09 0.8806 295 1.1986
3.4151 0.8836 296 1.1983
3.2193 0.8866 297 1.1984
3.3492 0.8896 298 1.1985
3.1033 0.8925 299 1.1983
3.3869 0.8955 300 1.1984
3.2651 0.8985 301 1.1981
3.3921 0.9015 302 1.1981
3.3988 0.9045 303 1.1981
3.3168 0.9075 304 1.1983
3.173 0.9104 305 1.1980
3.1127 0.9134 306 1.1981
2.9588 0.9164 307 1.1979
3.1549 0.9194 308 1.1978
3.1344 0.9224 309 1.1978
3.0571 0.9254 310 1.1977
3.299 0.9284 311 1.1976
3.3548 0.9313 312 1.1975
3.1675 0.9343 313 1.1975
3.2788 0.9373 314 1.1972
3.1682 0.9403 315 1.1975
3.6201 0.9433 316 1.1973
2.9316 0.9463 317 1.1973
3.2871 0.9493 318 1.1970
3.0758 0.9522 319 1.1973
3.0506 0.9552 320 1.1972
3.0749 0.9582 321 1.1971
3.0325 0.9612 322 1.1969
3.0342 0.9642 323 1.1969
3.2879 0.9672 324 1.1969
2.9969 0.9701 325 1.1967
3.4164 0.9731 326 1.1967
3.3712 0.9761 327 1.1966
3.1877 0.9791 328 1.1967
3.3259 0.9821 329 1.1968
3.2571 0.9851 330 1.1965
3.0292 0.9881 331 1.1968
3.2312 0.9910 332 1.1966
3.3412 0.9940 333 1.1965
3.2653 0.9970 334 1.1966
3.3735 1.0 335 1.1965
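
For reference, the validation-loss curve tabulated above can be re-plotted from the Trainer's trainer_state.json log, assuming that file was saved or pushed alongside the checkpoint (this card does not confirm that). A minimal sketch:

```python
import json

import matplotlib.pyplot as plt

# Assumes trainer_state.json (the standard Hugging Face Trainer log) is available locally.
with open("trainer_state.json") as f:
    state = json.load(f)

# log_history interleaves training-loss and eval-loss entries; keep only the eval ones.
eval_points = [(e["step"], e["eval_loss"]) for e in state["log_history"] if "eval_loss" in e]
steps, losses = zip(*eval_points)

plt.plot(steps, losses)
plt.xlabel("step")
plt.ylabel("validation loss")
plt.title("Validation loss over one epoch")
plt.show()
```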

Framework versions

  • Transformers 4.48.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
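
To check that a local environment matches these versions before attempting to reproduce the results, a small verification snippet (a convenience sketch, not part of the original card):

```python
# Compare installed library versions against the ones listed above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.48.0",
    "torch": "2.5.1+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "OK" if have == want else f"differs (card reports {want})"
    print(f"{name}: {have} -> {status}")
```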

The published checkpoint is stored in Safetensors format with roughly 494M parameters in BF16.