nvidia/Hymba-1.5B-Base

pmolchanov committed on Nov 23, 2024

Commit 7115e65 · verified · 1 Parent(s): 6015b31

Update README.md

Files changed (1)

README.md +1 -1

@@ -14,7 +14,7 @@ The model has hybrid architecture with Mamba and Attention heads running in para
 This model is ready for commercial use.
-**[Model Weights Coming Soon]**
+**[Model Weights Coming Soon, expected Nov 25th]**
 **[Caution] During generation, the batch size needs to be 1. Our current implementation does not fully support padding of Meta tokens + SWA; this is a work in progress. Training and pre-filling support any batch size.**
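
The caution in the README says generation currently needs a batch size of 1. One way a caller might respect that is to loop over prompts and issue one generation call per prompt. The sketch below is only an illustration of that pattern: `generate_sequentially` and `generate_one` are hypothetical names that do not come from the repository, and `generate_one` stands in for whatever function wraps the real model call.

```python
from typing import Callable, Iterable, List


def generate_sequentially(
    prompts: Iterable[str],
    generate_one: Callable[[List[str]], List[str]],
) -> List[str]:
    """Run generation one prompt at a time, so every call sees batch size 1.

    `generate_one` is a hypothetical stand-in for the real model call. It
    receives a list holding exactly one prompt and must return a list
    holding exactly one completion.
    """
    outputs: List[str] = []
    for prompt in prompts:
        batch = [prompt]  # always a single item, so no padding is involved
        result = generate_one(batch)
        if len(result) != 1:
            raise ValueError(f"expected 1 completion, got {len(result)}")
        outputs.append(result[0])
    return outputs


if __name__ == "__main__":
    # Dummy generator used only to show the call pattern; not a real model.
    def echo_upper(batch: List[str]) -> List[str]:
        return [batch[0].upper()]

    print(generate_sequentially(["hello", "meta tokens"], echo_upper))
```

Because the README states that training and pre-filling support any batch size, a loop like this would only be needed around the generation step.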