TypeError: forward() got an unexpected keyword argument 'num_logits_to_keep'
#51 opened 6 months ago
by
shajiu
Adding Evaluation Results
#50 opened 6 months ago
by
leaderboard-pr-bot
AttributeError: 'HybridMambaAttentionDynamicCache' object has no attribute '_modules'
7
#48 opened 6 months ago
by
xxrjun
Adding Evaluation Results
#47 opened 7 months ago
by
leaderboard-pr-bot
AI21 instance not runnable with LangChain
1
#45 opened 8 months ago
by
LordSahu
Is there any SFT or Chat model?
2
#41 opened 10 months ago
by
chuyi777
How to use accelerate to evaluate Jamba
#40 opened 10 months ago
by
Xidong
Jamba Evaluation Task on GSM8K
#39 opened 10 months ago
by
ssparks
Do you have plans to release papers on Jamba's architecture or miniature models?
#38 opened 10 months ago
by
badrabbitt
Are there any weight files for pre-trained models?
#37 opened 10 months ago
by
aidenxy
Memory usage on a single A100 80GB during training
#36 opened 10 months ago
by
DavidWu1116
Fast Mamba
5
#34 opened 10 months ago
by
Praneethkeerthi
Why does throughput increase with longer context window?
3
#33 opened 10 months ago
by
jingyu-q
Request: DOI
#32 opened 10 months ago
by
kozolex
GGUF quants?
1
#31 opened 10 months ago
by
6346y9uey
Any release plans for the 7b jamba model without MoE?
2
#30 opened 11 months ago
by
danielpark
Why is there an MLP in the Mamba Layer?
#28 opened 11 months ago
by
naston
Complex vs Real parametrization.
#27 opened 11 months ago
by
Yutida
How to fine-tune Jamba on Google Colab?
7
#26 opened 11 months ago
by
Ateeqq
Layer-Selective Rank Reduction
#25 opened 11 months ago
by
mizinovmv
Update README.md
#23 opened 11 months ago
by
rombodawg
Is there a chance Jamba will be trained with 1.58-bit weights?
1
#22 opened 11 months ago
by
shing3232
Anyone else currently experimenting with fine-tuning Jamba?
3
#21 opened 11 months ago
by
Severian
IndentationError: unindent does not match any outer indentation level
#19 opened 11 months ago
by
thebeline
ModuleNotFoundError: No module named 'transformers_modules.ai21labs.Jamba-v0'
5
#17 opened 11 months ago
by
hjewr
Fast Mamba kernels are not available
10
#16 opened 11 months ago
by
MohamedRashad
Do all safetensors files need to be downloaded to use this model on Colab?
2
#14 opened 11 months ago
by
Kv-boii
How many pretraining tokens?
#13 opened 11 months ago
by
CyberNative
Smaller version to ease implementation experiments?
7
#12 opened 11 months ago
by
compilade
Coding performance of base model?
4
#11 opened 11 months ago
by
rombodawg
Can you give a short explanation about the benefits and the architecture?
2
#7 opened 11 months ago
by
SicariusSicariiStuff
A Bang Up Job
2
#4 opened 11 months ago
by
nightvision04
Multiple GPUs?
3
#3 opened 11 months ago
by
bdambrosio
Just a solid congrats and thank you to your team
1
#1 opened 11 months ago
by
Severian