can you provide wikitext ppl and c4 ppl separately?
#11 opened 9 months ago by sheropen-2
Can you provide more details on the training?
1 reply · #10 opened 9 months ago by dequ777
Any plans to use MQA (multi-query attention) or GQA (grouped-query attention) in the future?
#9 opened 10 months ago by graefics
Efficient Inference Kernel Support for 1.58bit
#8 opened 10 months ago by LeiWang1999
This code from BitLinear doesn't make sense
1 reply · #7 opened 10 months ago by qmsoqm
Is it bitnet {-1,0,1}?
4 replies · #6 opened 10 months ago by Remek
ValueError: Tokenizer class BitnetTokenizer does not exist or is not currently imported.
4 replies · #5 opened 10 months ago by RZJournal
Longer inference time
2 replies · #4 opened 11 months ago by dittops
Why are these models fp32?
5 replies · #2 opened 11 months ago by supercharge19
Is there a chat/instruct model in plans?
2 replies · #1 opened 11 months ago by MrVodnik