
inference speed is considerably slow

#11
by sonald - opened

Compared to other 13B models, this one is quite slow. Any ideas why?
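To make the comparison concrete, it helps to report throughput in tokens per second rather than "slow". A minimal sketch of such a measurement (the `fake_generate` stub below is a placeholder standing in for a real `model.generate` call, so the numbers are illustrative only):

```python
import time

def tokens_per_second(generate, prompt_tokens, max_new_tokens):
    """Time a generate callable and report decode throughput in tokens/s."""
    start = time.perf_counter()
    output_tokens = generate(prompt_tokens, max_new_tokens)
    elapsed = time.perf_counter() - start
    new_tokens = len(output_tokens) - len(prompt_tokens)
    return new_tokens / elapsed

# Stub standing in for model.generate; a real benchmark would call the model
# and pass the tokenized prompt instead of this fake token list.
def fake_generate(prompt_tokens, max_new_tokens):
    time.sleep(0.01 * max_new_tokens)  # pretend each new token takes ~10 ms
    return prompt_tokens + [0] * max_new_tokens

rate = tokens_per_second(fake_generate, prompt_tokens=[1, 2, 3], max_new_tokens=20)
print(f"{rate:.1f} tokens/s")
```

Running the same measurement against this model and a reference 13B checkpoint, with identical prompt length, `max_new_tokens`, dtype, and hardware, would show whether the gap is real or a configuration difference.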
