Output is merely a copy of the input for 70B in the webui

#13
by wholehope - opened

Can anybody enlighten me on how to run inference with the 70B-GPTQ model (chat or non-chat) using oobabooga/text-generation-webui? Whether I use the Llama-v2 instruct prompt mentioned on the model card or just a plain prompt, the output is always an exact copy of the input. In the same webui, I can run inference with the 13B/7B-GPTQ models (chat or non-chat) without any problem.
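
For comparison, here is a minimal sketch of loading the 70B GPTQ checkpoint directly with AutoGPTQ, outside the webui, to check whether the model itself generates correctly. The repo id is an assumption (adjust to the actual repo), and `inject_fused_attention=False` reflects a commonly reported workaround: the 70B model uses grouped-query attention, which older AutoGPTQ fused-attention kernels reportedly mishandle, producing empty or echoed output.

```python
# Minimal sketch: load a Llama-2 70B GPTQ checkpoint with AutoGPTQ.
# The repo id below is an assumption for illustration; adjust as needed.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "TheBloke/Llama-2-70B-GPTQ"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    device_map="auto",             # spread layers across available GPUs
    use_safetensors=True,
    inject_fused_attention=False,  # reported workaround: 70B's grouped-query
                                   # attention breaks older fused kernels
)

# Llama-2 instruct-style prompt, per the format noted on the model card.
prompt = "[INST] Write a haiku about quantization. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If this script produces real completions but the webui still echoes the input, the problem is likely in the webui's loader settings rather than the quantized weights.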

I also have issues in textgen webui: no tokens are generated, but only in the chat interface; the other interface works.
