Patching HF bug that creates wrong cache length if only inputs_embeds is passed to the model

#19
by tomer-nv - opened
NVIDIA org
No description provided.
itlevy changed pull request status to merged