"], return_tensors="pt").to("cuda")
# LLM + greedy decoding = repetitive, boring output
generated_ids = model.generate(**model_inputs)
tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
'I am a cat.