Spacing in Colab Notebook Causing Generation Issues

#3
by chris-santiago - opened

I tried running the notebook as-is in Colab. However, the cells running the dummy agent returned gibberish that was unrelated to the weather. I introduced a line break at the beginning of this prompt:

# Since we are running the "text_generation", we need to add the right special tokens.
prompt=f"""
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{SYSTEM_PROMPT}
<|eot_id|><|start_header_id|>user<|end_header_id|>
What's the weather in London ?
<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""

and a leading space before the observation in this prompt:

# Let's concatenate the base prompt, the completion until function execution and the result of the function as an Observation
new_prompt=prompt+output+f" {get_weather('London')}"
print(new_prompt)

After these changes, the output in the Colab environment matched the course content.
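
For anyone comparing against their own run, this is roughly how that prompt gets consumed in the notebook (a minimal sketch from memory, not the exact course cell; `prompt` is the f-string from the cell above, and depending on your huggingface_hub version the stop argument may be `stop` or `stop_sequences`):

from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Llama-3.2-3B-Instruct")

# Generate the agent's Thought/Action turn, stopping before the model invents
# an Observation, so the real get_weather() result can be appended instead.
output = client.text_generation(prompt, max_new_tokens=200, stop=["Observation:"])
print(output)

With the missing line break and space, the special tokens and the observation run together, which seems to be what throws the smaller model off.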

Oh man, I was completely confused and couldn't figure out what was different or wrong when running this notebook on my end.

Thanks for pointing it out!

Adding line breaks and whitespaces does solve the problem.

It's very strange, though, that without the line breaks it returns total gibberish.

P.S. On the course Discord channel, people suggested that "Meta-Llama-3-8B-Instruct" works great without making any changes to the notebook. I switched to it and haven't run into any more issues.

Yeah some LLMs (or their earlier versions) are extremely sensitive to spacing/breaks.

Confirming that changing the second cell to use the above-mentioned model resolves the issues with no further changes:

client = InferenceClient("meta-llama/Meta-Llama-3-8B-Instruct")
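
For reference, this is the quick sanity check I ran after swapping the model (same `prompt` cell as above; the generation arguments are just what I happened to use):

from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Meta-Llama-3-8B-Instruct")

# With this model, the original unmodified prompt already produces the expected
# weather tool-call output instead of gibberish.
output = client.text_generation(prompt, max_new_tokens=200)
print(output)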

Hugging Face Agents Course org

While there is indeed a small spacing issue, the bigger problem is that the model behind the API for llama3.2-3b-instruct is down.

I replaced it with an inference endpoint for the time being.
client = InferenceClient("https://jc26mwg228mkj8dw.us-east-1.aws.endpoints.huggingface.cloud")
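
In case it's not obvious: the endpoint URL goes in exactly where the model id used to, so the rest of the notebook stays unchanged. Rough sketch, reusing the `prompt` from the earlier cell:

from huggingface_hub import InferenceClient

# A dedicated Inference Endpoint URL can be passed in place of a model id;
# the text_generation calls downstream work the same way.
client = InferenceClient("https://jc26mwg228mkj8dw.us-east-1.aws.endpoints.huggingface.cloud")

output = client.text_generation(prompt, max_new_tokens=200)
print(output)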

@Jofthomas Not doubting that was an issue, but if the model were down, wouldn't we get an error rather than gibberish or unexpected output? Perhaps a different model was serving the URL we had previously set for the InferenceClient while the main one was down?
