I'm getting the same errors as the person above on the demo site. It must be a bug, as I tried several different prompts and had to wait ~1 hr for each one due to the queue:
Error in generating model output:
litellm.ContextWindowExceededError: litellm.BadRequestError: ContextWindowExceededError: OpenAIException - Error code: 400 - {'error': {'message': "This model's maximum context length is 128000 tokens. However, your messages resulted in 709582 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
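For anyone running the agent locally rather than on the demo, the usual workaround for this class of error is to trim the running message history before each model call. Below is a minimal sketch, not the demo's actual code: the model name, token budget, and the `trim_history` helper are all assumptions for illustration, but `litellm.token_counter` and `litellm.completion` are real litellm utilities.

```python
# Hypothetical sketch: cap the agent's message history so the prompt stays
# under the model's context window before calling litellm.completion.
import litellm

MODEL = "gpt-4o"            # assumed model name, not necessarily the demo's
MAX_CONTEXT_TOKENS = 128_000  # limit reported in the error above
RESPONSE_BUDGET = 4_000       # leave room for the model's reply

def trim_history(messages):
    """Drop the oldest non-system messages until the prompt fits the window."""
    trimmed = list(messages)
    while (
        litellm.token_counter(model=MODEL, messages=trimmed)
        > MAX_CONTEXT_TOKENS - RESPONSE_BUDGET
        and len(trimmed) > 1
    ):
        # Keep the system prompt at index 0; drop the oldest turn after it.
        del trimmed[1]
    return trimmed

# Example usage with a toy conversation standing in for the agent's history.
conversation = [
    {"role": "system", "content": "You are a research agent."},
    {"role": "user", "content": "Summarize the latest findings on topic X."},
]
response = litellm.completion(model=MODEL, messages=trim_history(conversation))
print(response.choices[0].message.content)
```

Whether the demo does something like this internally I can't say; the point is just that the 709k-token prompt in the error suggests the agent's accumulated context isn't being truncated before the call.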