Hi, I'm training the deepseek-ai/deepseek-vl2 model and found that the default top_k method is noaux_tc. However, line 468 in modeling_deepseek.py shows that noaux_tc is not supported for training. Why is that?