Potential Issue in MoA Routing Logic
#2, opened by lincorn
In the MoA routing logic of modeling_adaptor.py, routing indices are incorrectly overwritten:
topk_indices = cluster_indices # Correct indices captured here
topk_indices = torch.zeros_like(topk_indices, device=hidden_states.device) # Overwritten with zeros tensor
Here, topk_indices is first assigned the valid cluster indices and then immediately overwritten with a tensor of zeros, so the cluster assignment is discarded and every entry of the routing index becomes 0.
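To make the effect concrete, here is a minimal sketch that isolates the two lines. The function names, tensor shapes, and example values are hypothetical and are not taken from modeling_adaptor.py. I am also only assuming that the zeroing is unintended; it may be deliberate for a reason I can't see from the code.

```python
import torch


def route_as_posted(cluster_indices, hidden_states):
    # Mirrors the two lines quoted above (hypothetical wrapper function).
    topk_indices = cluster_indices
    topk_indices = torch.zeros_like(topk_indices, device=hidden_states.device)
    return topk_indices


def route_without_overwrite(cluster_indices, hidden_states):
    # Possible fix, assuming the overwrite is unintended:
    # keep the cluster indices and only move them to the right device.
    return cluster_indices.to(hidden_states.device)


# Toy inputs: 4 tokens, hidden size 8, one cluster index per token (assumed shapes).
hidden_states = torch.randn(4, 8)
cluster_indices = torch.tensor([[2], [0], [3], [1]])

posted = route_as_posted(cluster_indices, hidden_states)
kept = route_without_overwrite(cluster_indices, hidden_states)

print(posted.flatten().tolist())  # all zeros, whatever cluster_indices holds
print(kept.flatten().tolist())    # the original cluster indices
```

If the indices select which adapter handles each token, the posted version would always select index 0, but that depends on how topk_indices is used further down in the file.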