Update chatNT.py
chatNT.py CHANGED
@@ -1324,6 +1324,9 @@ class MultiHeadAttention(nn.Module):
         attention_weights = attention_weights / sqrt_key_size
         if attention_mask is not None:
             attention_weights = torch.where(attention_mask, attention_weights, -1e30)
+
+        attention_weights = attention_weights.to(value_heads.dtype)
+
         if attention_weight_bias is not None:
             attention_weights = F.softmax(
                 attention_weights + attention_weight_bias, dim=-1
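The inserted line casts `attention_weights` to the dtype of `value_heads` before the softmax and the weighted sum. A plausible motivation, sketched below under assumed mixed-precision settings (the tensor names and shapes are illustrative, not taken from chatNT.py): if the attention scores are accumulated in float32 while the values are bfloat16, the matmul between probabilities and values raises a dtype mismatch, and the cast resolves it.

import torch
import torch.nn.functional as F

# Minimal sketch of the mixed-precision issue the cast addresses; the
# tensor names and sizes below are illustrative assumptions, not chatNT.py's.
torch.manual_seed(0)
queries = torch.randn(1, 4, 8)                                  # scores path in fp32
keys = torch.randn(1, 4, 8)
value_heads = torch.randn(1, 4, 8, dtype=torch.bfloat16)        # values in bf16

sqrt_key_size = 8 ** 0.5
attention_weights = queries @ keys.transpose(-2, -1) / sqrt_key_size   # fp32 scores
attention_mask = torch.ones(1, 4, 4, dtype=torch.bool)
attention_weights = torch.where(attention_mask, attention_weights, -1e30)

# The commit's added line: match the value dtype before softmax / matmul.
attention_weights = attention_weights.to(value_heads.dtype)

probs = F.softmax(attention_weights, dim=-1)
out = probs @ value_heads   # without the cast, this matmul raises a dtype mismatch
print(out.dtype)            # torch.bfloat16

Note that -1e30 is still representable in bfloat16 (its range matches float32's exponent), so the mask fill survives the cast and softmax still zeroes the masked positions.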