torch.triu(input, diagonal=0, out=None)

This article describes PyTorch's triu function, which extracts the upper triangular part of an input matrix and sets the remaining elements to zero. Examples show how the diagonal parameter controls exactly which part of the upper triangle is kept.


torch.triu(input, diagonal=0, out=None) → Tensor
Returns the upper triangular part of the matrix input; the remaining elements are set to 0.
Parameters:

  • input (Tensor) – the input tensor
  • diagonal (int, optional) – the diagonal to consider
  • out (Tensor, optional) – the output tensor
The diagonal argument controls where the boundary between kept and zeroed elements runs:

  • If diagonal is omitted (diagonal=0), the elements on and above the main diagonal are kept.
  • If diagonal is a positive integer n, the main diagonal and the n-1 diagonals directly above it are zeroed out as well; only the elements on and above the n-th diagonal above the main one are kept.
  • If diagonal is a negative integer -n, the n diagonals directly below the main diagonal are kept in addition to the main diagonal and everything above it.
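Equivalently, output[i][j] = input[i][j] when j - i >= diagonal, and 0 otherwise. A minimal sketch of that rule (the helper name manual_triu is purely illustrative, not part of the PyTorch API) that should reproduce torch.triu element by element:

>>> import torch
>>> def manual_triu(x, diagonal=0):
...     # keep element (i, j) only when j - i >= diagonal, zero out the rest
...     rows = torch.arange(x.size(-2)).unsqueeze(-1)   # column vector of row indices
...     cols = torch.arange(x.size(-1))                  # row vector of column indices
...     return torch.where(cols - rows >= diagonal, x, torch.zeros_like(x))
...
>>> x = torch.randn(4, 4)
>>> all(torch.equal(manual_triu(x, d), torch.triu(x, d)) for d in (-1, 0, 1, 2))
True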
>>> a = torch.randn(3, 3)
>>> a
tensor([[ 0.2309,  0.5207,  2.0049],
        [ 0.2072, -1.0680,  0.6602],
        [ 0.3480, -0.5211, -0.4573]])
>>> torch.triu(a)
tensor([[ 0.2309,  0.5207,  2.0049],
        [ 0.0000, -1.0680,  0.6602],
        [ 0.0000,  0.0000, -0.4573]])
>>> torch.triu(a, diagonal=1)
tensor([[ 0.0000,  0.5207,  2.0049],
        [ 0.0000,  0.0000,  0.6602],
        [ 0.0000,  0.0000,  0.0000]])
>>> torch.triu(a, diagonal=-1)
tensor([[ 0.2309,  0.5207,  2.0049],
        [ 0.2072, -1.0680,  0.6602],
        [ 0.0000, -0.5211, -0.4573]])