Why Memory Matters

What is memory? The general consensus is that memory is a collection of cognitive systems that allow us to store information for certain periods of time, so that we can learn from our past experiences and predict the future.

Memory impacts every facet of our lives. The first step to remembering things better is to understand how your memory works.

There are two basic kinds of memory: retrospective and prospective. Whereas retrospective memory is about remembering what happened in the past, prospective memory is about reminding yourself to do something in the future. Without prospective memory, you would not remember to go to work in the morning, and you would forget to set your alarm clock in the evening.

### Key Components and Concepts in Transformers Architecture

Transformers rely heavily on self-attention mechanisms, which constitute a critical part of their design[^1]. Self-attention allows a model to weigh the significance of each word in a sentence relative to every other word. This mechanism enables more effective processing of sequential data without the constraint of fixed-length context windows.

The architecture also incorporates multi-head attention layers, which let the model jointly attend to information from different representation subspaces at different positions. Each head learns distinct patterns, leading to richer representations overall.

Positional encodings are added to the input embeddings because the self-attention layer does not inherently capture positional relationships between tokens. These encodings supply the ordering information the network needs to understand where each element appears relative to the others.

Normalization techniques such as Layer Normalization also play an essential role: they stabilize training dynamics across multiple stacked transformer blocks, ensuring consistent performance throughout deep networks. Feed-forward networks follow the normalization steps, providing the non-linearity required for learning complex mappings between inputs and outputs.

```python
import torch.nn as nn


class TransformerBlock(nn.Module):
    def __init__(self, embed_size, heads, dropout, forward_expansion):
        super().__init__()
        # Multi-head self-attention; batch_first=True expects (batch, seq, embed) inputs.
        self.attention = nn.MultiheadAttention(embed_size, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(embed_size)
        self.norm2 = nn.LayerNorm(embed_size)
        # Position-wise feed-forward network with an expansion factor.
        self.feed_forward = nn.Sequential(
            nn.Linear(embed_size, forward_expansion * embed_size),
            nn.ReLU(),
            nn.Linear(forward_expansion * embed_size, embed_size),
        )
        self.dropout = nn.Dropout(dropout)

    def forward(self, value, key, query, mask=None):
        attention, _ = self.attention(query, key, value, attn_mask=mask)
        # Add & Norm around the attention sub-layer.
        x = self.dropout(self.norm1(attention + query))
        # Feed-forward, then a second Add & Norm.
        forward = self.feed_forward(x)
        out = self.dropout(self.norm2(forward + x))
        return out
```
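To make the self-attention step itself concrete, here is a minimal sketch of scaled dot-product attention, the operation that lets each token weigh every other token. The function name and the `(batch, seq_len, embed_size)` tensor layout are assumptions made for illustration, not part of the original code.

```python
import math
import torch


def scaled_dot_product_attention(query, key, value, mask=None):
    # query, key, value: tensors of shape (batch, seq_len, embed_size).
    d_k = query.size(-1)
    # Pairwise similarity between every query token and every key token.
    scores = torch.matmul(query, key.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions (mask == 0) are excluded before the softmax.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    # Softmax turns the scores into attention weights that sum to 1 per token.
    weights = torch.softmax(scores, dim=-1)
    # Each output token is a weighted mixture of all value vectors.
    return torch.matmul(weights, value)
```

Multi-head attention runs several of these computations in parallel on lower-dimensional projections and concatenates the results, which is what `nn.MultiheadAttention` handles internally in the block above.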
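The positional encodings described above can likewise be sketched. The sinusoidal scheme below, introduced in the original Transformer paper, is only one common choice; the function name and the assumption of an even `embed_size` are illustrative, not taken from the original code.

```python
import math
import torch


def sinusoidal_positional_encoding(seq_len, embed_size):
    # Assumes an even embed_size; returns a (seq_len, embed_size) tensor
    # that is added to the input embeddings before the first block.
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # (seq_len, 1)
    div_term = torch.exp(
        torch.arange(0, embed_size, 2, dtype=torch.float32)
        * (-math.log(10000.0) / embed_size)
    )
    pe = torch.zeros(seq_len, embed_size)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe
```

Because these values depend only on position, the encoding can be precomputed once and reused for every batch.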