Oxford Deep NLP Notes

These notes cover deep learning for natural language processing, from word-level semantics to recurrent neural network language models, including Word2Vec, n-gram models, RNNs/LSTMs/GRUs, text classification, and attention in language modeling.


Lecture 2a: Word Level Semantics

(Word2Vec is implicitly equivalent to PMI matrix factorization of count-based models.)
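
Concretely (Levy & Goldberg, 2014), skip-gram with negative sampling implicitly factorises a word-context matrix of shifted PMI values:

```latex
\vec{w}_i \cdot \vec{c}_j \;=\; \mathrm{PMI}(w_i, c_j) - \log k,
\qquad
\mathrm{PMI}(w, c) \;=\; \log \frac{P(w, c)}{P(w)\, P(c)}
```

where k is the number of negative samples.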

Count-based methods

Neural Embedding Models: C&W

  • Embed all words in a sentence with the embedding matrix E
  • Apply a shallow convolution over the embeddings
  • Minimise a hinge loss (see the sketch below)
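
A minimal PyTorch sketch of this setup; the layer sizes, filter width, and the `hinge_loss` helper are illustrative assumptions, not the original C&W configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CWScorer(nn.Module):
    """Embed a window of words, apply a shallow 1-d convolution,
    and output a scalar score for the window."""
    def __init__(self, vocab_size, emb_dim=50, n_filters=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.score = nn.Linear(n_filters, 1)

    def forward(self, window):                      # (batch, win_len) word ids
        e = self.embed(window).transpose(1, 2)      # (batch, emb_dim, win_len)
        h = F.relu(self.conv(e)).max(dim=2).values  # max-pool over positions
        return self.score(h).squeeze(-1)            # one score per window

def hinge_loss(true_score, corrupt_score, margin=1.0):
    # push score(true window) above score(corrupted window) by a margin
    return F.relu(margin - true_score + corrupt_score).mean()
```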

Neural Embedding Models: CBOW

  • Embed the context words
  • Add them
  • Minimise the negative log likelihood (see the sketch below)
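
A minimal PyTorch sketch of CBOW following these three steps; dimensions and names are illustrative assumptions:

```python
import torch
import torch.nn as nn

class CBOW(nn.Module):
    """Sum the embeddings of the context words and predict the target word."""
    def __init__(self, vocab_size, emb_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.out = nn.Linear(emb_dim, vocab_size)

    def forward(self, context):              # (batch, 2 * window) word ids
        summed = self.embed(context).sum(1)  # add the context embeddings
        return self.out(summed)              # logits over the vocabulary

# Training minimises the negative log likelihood of the target word,
# i.e. nn.CrossEntropyLoss() applied to these logits.
```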

Neural Embedding Models: Skip-gram

  • Embed the target word
  • The target word's embedding predicts each context word
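
In the standard skip-gram formulation this means maximising the log-probability of the surrounding words given the centre word (full-softmax version shown):

```latex
J(\theta) = -\frac{1}{T} \sum_{t=1}^{T} \sum_{\substack{-m \le j \le m \\ j \neq 0}}
  \log P(w_{t+j} \mid w_t),
\qquad
P(o \mid c) = \frac{\exp(u_o^{\top} v_c)}{\sum_{w \in V} \exp(u_w^{\top} v_c)}
```

where v_c is the target (centre) embedding, u_o a context embedding, and m the window size.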

Task-based Embedding Learning

Directly train the embeddings jointly with the parameters of the network that uses them.
The embedding matrix can be learned from scratch, or initialised with pre-learned embeddings and then fine-tuned.
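
A minimal PyTorch sketch of both options; the `pretrained` tensor is a stand-in for vectors loaded from e.g. Word2Vec:

```python
import torch
import torch.nn as nn

vocab_size, emb_dim = 10_000, 100
pretrained = torch.randn(vocab_size, emb_dim)  # stand-in for loaded vectors

# Option 1: learn the embedding matrix from scratch with the task loss.
emb_scratch = nn.Embedding(vocab_size, emb_dim)

# Option 2: initialise from pre-learned embeddings and fine-tune them
# (freeze=False keeps the matrix trainable alongside the network).
emb_finetuned = nn.Embedding.from_pretrained(pretrained, freeze=False)
```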

Applications

  • Text categorisation
  • Natural language generation (language modeling, conditional language modeling)
  • Natural language understanding:
    • Translation
    • Summarisation
    • Conversational agents
    • Question answering
    • Structured knowledge-base population
    • Dialogue

Lecture 3: Language Modeling and RNNs I

Count based N-Gram Language Models

Approximate the full history with just the previous n − 1 words (a Markov assumption).
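
Concretely, with probabilities estimated by maximum likelihood from counts (trigram case shown as an example):

```latex
P(w_t \mid w_1, \ldots, w_{t-1}) \;\approx\; P(w_t \mid w_{t-n+1}, \ldots, w_{t-1}),
\qquad
\hat{P}(w_t \mid w_{t-2}, w_{t-1}) \;=\;
\frac{\operatorname{count}(w_{t-2},\, w_{t-1},\, w_t)}{\operatorname{count}(w_{t-2},\, w_{t-1})}
```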

Neural N-Gram Language Models

Embed the same fixed n-gram history in a continuous space (a feed-forward network; the hidden layers at different positions are not connected, so back-propagation runs independently per position and can be parallelised: the gradients for each time step are independent of all other timesteps, so they are calculated in parallel and summed). A sketch follows below.
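
A minimal PyTorch sketch of such a feed-forward (Bengio-style) n-gram LM; layer sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class NeuralNGramLM(nn.Module):
    """Embed a fixed window of the n-1 previous words, concatenate,
    and predict the next word through one hidden layer."""
    def __init__(self, vocab_size, n=3, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.hidden = nn.Linear((n - 1) * emb_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, context):             # (batch, n-1) word ids
        e = self.embed(context).flatten(1)  # concatenate the window
        h = torch.tanh(self.hidden(e))
        return self.out(h)                  # logits over the next word
```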

Recurrent Neural Network Language Models

Compress the entire history into a fixed-length vector, enabling long-range correlations to be captured (a recurrent network; the hidden states are connected through time, so training uses Back-Propagation Through Time, and Truncated Back-Propagation Through Time breaks the dependencies after a fixed number of timesteps). See the sketch below.
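
A minimal PyTorch sketch of an RNN LM trained with truncated BPTT; the toy `chunks` data and all sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class RNNLM(nn.Module):
    """Carry the whole history in the recurrent hidden state."""
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.RNN(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, h=None):        # tokens: (batch, seq_len) ids
        states, h = self.rnn(self.embed(tokens), h)
        return self.out(states), h            # per-step logits, final state

model = RNNLM(vocab_size=10_000)
opt = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()
chunks = [torch.randint(0, 10_000, (8, 33)) for _ in range(5)]  # toy corpus

h = None
for chunk in chunks:
    logits, h = model(chunk[:, :-1], h)       # predict each next word
    loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                   chunk[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
    h = h.detach()  # truncated BPTT: break dependencies across chunks
```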

Bias vs Variance in LM Approximations

  • N-grams are biased but have low variance.
  • RNNs decrease the bias considerably, hopefully at a small cost in variance.

Lecture 4: Language Modeling and RNNs II

LSTM (Long Short-Term Memory)
GRU (Gated Recurrent Unit)
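
The notes do not spell these out, so for reference, the standard LSTM cell and GRU update equations (gate conventions vary slightly between papers):

```latex
\begin{aligned}
\text{LSTM:}\quad
& i_t = \sigma(W_i [h_{t-1}; x_t] + b_i), \quad
  f_t = \sigma(W_f [h_{t-1}; x_t] + b_f), \quad
  o_t = \sigma(W_o [h_{t-1}; x_t] + b_o) \\
& \tilde{c}_t = \tanh(W_c [h_{t-1}; x_t] + b_c), \quad
  c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \quad
  h_t = o_t \odot \tanh(c_t) \\[4pt]
\text{GRU:}\quad
& z_t = \sigma(W_z [h_{t-1}; x_t]), \quad
  r_t = \sigma(W_r [h_{t-1}; x_t]) \\
& \tilde{h}_t = \tanh(W_h [r_t \odot h_{t-1}; x_t]), \quad
  h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```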

Lecture 5: Text Classification

Binary classification

Multi-class classification

Multi-label classification

Clustering

Naive Bayes classifier (generative model)

Logistic Regression (discriminative model)
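
The generative/discriminative contrast in formulas:

```latex
\text{Naive Bayes (generative):}\quad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i} P(w_i \mid c)
\qquad
\text{Logistic regression (discriminative):}\quad
P(c \mid d) = \mathrm{softmax}\!\left(W x_d + b\right)_c
```

where x_d is a feature vector for document d.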

RNN Classifier

  • Dual-objective RNN: combine an LM objective with classifier training and optimise the two losses jointly
  • Bi-directional RNNs (see the sketch after this list)
  • An RNN classifier can be either a generative or a discriminative model (a joint model is generative: it learns both P(c) and P(d))
  • Recursive Neural Networks
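
A minimal PyTorch sketch of a bi-directional RNN classifier (an LSTM variant; all sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class BiRNNClassifier(nn.Module):
    """Read the text in both directions and classify from the
    concatenated final hidden states."""
    def __init__(self, vocab_size, num_classes, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, tokens):                   # (batch, seq_len) word ids
        _, (h_n, _) = self.rnn(self.embed(tokens))
        h = torch.cat([h_n[0], h_n[1]], dim=-1)  # forward + backward states
        return self.out(h)                       # class logits
```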

Lecture 6: RNNs and GPUs

Lecture 7: Conditional Language Modeling

Lecture 8: Conditional Language Modeling with Attention

Lecture 9: Speech Recognition

Lecture 10: Text to Speech

Lecture 11: Question Answering

Lecture 12: Memory

Lecture 13: Linguistics
