
NLP Core Derivations
Analyses of the algorithmic ideas behind mainstream natural language processing, with mathematical derivations.
Jay_Tang
Xiao Tang's ML & NLP Learning Path
Index of Previous Articles
A collection of links to earlier posts, including: Logistic Regression with L1/L2 regularization and gradient/coordinate descent; MLE vs. MAP and the L1/L2 math derivation; an accessible, detailed XGBoost math derivation; Introduction to Convex Optimization: basic concepts… (Published 2020-04-14)
Relation Extraction: A Survey
Topics: Information Extraction vs. Relation Extraction; existing works of RE; pattern-based methods; statistical relation extraction models; neural relation extraction methods; future directions, including utilizing more data, methods to denoise DS data, and open problems… (Published 2021-01-03)
Connecting the Dots: Document-level Neural Relation Extraction with Edge-oriented Graphs (Relation Extraction Paper Summary)
Topics: relation extraction (RE) and document-level RE; intuition and contributions; overview of the proposed model: sentence encoding layer, graph construction layer (node construction, edge construction), inference layer (first step, second step), classification layer; results… (Published 2020-12-31)
Cross-Lingual Learning: Paper Summary
Topics: cross-lingual learning; cross-lingual resources; multilingual distributional representations and their evaluation; parallel corpora; word alignments; machine translation; universal features (out of fashion); bilingual… (Published 2020-12-28)
BERT and RoBERTa: Key Points
Topics: BERT recap and overview; BERT specifics (the two steps of the BERT framework: pre-training and fine-tuning); input/output representations; tasks; results; ablation studies (effect of pre-training tasks, effect of model sizes); replication study of BERT pre-training… (Published 2020-09-18)
What Does BERT Look At? An Analysis of BERT's Attention: Paper Summary
Topics: surface-level patterns in attention; probing individual attention heads; probing attention head combinations; clustering attention heads. The post mainly focuses on the conclusions the authors reach… (Published 2020-09-14)
Common Multilingual Models Explained (M-BERT, LASER, MultiFiT, XLM)
Topics: ways of tokenization (word-based, character-based, subword); existing approaches for cross-lingual NLP; the out-of-vocabulary (OOV) problem in mono-/multi-lingual settings; M-BERT (Multi-lingual BERT); why multilingual BERT… (Published 2020-08-08)
RNN and LSTM Explained with Illustrations
Topics: sequence data; why a standard neural network is not suited to sequence tasks; RNNs (different types, loss function, backpropagation through time, vanishing gradients, advantages and drawbacks); LSTM (types of gates, formulas and illustrations)… (Published 2020-06-04)
Log-Linear Models & Conditional Random Fields (CRF) Explained
Topics: the log-linear model; conditional random fields (CRF) and their formal definition; from the log-linear model to the linear-chain CRF; the inference problem for CRF; the learning problem for the general log-linear model and for CRF; computing the partition function $Z(\bar x, w)$… (Published 2020-05-19)
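For reference, the partition function $Z(\bar x, w)$ mentioned above is the normalizer of the linear-chain CRF. A standard statement of the model (my own recap following the post's notation, with a generic feature map $\phi$, not a quote from the post) is:

```latex
% Linear-chain CRF written as a globally normalized log-linear model
p(\bar{y} \mid \bar{x}; w)
  = \frac{1}{Z(\bar{x}, w)}
    \exp\!\Big( \sum_{t=1}^{T} w \cdot \phi(\bar{x}, y_{t-1}, y_t, t) \Big)

% Partition function: a sum over all candidate label sequences,
% computed efficiently with the forward algorithm rather than by enumeration
Z(\bar{x}, w)
  = \sum_{\bar{y}'} \exp\!\Big( \sum_{t=1}^{T} w \cdot \phi(\bar{x}, y'_{t-1}, y'_t, t) \Big)
```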
Hidden Markov Model (HMM): Detailed Derivation and Analysis
Before reading this post, you should be familiar with the EM algorithm and have a decent amount of knowledge of convex optimization. If not, check out my previous posts on the EM algorithm and convex optimization… (Published 2020-05-03)
Probabilistic Graphical Model (PGM): Framework Explained
Definition: a probabilistic graphical model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. In general, a PGM obeys the following rules: the sum rule… (Published 2020-05-11)
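The truncated list of rules presumably refers to the standard sum and product rules of probability, which in this setting read as follows (standard identities, not quoted from the post):

```latex
% Sum rule: marginalize a variable out of a joint distribution
P(x_1) = \sum_{x_2} P(x_1, x_2)

% Product rule: factor the joint into a conditional and a marginal
P(x_1, x_2) = P(x_2 \mid x_1)\, P(x_1)
```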
Skip-gram: Detailed Derivation and Analysis
Comparison between CBOW and skip-gram: the major difference is that skip-gram handles infrequent words better than CBOW in word2vec. For simplicity, suppose there is a sentence "$w_1 w_2 w_3 w_4$"… (Published 2020-04-17)
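To make the CBOW/skip-gram contrast concrete, here is a minimal sketch (not the post's code) that builds the two kinds of training pairs from the toy sentence above; the window size of 1 is an illustrative assumption.

```python
# Minimal sketch: how CBOW and skip-gram frame the same context window differently.
sentence = ["w1", "w2", "w3", "w4"]
window = 1  # illustrative context window of 1 word on each side

cbow_pairs, skipgram_pairs = [], []
for i, center in enumerate(sentence):
    # Context words within the window on either side of the center word.
    context = [sentence[j]
               for j in range(max(0, i - window), min(len(sentence), i + window + 1))
               if j != i]
    # CBOW: predict the center word from its (averaged) context words.
    cbow_pairs.append((context, center))
    # Skip-gram: predict each context word from the center word,
    # so even a rare center word gets its own gradient updates.
    skipgram_pairs.extend((center, ctx) for ctx in context)

print(cbow_pairs)      # [(['w2'], 'w1'), (['w1', 'w3'], 'w2'), ...]
print(skipgram_pairs)  # [('w1', 'w2'), ('w2', 'w1'), ('w2', 'w3'), ...]
```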
Distributed Representation, Hyperbolic Space, Gaussian/Graph Embedding: A Detailed Introduction
An overview of various word representation and embedding methods. Local representation vs. distributed representation: one-hot encoding is a local representation and is good for local generalization… (Published 2020-04-17)
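A rough illustration of the local vs. distributed distinction (a sketch under my own assumptions, not the post's code): one-hot vectors dedicate a single dimension to each word and encode no similarity, whereas a distributed (dense) embedding spreads information across all dimensions and can capture similarity once trained.

```python
import numpy as np

vocab = ["king", "queen", "apple"]

# Local (one-hot) representation: one dimension per word; all words are equidistant.
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

# Distributed representation: dense vectors (random here purely for illustration;
# in practice they are learned so that related words end up close together).
rng = np.random.default_rng(0)
dense = {w: rng.normal(size=4) for w in vocab}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(one_hot["king"], one_hot["queen"]))  # 0.0 -- one-hot encodes no similarity
print(cosine(dense["king"], dense["queen"]))      # nonzero; meaningful only after training
```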
NLP Fundamentals Overview + Spell Correction with the Noisy Channel
NLP = NLU + NLG. NLU: Natural Language Understanding; NLG: Natural Language Generation. NLG may be viewed as the opposite of NLU: whereas in NLU the system needs to disambiguate the input sentence to… (Published 2020-04-10)
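The second part of the title refers to noisy-channel spell correction, which scores each candidate correction $c$ of a misspelling $w$ by $P(w \mid c)\,P(c)$ and picks the argmax. Below is a toy sketch of that scoring step; the candidate set and all probabilities are made-up illustrative numbers, not taken from the post.

```python
# Toy noisy-channel spell corrector: choose argmax_c P(w | c) * P(c).
# Every number below is a made-up illustrative probability.
misspelling = "graffe"

# Candidate corrections with a toy language model P(c), e.g. from corpus counts.
language_model = {"giraffe": 3e-6, "graft": 1e-6, "grail": 2e-6}

# Toy channel (error) model P(w | c): how likely the typo is, given each candidate.
error_model = {"giraffe": 1e-3, "graft": 1e-5, "grail": 1e-6}

def noisy_channel_score(candidate: str) -> float:
    return error_model[candidate] * language_model[candidate]

best = max(language_model, key=noisy_channel_score)
print(f"{misspelling} -> {best}")  # giraffe under these toy numbers
```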