A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues

This post introduces VHRED, a hierarchical latent variable encoder-decoder model for improving the linguistic quality of dialogue generation. The model introduces latent variables to capture the stochastic features of utterances, addressing the problem that text generated by a conventional RNNLM falls short of human-quality language. Experiments in a dialogue-bot setting show that the approach significantly improves dialogue generation.

The paper shared in this post targets problems in the generation stage of language models and uses dialogue bots as the experimental setting. The paper, A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues, comes from authors at the University of Montreal and Maluuba, a company whose research output is very strong; high-quality papers from them regularly appear on arXiv.

Generally speaking, natural-language dialogue has structure at two levels: the utterance level, whose meaning is characterized by local statistics of the language, and the topic level, characterized by stochastic features. This work models the stochastic features present in utterances in order to improve the quality of the language a language model generates. The paper argues that the fundamental reason models such as the RNNLM produce text of limited human-like quality is that they fail to handle the stochastic features, or noise, hidden inside utterances, and therefore do only a mediocre job at generating both the next token (the short-term goal) and future tokens (the long-term goal).

The model proposed in the paper, the Latent Variable Hierarchical Recurrent Encoder-Decoder (VHRED), generates in two steps (a minimal code sketch of the two steps follows the list):

Step 1: randomly sample the latent variables.

Step 2: generate the output sequence.
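As a rough illustration of this two-step procedure (not the authors' code), the following PyTorch sketch samples a latent vector z from a Gaussian prior conditioned on a dialogue-context vector, then decodes tokens conditioned on the context and z. All module and variable names here (prior_net, decoder_cell, and so on) are hypothetical stand-ins, and the dimensions are assumed for the example.

```python
import torch
from torch import nn

class TwoStepGenerator(nn.Module):
    """Sketch of VHRED-style generation: sample z, then decode conditioned on it."""
    def __init__(self, vocab_size, ctx_dim=512, z_dim=128, emb_dim=256, hid_dim=512):
        super().__init__()
        self.prior_net = nn.Linear(ctx_dim, 2 * z_dim)        # predicts mean and log-variance of the prior
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.decoder_cell = nn.GRUCell(emb_dim, hid_dim)
        self.init_state = nn.Linear(ctx_dim + z_dim, hid_dim)  # conditions the decoder on context and z
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, context, max_len=20, bos_id=1):
        # Step 1: sample the latent variable from a context-conditioned Gaussian prior.
        mean, logvar = self.prior_net(context).chunk(2, dim=-1)
        z = mean + torch.randn_like(mean) * torch.exp(0.5 * logvar)

        # Step 2: generate the output sequence token by token, conditioned on context and z.
        h = torch.tanh(self.init_state(torch.cat([context, z], dim=-1)))
        token = torch.full((context.size(0),), bos_id, dtype=torch.long)
        tokens = []
        for _ in range(max_len):
            h = self.decoder_cell(self.embed(token), h)
            token = self.out(h).argmax(dim=-1)   # greedy decoding, for illustration only
            tokens.append(token)
        return z, torch.stack(tokens, dim=1)

# Example usage with random context vectors:
gen = TwoStepGenerator(vocab_size=5000)
context = torch.randn(2, 512)            # batch of 2 dialogue-context vectors
z, generated = gen(context)
print(z.shape, generated.shape)          # torch.Size([2, 128]) torch.Size([2, 20])
```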

The architecture is shown below:

[Figure: VHRED architecture diagram]

Generating each utterance involves four components used in sequence: an encoder RNN, a context RNN, the latent variable, and a decoder RNN. The latent variable here is somewhat analogous to LSI in information retrieval: "latent" means we cannot pin down exactly what it represents, but it may stand for something like a topic or a sentiment, and it is a reduced-dimension representation. A sketch of how the four parts fit together is given below.
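To make the dataflow through the four components concrete, here is a minimal, hypothetical sketch of the encoder-RNN → context-RNN → latent-variable path for a single dialogue; the decoder from the earlier sketch would consume its outputs. It illustrates the wiring under assumed dimensions, not the authors' implementation.

```python
import torch
from torch import nn

class HierarchicalEncoder(nn.Module):
    """Sketch of the encoder-RNN -> context-RNN -> latent-variable path in a VHRED-like model."""
    def __init__(self, vocab_size, emb_dim=256, utt_dim=512, ctx_dim=512, z_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder_rnn = nn.GRU(emb_dim, utt_dim, batch_first=True)   # reads tokens within one utterance
        self.context_rnn = nn.GRU(utt_dim, ctx_dim, batch_first=True)   # reads utterance vectors across turns
        self.prior_net = nn.Linear(ctx_dim, 2 * z_dim)                  # context state -> Gaussian prior over z

    def forward(self, dialogue):
        # dialogue: list of LongTensors, one per utterance, each of shape (1, num_tokens)
        utt_vectors = []
        for utt in dialogue:
            _, h = self.encoder_rnn(self.embed(utt))      # final hidden state summarizes the utterance
            utt_vectors.append(h[-1])                     # shape (1, utt_dim)
        utt_seq = torch.stack(utt_vectors, dim=1)         # (1, num_utterances, utt_dim)
        _, ctx = self.context_rnn(utt_seq)                # final context state summarizes the dialogue so far
        mean, logvar = self.prior_net(ctx[-1]).chunk(2, dim=-1)
        z = mean + torch.randn_like(mean) * torch.exp(0.5 * logvar)
        return ctx[-1], z                                 # both are fed to the decoder RNN

# Example usage on a toy three-turn dialogue:
enc = HierarchicalEncoder(vocab_size=5000)
dialogue = [torch.randint(0, 5000, (1, n)) for n in (6, 4, 9)]
context, z = enc(dialogue)
print(context.shape, z.shape)   # torch.Size([1, 512]) torch.Size([1, 128])
```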

For the experiments, dialogue bots were chosen as the application setting, and the model achieves good results:

[Figure: experimental results from the paper]

What this paper addresses is not only dialogue generation for bots but the decoder problem in the entire seq2seq framework; any task involving decoder-based generation can borrow this idea. Latent topics are an interesting concept and play an important role in LSI and in recommender systems: after matrix factorization, a single matrix with two dimensions (for example, terms by documents) becomes two reduced-dimension matrices (terms by topics and topics by documents), and the extra shared dimension is the so-called latent topic. It is hard to say what each topic actually is, yet they do cluster similar items together. This paper likewise uses latent variables to describe the hard-to-pin-down stochastic noise hidden in utterances, and obtains better results. A small factorization example is given below.
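To make the latent-topic idea concrete, here is a tiny NumPy example on toy data (not from the paper): a truncated SVD, the factorization behind LSI, splits a term-document matrix into two reduced-dimension matrices whose shared dimensions act as unnamed "latent topics".

```python
import numpy as np

# Toy term-document count matrix: 6 terms x 5 documents (hypothetical data).
X = np.array([
    [2, 1, 0, 0, 0],
    [1, 2, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 3, 2, 0],
    [0, 0, 2, 3, 1],
    [0, 0, 0, 1, 2],
], dtype=float)

# Truncated SVD: keep k latent dimensions ("latent topics").
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
term_factors = U[:, :k] * s[:k]     # each term as a point in latent-topic space
doc_factors = Vt[:k, :].T           # each document as a point in latent-topic space

print(term_factors.shape, doc_factors.shape)   # (6, 2) (5, 2)
# Roughly, documents 0-1 and documents 2-4 load on different latent dimensions,
# so similar documents end up close together even though no topic labels were provided.
```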


Source: paperweekly



### Hierarchical Embedding Model for Personalized Product Search

In machine learning, hierarchical embedding models aim to capture the intricate relationships between products and user preferences by organizing items within a structured hierarchy. This approach facilitates more accurate recommendations and search results tailored to individual users' needs.

A hierarchical embedding model typically involves constructing embeddings that represent both product features and their positions within a category tree or other organizational structure[^1]. For personalized product search, this means not only capturing the direct attributes of each item but also understanding how items relate across different levels of abstraction, from specific brands up through broader categories like electronics or clothing.

To train such models effectively:

- **Data Preparation**: Collect data on user interactions with various products along with metadata describing those goods (e.g., price range, brand name). Additionally, gather information about any existing hierarchies used to categorize the merchandise.
- **Model Architecture Design**: Choose a neural network architecture capable of processing multi-level inputs while remaining computationally efficient to train. Techniques from contrastive learning can be particularly useful here, as they allow systems to learn meaningful representations even when labels are scarce or noisy[^3].
- **Objective Function Formulation**: Define loss functions that optimize metrics relevant to ranking tasks; minimizing negative log-likelihood works well, as it encourages correct predictions over incorrect ones[^4] (a small loss sketch is given at the end of this section).

Here is a simplified Python snippet demonstrating one piece of such a system: learning embeddings from a hypothetical dataset of customer reviews and associated product IDs.

```python
import torch
from torch import nn

class HierarchicalEmbedder(nn.Module):
    def __init__(self, vocab_size, embed_dim=100):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)

    def forward(self, x):
        return self.embedding(x)

# Example usage:
vocab_size = 5000  # Number of unique words/products
embeddings_model = HierarchicalEmbedder(vocab_size)
input_tensor = torch.LongTensor([i for i in range(10)])  # Simulated input indices
output_embeddings = embeddings_model(input_tensor)
print(output_embeddings.shape)  # Should output something similar to "torch.Size([10, 100])"
```

This script initializes a simple PyTorch module that produces fixed-size vectors for given integer keys, which might represent textual tokens found in review texts or unique identifiers assigned to catalog entries.
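Building on the snippet above, here is a hedged sketch of the objective-function step mentioned earlier: score one observed (positive) product against a few randomly sampled negatives via dot products with a query/user vector, and minimize the negative log-likelihood of the positive under a softmax over those scores. The tensors used here (user_vec, pos_id, neg_ids) are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

# Hypothetical tensors: one user/query vector and candidate product IDs.
embed_dim = 100
user_vec = torch.randn(1, embed_dim)                  # e.g., an encoded query or user profile
product_embeddings = torch.randn(5000, embed_dim)     # e.g., embeddings_model.embedding.weight

pos_id = torch.tensor([42])                           # the product the user actually interacted with
neg_ids = torch.randint(0, 5000, (4,))                # a few sampled negatives

candidates = torch.cat([pos_id, neg_ids])             # index 0 is the positive
scores = user_vec @ product_embeddings[candidates].T  # shape (1, 5)

# Negative log-likelihood of the positive candidate under a softmax over the scores.
loss = F.cross_entropy(scores, torch.tensor([0]))
print(loss.item())
```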