Efficiently Leveraging Multi-level User Intent for SBR via Atten-Mixer Network (WSDM '23)

A very interesting paper. Its conclusion runs against the mainstream GNN research trend: while the mainstream keeps designing ever more powerful GNNs to capture the complex transitions between items, this paper argues that GNN capacity is already more than sufficient, so there is no need for complicated GNN designs at all. The authors even remove the GNN structure altogether and use a multi-level attention mechanism (Atten-Mixer) to approximate the high-order connectivity a GNN would model. (The setting is, of course, session-based recommendation, where the average sequence length is very short.)

0 Abstract

Session-based recommendation (SBR) aims to predict a user's next action based on short, dynamic sessions. Recently, increasing attention has been paid to various elaborately designed graph neural networks (GNNs) for capturing pairwise relations among items, which seems to suggest that designing more sophisticated models is a panacea for improving empirical performance. However, despite the exponential growth in model complexity, these models deliver only relatively marginal improvements.

This paper therefore proposes to remove the GNN propagation part directly and to strengthen the reasoning capability of the readout module instead. The proposed Atten-Mixer is a multi-level attention mixing network that leverages both concept-view and instance-view readouts to achieve multi-level reasoning over item transitions.

1 Introduction

Recent SBR research has seen a surge of GNN-based models that aim to better capture the complex transitions between items. However, compared with the exponential growth in model complexity, the performance gain each new model brings on the benchmarks is marginal (see Table 2 of the paper). Given this phenomenon, a meaningful question naturally arises: are these GNN-based models under-complex or over-complex for SBR? To answer it, the authors dissect existing GNN-based SBR models and empirically find that some of the GNN propagation appears redundant, since it is the readout module that plays the major role in these models.

This observation runs directly counter to today's trend, in which the SBR community seeks ever more powerful GNN designs to capture complex item transitions. Compared with other recommendation settings, session graphs are much sparser because session data is inherently short and dynamic. For example, nearly 70% of the sessions in the Diginetica dataset consist entirely of distinct items, which means that building a graph from such session data may yield nothing more than a simple sequence (see the sketch below). In this case, some designs in the GNN are rather heavyweight and contribute little compared with the overall preference that the readout module manages to learn from the data. The authors therefore hypothesize that an advanced architectural design of the readout module will bring more benefit: as the requirements on the GNN propagation part are relaxed, the readout module should take on more responsibility during model inference, so a readout module with strong reasoning capability is needed.
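To make the sparsity point concrete, here is a tiny illustration with hypothetical data: when a session contains no repeated items, the session graph degenerates into a simple path, leaving little extra structure for GNN propagation to exploit beyond the sequence order itself.

```python
# Hypothetical session with all-distinct items (the ~70% case in Diginetica):
session = [3, 7, 1, 9]
# Session-graph construction adds an edge between each pair of consecutive items.
edges = {(a, b) for a, b in zip(session, session[1:])}
print(edges)  # three sequential edges, e.g. (3, 7), (7, 1), (1, 9) -- just a chain
```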

Existing readout-module improvements focus on the instance view (instance-view, i.e., the first layer in the figure of the paper): attention is computed directly over the embeddings of individual items in the session. Atten-Mixer goes beyond this by additionally introducing concept-view readouts, enabling multi-level reasoning over item transitions; a hedged sketch of such a multi-level attention readout follows.
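Below is a minimal, self-contained PyTorch sketch of a multi-level attention readout in the spirit of Atten-Mixer. It is an illustration under stated assumptions, not the authors' implementation: the class name `MultiLevelAttnReadout`, the choice of average pooling to form the level-l query from the last l items, and the linear mixing layer are all assumptions made for exposition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiLevelAttnReadout(nn.Module):
    """Illustrative multi-level attention readout (not the paper's exact code).

    For each level l, a query is pooled from the last l item embeddings
    (a stand-in for a multi-level user-intent query); each query attends
    over all items in the session, and the per-level readouts are mixed.
    """
    def __init__(self, dim: int, levels=(1, 2, 3)):
        super().__init__()
        self.levels = levels
        self.q_proj = nn.Linear(dim, dim)              # shared query projection
        self.mix = nn.Linear(len(levels) * dim, dim)   # mixes per-level readouts

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, dim) item embeddings; requires seq_len >= max(levels)
        readouts = []
        for l in self.levels:
            q = self.q_proj(h[:, -l:].mean(dim=1, keepdim=True))  # (batch, 1, dim)
            scores = q @ h.transpose(1, 2) / h.size(-1) ** 0.5    # (batch, 1, seq_len)
            attn = F.softmax(scores, dim=-1)
            readouts.append((attn @ h).squeeze(1))                # (batch, dim)
        return self.mix(torch.cat(readouts, dim=-1))              # session representation

# Usage with random inputs:
# s = MultiLevelAttnReadout(64)(torch.randn(32, 10, 64))  # -> shape (32, 64)
```

Scoring this session representation against the candidate item embeddings (for example, a dot product with the item embedding table) would then yield next-item logits.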
