Recommender Systems | Study Notes: Field-aware Factorization Machines for CTR Prediction

These notes cover Field-aware Factorization Machines (FFMs) for click-through rate (CTR) prediction. Comparative experiments show the advantages of FFMs over Poly2 and FMs: FFMs exploit field information to improve accuracy and are very useful for certain classification problems. The experimental sections discuss the impact of parameters, an early-stopping strategy, and a parallelized implementation, confirming that FFMs excel on some data sets while the handling of numerical features remains a challenge.


ABSTRACT

  • First, we propose efficient implementations for training
    FFMs.
  • Then we comprehensively analyze FFMs and compare
    this approach with competing models. Experiments
    show that FFMs are very useful for certain classification
    problems.
  • Finally, we have released a package of FFMs for
    public use.

1. INTRODUCTION

Code used for experiments in this paper and the package LIBFFM are respectively available at:
http://www.csie.ntu.edu.tw/~cjlin/ffm/exps
http://www.csie.ntu.edu.tw/~cjlin/libffm


2. POLY2 AND FM

FMs can be better than Poly2 when the data set is sparse: Poly2 learns one weight per feature pair, which can only be estimated when that exact pair co-occurs in the data, whereas FMs learn the pair's effect as the dot product of two latent vectors that are trained by all pairs involving either feature.
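For reference, the interaction terms of the two competing models, in the paper's notation ($h(j_1, j_2)$ hashes the feature pair into an index):

$$\phi_{\mathrm{Poly2}}(\mathbf{w},\mathbf{x})=\sum_{j_1=1}^{n}\sum_{j_2=j_1+1}^{n} w_{h(j_1,j_2)}\,x_{j_1}x_{j_2},\qquad \phi_{\mathrm{FM}}(\mathbf{w},\mathbf{x})=\sum_{j_1=1}^{n}\sum_{j_2=j_1+1}^{n}(\mathbf{w}_{j_1}\cdot\mathbf{w}_{j_2})\,x_{j_1}x_{j_2}.$$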


3. FFM

  • In FMs, every feature has a single latent vector that models its interaction with all other features; in FFMs, each feature has one latent vector per field, so the vector used depends on the field of the other feature in the pair (see the sketch after this list).

  • The FFM model:
    $$\phi_{\mathrm{FFM}}(\mathbf{w},\mathbf{x})=\sum_{j_1=1}^{n}\sum_{j_2=j_1+1}^{n}(\mathbf{w}_{j_1,f_2}\cdot\mathbf{w}_{j_2,f_1})\,x_{j_1}x_{j_2},$$
    where $f_1$ and $f_2$ are the fields of features $j_1$ and $j_2$, respectively.

  • Usually, $k_{\mathrm{FFM}} \ll k_{\mathrm{FM}}$, since each FFM latent vector only needs to learn the effect with one specific field.
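A minimal NumPy sketch of the FFM interaction term above; the `(field, feature, value)` triple encoding and the dense array layout are illustrative choices, not LIBFFM's internals:

```python
import numpy as np

def ffm_phi(W, x):
    """Compute the FFM interaction term phi(w, x) for one instance.

    W: array of shape (n_features, n_fields, k); W[j, f] is feature j's
       latent vector used when paired with a feature from field f.
    x: list of (field, feature, value) triples for the instance.
    """
    phi = 0.0
    for a, (f1, j1, v1) in enumerate(x):
        for f2, j2, v2 in x[a + 1:]:
            # feature j1's vector for field f2, dotted with
            # feature j2's vector for field f1
            phi += np.dot(W[j1, f2], W[j2, f1]) * v1 * v2
    return phi

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, 3, 4))    # 3 features, 3 fields, k = 4
x = [(0, 0, 1.0), (1, 1, 1.0), (2, 2, 1.0)]  # one instance, all values 1
print(ffm_phi(W, x))
```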


3.1 Solving the Optimization Problem

The model is trained by solving
$$\min_{\mathbf{w}} \; \frac{\lambda}{2}\lVert\mathbf{w}\rVert^2 + \sum_{i=1}^{m} \log\bigl(1 + \exp(-y_i\,\phi_{\mathrm{FFM}}(\mathbf{w},\mathbf{x}_i))\bigr)$$
with stochastic gradient descent. For each pair $(j_1, j_2)$ of a sampled instance, the sub-gradients are
$$\mathbf{g}_{j_1,f_2} = \lambda\,\mathbf{w}_{j_1,f_2} + \kappa\,\mathbf{w}_{j_2,f_1}\,x_{j_1}x_{j_2},\qquad \mathbf{g}_{j_2,f_1} = \lambda\,\mathbf{w}_{j_2,f_1} + \kappa\,\mathbf{w}_{j_1,f_2}\,x_{j_1}x_{j_2},$$
where $\kappa = \frac{-y}{1+\exp(y\,\phi_{\mathrm{FFM}}(\mathbf{w},\mathbf{x}))}$. The learning rate is adapted per coordinate with AdaGrad:
$$(G_{j_1,f_2})_d \leftarrow (G_{j_1,f_2})_d + (g_{j_1,f_2})_d^2, \qquad (w_{j_1,f_2})_d \leftarrow (w_{j_1,f_2})_d - \frac{\eta}{\sqrt{(G_{j_1,f_2})_d}}\,(g_{j_1,f_2})_d.$$
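A matching sketch of one training step with AdaGrad, following the update rules above; the hyper-parameter defaults are illustrative, and `ffm_phi` is the function from the previous sketch:

```python
import numpy as np

def ffm_sgd_step(W, G, x, y, lam=2e-5, eta=0.2):
    """One SGD step with per-coordinate AdaGrad rates; y must be -1 or +1.

    W, G: arrays of shape (n_features, n_fields, k); G holds the running
          sums of squared gradients (initialized to ones, as in the paper).
    """
    kappa = -y / (1.0 + np.exp(y * ffm_phi(W, x)))  # d loss / d phi
    for a, (f1, j1, v1) in enumerate(x):
        for f2, j2, v2 in x[a + 1:]:
            # sub-gradients for the two latent vectors of this pair
            g1 = lam * W[j1, f2] + kappa * v1 * v2 * W[j2, f1]
            g2 = lam * W[j2, f1] + kappa * v1 * v2 * W[j1, f2]
            G[j1, f2] += g1 ** 2
            G[j2, f1] += g2 ** 2
            W[j1, f2] -= eta * g1 / np.sqrt(G[j1, f2])
            W[j2, f1] -= eta * g2 / np.sqrt(G[j2, f1])
```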


3.2 Parallelization on Shared-memory Systems

In Section 4.4 we run extensive experiments to investigate the effectiveness of parallelization.
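Concretely, the paper adopts the lock-free HOGWILD! style of parallel SGD: each thread runs updates on the shared model without locking, relying on the sparsity of CTR data to make conflicting updates rare.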


3.3 Adding Field Information

Every instance is first converted to the format `label field1:feat1:val1 field2:feat2:val2 ...`, which extends the widely used LIBSVM format (`label feat1:val1 feat2:val2 ...`) with field information.

Categorical Features

For categorical features, the original categorical feature becomes the field, and every (feature, value) pair becomes a binary feature. For example, the record `Yes P:ESPN A:Nike G:Male` is transformed to `Yes P:P-ESPN:1 A:A-Nike:1 G:G-Male:1`.
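A small sketch of this transformation into LIBFFM's numeric text format (`label field:feature:value`); the enumeration of field and feature ids here is an illustrative choice:

```python
fields = {"Publisher": 0, "Advertiser": 1, "Gender": 2}
features = {("Publisher", "ESPN"): 0,
            ("Advertiser", "Nike"): 1,
            ("Gender", "Male"): 2}

def to_ffm_line(label, record):
    """record: dict mapping a field name to its categorical value."""
    terms = [f"{fields[f]}:{features[(f, v)]}:1" for f, v in record.items()]
    return " ".join([str(label)] + terms)

print(to_ffm_line(1, {"Publisher": "ESPN", "Advertiser": "Nike",
                      "Gender": "Male"}))
# -> 1 0:0:1 1:1:1 2:2:1
```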

Numerical Features

For numerical features the paper considers two options, both sketched below: (a) dummy fields, where each feature forms its own field and the raw value is kept, so the field information carries nothing extra; and (b) discretization, where each numerical feature becomes a field and its value is rounded into a bucket that becomes a binary feature.
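A sketch of both options for a single numerical value; the bin width and id offsets are arbitrary illustrative choices:

```python
import math

def dummy_field_term(field_id, feature_id, value):
    """(a) Dummy field: the feature forms its own field; the raw value
    is kept, so the field carries no extra information."""
    return f"{field_id}:{feature_id}:{value}"

def discretized_term(field_id, value, bin_width, base_feature_id):
    """(b) Discretization: round the value into a bucket; the bucket is a
    binary feature and the original feature serves as the field."""
    bucket = int(math.floor(value / bin_width))
    return f"{field_id}:{base_feature_id + bucket}:1"

print(dummy_field_term(0, 0, 45.73))        # -> 0:0:45.73
print(discretized_term(0, 45.73, 10, 100))  # -> 0:104:1
```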

Single-field Features

Some data sets contain only a single field (for example, NLP data sets, where all words belong to one "sentence" field), so there is no meaningful way to assign fields. Assigning each feature its own dummy field does not help either: the number of fields would then equal the number of features, blowing up the model size while adding no useful information.


4. EXPERIMENTS

  • We first provide the details of the experimental settings in Section 4.1.
  • Then, we investigate the impact of parameters in Section 4.2.
  • In Section 4.3, we discuss the issue that FFMs are sensitive to the number of epochs, and propose an early-stopping trick.
  • The speedup of parallelization is studied in Section 4.4.
  • In Sections 4.5-4.6, we compare FFMs with other models, including Poly2 and FMs.

4.1 Experiment Settings

Data Sets

Two CTR data sets from Kaggle competitions are used: Criteo and Avazu.

Platform
Evaluation: logloss is used as the evaluation metric.
Implementation
  • SSE instructions are used to boost the efficiency of inner products.
  • The parallelization discussed in Section 3.2 is implemented with OpenMP.

4.2 Impact of Parameters

  • k does not affect the logloss much.
  • If λ is too large, the model cannot achieve good performance; with a small λ, the model gets better results but easily over-fits the data.
  • With a small η, FFMs reach their best performance slowly; with a large η, FFMs reduce the logloss quickly, but then over-fitting occurs.

4.3 Early Stopping
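The trick the paper proposes: hold out a validation set, track its logloss at the end of every epoch, and stop as soon as the loss goes up; the recorded best epoch can then be reused to re-train on the full data. A minimal sketch, assuming a hypothetical `model` object with `run_epoch` and `logloss` methods:

```python
def train_with_early_stopping(model, train_set, val_set, max_epochs=50):
    """Stop training as soon as validation logloss increases."""
    best_loss, best_epoch = float("inf"), 0
    for epoch in range(1, max_epochs + 1):
        model.run_epoch(train_set)      # one SGD pass over the training set
        loss = model.logloss(val_set)
        if loss >= best_loss:           # validation loss went up: stop
            break
        best_loss, best_epoch = loss, epoch
    return best_epoch  # re-train on the full data for this many epochs
```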


4.4 Speedup


4.5 Comparison with LMs, Poly2, and FMs on Two CTR Competition Data Sets

  • FFMs outperform the other models in terms of logloss, but they also require longer training time than LMs and FMs.
  • Though the logloss of LMs is worse than that of the other models, they are significantly faster.
  • Poly2 is the slowest among all models.
  • FMs offer a good balance between logloss and speed.

4.6 Comparison on More Data Sets

  • When a data set contains only numerical features, FFMs may not have an obvious advantage.
  • If we use dummy fields, FFMs do not outperform FMs, indicating that the field information is not helpful.
  • On the other hand, if we discretize numerical features, FFMs are the best among all models, but the performance is much worse than that of using dummy fields.
  • FFMs should be effective for data sets that contain categorical features which are transformed into binary features.
  • If the transformed set is not sparse enough, FFMs seem to bring less benefit.
  • It is more difficult to apply FFMs to numerical data sets.

5. CONCLUSIONS AND FUTURE WORKS

In summary, FFMs extend FMs with field-aware latent vectors and, trained with an efficient AdaGrad-based SGD plus early stopping, outperform LMs, Poly2, and FMs on sparse CTR data sets dominated by categorical features. Their main costs are a larger model and longer training time, and data sets with numerical or non-sparse features remain challenging.
