
Discrete Semantic Tokenization for Deep CTR Prediction

🌈 Abstract

The paper introduces a semantic-token paradigm for click-through rate (CTR) prediction models, aiming to incorporate item content information while remaining both time- and space-efficient. The proposed approach, User-Item Semantic Tokenization (UIST), converts user behavior sequences and item content into discrete tokens, providing substantial memory compression over existing embedding-based approaches.

🙋 Q&A

[01] User–Item Semantic Tokenization

1. What are the key components of the UIST framework?

  • UIST comprises three main modules: two semantic tokenizers (for items and users) and a hierarchical mixture inference (HMI) module.
  • The semantic tokenizers transform dense, high-dimensional item and user embeddings into discrete tokens, achieving significant memory compression.
  • The HMI module dynamically weighs the different levels of granularity of user-item interactions to better integrate the hierarchical item and user tokens.

2. How does the discrete semantic tokenization work?

  • The semantic tokenization process has two stages:
    1. Semantic Representation: An autoencoder network is used to learn the contextual knowledge of the input sequence (item content or user behavior) and obtain a unified representation.
    2. Discrete Tokenization: The dense sequence representation is discretized into a short tuple of tokens using residual quantization (RQ-VAE), as sketched below.
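
A minimal PyTorch sketch of the residual quantization step follows. The class name, the hyperparameters, and the omission of RQ-VAE training machinery (straight-through gradients, commitment losses) are simplifications for illustration, not the paper's implementation:

```python
import torch
import torch.nn as nn

class ResidualQuantizer(nn.Module):
    """Maps a dense vector to a tuple of hierarchical token ids (coarse -> fine).
    Illustrative sketch only; hyperparameters are assumptions."""

    def __init__(self, num_levels: int = 3, codebook_size: int = 256, dim: int = 64):
        super().__init__()
        # One codebook per quantization level.
        self.codebooks = nn.ModuleList(
            [nn.Embedding(codebook_size, dim) for _ in range(num_levels)]
        )

    def forward(self, z: torch.Tensor):
        """z: (batch, dim) dense representation from the autoencoder."""
        residual = z
        tokens, quantized = [], torch.zeros_like(z)
        for codebook in self.codebooks:
            # Pick the codeword nearest to the current residual.
            distances = torch.cdist(residual, codebook.weight)  # (batch, codebook_size)
            ids = distances.argmin(dim=-1)
            chosen = codebook(ids)
            tokens.append(ids)
            quantized = quantized + chosen
            # The next level quantizes whatever this level failed to capture.
            residual = residual - chosen
        # tokens: (batch, num_levels) discrete semantic tokens.
        return torch.stack(tokens, dim=-1), quantized
```

Each item (or user) is then stored as just a handful of small integers, with the codebooks shared across the whole catalog, which is where the memory savings come from.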

3. What is the purpose of the hierarchical mixture inference (HMI) module?

  • The HMI module is designed to effectively utilize user-item pairs at different levels of granularity in the click-through rate prediction task.
  • It analyzes the contribution of each user-item token pair by constructing coarse-to-fine item and user embeddings based on the hierarchical tokens and using a deep CTR model to predict click scores.
  • The module then employs a linear layer to automatically weigh these per-level scores and compute the final click probability (see the sketch below).
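
The scoring-and-mixing flow can be made concrete with a short PyTorch sketch. The mean-pooling used to form the coarse-to-fine embeddings and all names here are our reading of the paper's description, not its exact implementation:

```python
import torch
import torch.nn as nn

class HierarchicalMixtureInference(nn.Module):
    """Scores each (user, item) granularity level with a shared deep CTR
    backbone, then mixes the per-level scores with a learned linear layer.
    Illustrative sketch only."""

    def __init__(self, ctr_model: nn.Module, num_levels: int):
        super().__init__()
        self.ctr_model = ctr_model             # any deep CTR model (e.g. DCN, DeepFM)
        self.mixer = nn.Linear(num_levels, 1)  # learned weights over per-level scores

    def forward(self, user_token_embs: torch.Tensor, item_token_embs: torch.Tensor):
        """Both inputs: (batch, num_levels, dim) token embeddings, ordered coarse -> fine."""
        num_levels = user_token_embs.size(1)
        level_scores = []
        for level in range(num_levels):
            # Coarse-to-fine embedding: pool all tokens up to this level.
            user_vec = user_token_embs[:, : level + 1].mean(dim=1)
            item_vec = item_token_embs[:, : level + 1].mean(dim=1)
            # ctr_model is assumed to return one logit per pair: (batch, 1).
            level_scores.append(self.ctr_model(torch.cat([user_vec, item_vec], dim=-1)))
        scores = torch.cat(level_scores, dim=-1)   # (batch, num_levels)
        return torch.sigmoid(self.mixer(scores))   # final click probability
```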

[02] Experiments and Evaluation

1. How did the authors evaluate the effectiveness of UIST?

  • The authors conducted offline experiments on a real-world news recommendation dataset (MIND), comparing UIST against three modern deep CTR models: DCN, DeepFM, and FinalMLP.
  • They evaluated recommendation effectiveness using AUC and nDCG metrics, and also measured the inference latency of each baseline (a toy metric computation follows below).
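
For readers unfamiliar with the two metrics, here is a toy computation on hypothetical labels and scores, using scikit-learn for AUC and a small hand-rolled nDCG helper:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def ndcg_at_k(relevances, k):
    """nDCG@k for one ranked list of relevance labels (toy helper)."""
    rel = np.asarray(relevances, dtype=float)[:k]
    dcg = np.sum(rel / np.log2(np.arange(2, rel.size + 2)))
    ideal = np.sort(np.asarray(relevances, dtype=float))[::-1][:k]
    idcg = np.sum(ideal / np.log2(np.arange(2, ideal.size + 2)))
    return dcg / idcg if idcg > 0 else 0.0

labels = [1, 0, 0, 1, 0]             # clicked / not clicked (hypothetical)
scores = [0.9, 0.3, 0.5, 0.7, 0.1]   # model click probabilities (hypothetical)
print("AUC:", roc_auc_score(labels, scores))
ranked = [label for _, label in sorted(zip(scores, labels), reverse=True)]
print("nDCG@5:", ndcg_at_k(ranked, 5))
```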

2. What were the key findings from the experiments?

  • The content-based paradigm exhibited latencies (over 60 ms) that are unacceptable for industrial scenarios.
  • The single-layered ID-based and embedding-based approaches had similar latency, but the embedding-based approaches performed better because they use content-derived item representations.
  • The proposed IST (item-only semantic tokenization) and UIST achieved substantial memory compression (around 200-fold) compared to the other paradigms, while retaining up to 99% (IST) and 98% (UIST) of the accuracy of the state-of-the-art embedding-based paradigm (an illustrative calculation follows this list).
  • The hierarchical mixture inference (HMI) module outperformed simpler aggregation mechanisms for dual tokens.
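
For intuition on where a compression ratio on the order of 200x can come from, here is a back-of-the-envelope comparison. The catalog size, embedding dimension, token count, and codebook sizes are all assumed for illustration and are not the paper's reported configuration:

```python
# Illustrative memory comparison: dense item embeddings vs. discrete tokens.
# All sizes below are assumptions for intuition, not the paper's numbers.
num_items = 1_000_000
emb_dim, bytes_per_float = 400, 4                # dense float32 embeddings
num_tokens, bytes_per_token = 3, 2               # a few uint16 token ids per item
codebooks = 3 * 256 * emb_dim * bytes_per_float  # shared codebooks, stored once

dense_bytes = num_items * emb_dim * bytes_per_float
token_bytes = num_items * num_tokens * bytes_per_token + codebooks

print(f"dense:     {dense_bytes / 2**20:8.1f} MiB")
print(f"tokenized: {token_bytes / 2**20:8.1f} MiB")
print(f"compression: ~{dense_bytes / token_bytes:.0f}x")  # roughly 200x here
```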

3. What are the key advantages of the semantic-based approach (UIST) compared to other paradigms?

  • UIST provides a streamlined approach to integrating item content into deep CTR models, offering significant improvements in efficiency, particularly in industrial scenarios.
  • The substantial memory compression achieved by UIST (around 200-fold) makes it a promising solution for applications that require both time and space efficiency, such as dataset compression.