Discrete Semantic Tokenization for Deep CTR Prediction
🌈 Abstract
The paper introduces a semantic-token paradigm for click-through rate (CTR) prediction that incorporates item content information while remaining both time- and space-efficient. The proposed approach, User-Item Semantic Tokenization (UIST), converts user behavior sequences and item content into short sequences of discrete tokens, yielding substantial memory compression over existing embedding-based approaches.
🙋 Q&A
[01] User–Item Semantic Tokenization
1. What are the key components of the UIST framework?
- UIST comprises three main modules: two semantic tokenizers (for items and users) and a hierarchical mixture inference (HMI) module.
- The semantic tokenizers transform dense, high-dimensional item and user embeddings into discrete tokens, achieving significant memory compression.
- The HMI module dynamically weighs user-item interactions at different levels of granularity, improving the integration of the hierarchical item and user tokens.
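As a rough illustration of how the three modules compose, the sketch below threads dense item and user embeddings through tokenizers into an HMI head. Every name and the stub components are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def uist_predict(item_content_emb, user_behavior_emb,
                 item_tokenizer, user_tokenizer, hmi):
    """End-to-end UIST flow: dense embeddings -> discrete tokens -> click score."""
    item_tokens = item_tokenizer(item_content_emb)    # e.g. (t1, t2, t3)
    user_tokens = user_tokenizer(user_behavior_emb)   # e.g. (u1, u2, u3)
    return hmi(item_tokens, user_tokens)              # click probability

# Toy stand-ins so the flow runs end to end; the real modules are learned networks.
stub_tokenizer = lambda emb: tuple(int(v) % 256 for v in np.abs(emb[:3]) * 100)
stub_hmi = lambda it, ut: 1.0 / (1.0 + np.exp(-np.dot(it, ut) / 1e4))

rng = np.random.default_rng(0)
print(uist_predict(rng.normal(size=8), rng.normal(size=8),
                   stub_tokenizer, stub_tokenizer, stub_hmi))
```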
2. How does the discrete semantic tokenization work?
- The semantic tokenization process has two stages:
  - Semantic Representation: an autoencoder network learns the contextual knowledge of the input sequence (item content or user behavior) and produces a unified dense representation.
  - Discrete Tokenization: the dense sequence representation is discretized into concise tokens with a residual quantization technique (RQ-VAE).
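To make the tokenization stage concrete, here is a minimal inference-time sketch of residual quantization in the spirit of RQ-VAE: each level's codebook quantizes the residual left by the previous level, so the resulting tokens form a coarse-to-fine code of the input vector. The codebook sizes, depth, and greedy nearest-neighbor assignment are illustrative assumptions; the paper's tokenizers are trained jointly with the autoencoder.

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Greedy residual quantization (inference-time sketch).

    x:         (dim,) dense vector to tokenize.
    codebooks: list of (K, dim) arrays, one per quantization level.
    Returns the discrete tokens and the reconstructed vector.
    """
    residual = x.copy()
    tokens, recon = [], np.zeros_like(x)
    for codebook in codebooks:
        # Pick the codeword closest to the current residual.
        idx = int(np.argmin(np.linalg.norm(codebook - residual, axis=1)))
        tokens.append(idx)
        recon += codebook[idx]
        residual -= codebook[idx]   # the next level quantizes what is left
    return tokens, recon

rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(256, 64)) for _ in range(3)]  # 3 levels, 256 codes each
x = rng.normal(size=64)
tokens, recon = residual_quantize(x, codebooks)
print(tokens)  # three small integers replace a 64-dim float vector
```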
3. What is the purpose of the hierarchical mixture inference (HMI) module?
- The HMI module is designed to exploit user-item token pairs at different levels of granularity in the click-through rate prediction task.
- It constructs coarse-to-fine item and user embeddings from the hierarchical tokens and scores each granularity level with a deep CTR model to obtain per-level click scores.
- A linear layer then automatically weighs these scores to compute the final click probability.
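The following is a minimal sketch of that inference logic, assuming three token levels and using a dot product as a stand-in for the deep CTR backbone; all dimensions, names, and the scoring function are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class HMISketch(nn.Module):
    """Hypothetical hierarchical mixture inference head (not the authors' code)."""
    def __init__(self, num_levels=3, codebook_size=256, dim=64):
        super().__init__()
        self.item_emb = nn.ModuleList(
            nn.Embedding(codebook_size, dim) for _ in range(num_levels))
        self.user_emb = nn.ModuleList(
            nn.Embedding(codebook_size, dim) for _ in range(num_levels))
        self.mixer = nn.Linear(num_levels, 1)  # weighs the per-level click scores

    def forward(self, item_tokens, user_tokens):
        # item_tokens, user_tokens: (batch, num_levels) integer token ids.
        scores = []
        for l in range(len(self.item_emb)):
            # Coarse-to-fine: accumulate embeddings of levels 0..l.
            iv = sum(self.item_emb[k](item_tokens[:, k]) for k in range(l + 1))
            uv = sum(self.user_emb[k](user_tokens[:, k]) for k in range(l + 1))
            # Dot product stands in for a deep CTR model scoring the pair.
            scores.append((iv * uv).sum(dim=-1))
        level_scores = torch.stack(scores, dim=-1)           # (batch, num_levels)
        return torch.sigmoid(self.mixer(level_scores)).squeeze(-1)

hmi = HMISketch()
item_toks = torch.randint(0, 256, (4, 3))
user_toks = torch.randint(0, 256, (4, 3))
print(hmi(item_toks, user_toks))  # four click probabilities
```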
[02] Experiments and Evaluation
1. How did the authors evaluate the effectiveness of UIST?
- The authors conducted offline experiments on a real-world news recommendation dataset (MIND), comparing UIST against the ID-based, content-based, and embedding-based paradigms on three modern deep CTR models: DCN, DeepFM, and FinalMLP.
- They evaluated the recommendation effectiveness using AUC and nDCG metrics, and also measured the inference time (latency) of each baseline.
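For reference, both metrics can be computed per impression with scikit-learn; the snippet below uses synthetic labels and scores (not the paper's data) just to show the shape of such an evaluation.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, ndcg_score

# One impression: binary click labels and model scores for 6 candidate items.
labels = np.array([0, 1, 0, 0, 1, 0])
scores = np.array([0.12, 0.81, 0.33, 0.05, 0.64, 0.27])

auc = roc_auc_score(labels, scores)
ndcg5 = ndcg_score(labels[None, :], scores[None, :], k=5)  # expects 2D inputs
print(f"AUC={auc:.3f}  nDCG@5={ndcg5:.3f}")
# Dataset-level numbers average these per-impression values (the usual MIND convention).
```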
2. What were the key findings from the experiments?
- The content-based paradigm exhibited latencies (over 60 ms) that are unacceptable for industrial scenarios.
- The ID-based and embedding-based approaches, both single embedding-lookup designs, had similar latency, but the embedding-based approaches were more accurate because their item representations are derived from content.
- The proposed IST (item-only semantic tokenization) and UIST achieved roughly 200-fold memory compression relative to the other paradigms while retaining up to 99% (IST) and 98% (UIST) of the accuracy of the state-of-the-art embedding-based paradigm; a back-of-envelope sketch of where such savings come from follows this list.
- The hierarchical mixture inference (HMI) module outperformed simpler mechanisms for aggregating the dual (user and item) token scores.
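As a back-of-envelope illustration of where a compression ratio of that order can come from (all sizes here are assumptions, not the paper's exact settings): an embedding table stores d float32 values per item, whereas semantic tokenization stores a few one-byte token ids per item plus small shared codebooks whose cost is amortized over the whole catalog.

```python
num_items = 1_000_000           # catalog size (assumed)
dim = 256                       # embedding dimension (assumed)
levels, codebook_size = 3, 256  # token levels and codewords per level (assumed)

embedding_bytes = num_items * dim * 4                 # float32 embedding table
token_bytes = (num_items * levels * 1                 # one-byte ids (256 codes/level)
               + levels * codebook_size * dim * 4)    # shared float32 codebooks

print(f"{embedding_bytes / token_bytes:.0f}x")        # ~270x under these assumptions
```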
3. What are the key advantages of the semantic-based approach (UIST) compared to other paradigms?
- UIST provides a streamlined approach to integrating item content into deep CTR models, offering significant improvements in efficiency, particularly in industrial scenarios.
- The substantial memory compression achieved by UIST (around 200-fold) makes it a promising solution for applications that require both time and space efficiency, such as dataset compression.