Summarized by Aili

Semantic Chunking for RAG

🌈 Abstract

The article discusses the concept of "Semantic Chunking" for Retrieval Augmented Generation (RAG) models. It explores different chunking strategies, including fixed-size chunking, recursive chunking, document-specific chunking, semantic chunking, and agentic chunking. The focus is on experimenting with semantic chunking and recursive retriever approaches.

🙋 Q&A

[01] What is Chunking?

  • Chunking refers to the process of breaking down text into smaller pieces so that each piece fits within the context window of the Large Language Model (LLM).
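A minimal sketch of this idea, where whitespace-separated words stand in for real tokenizer tokens (a production pipeline would count tokens with the model's own tokenizer; `chunk_text` is an illustrative helper, not from the article):

```python
def chunk_text(text, max_tokens=8):
    # Approximate "tokens" with whitespace words and emit fixed-size slices.
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]

doc = "one two three four five six seven eight nine ten"
chunks = chunk_text(doc, max_tokens=4)
# Each chunk holds at most 4 "tokens".
```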

[02] What is RAG?

  • Retrieval Augmented Generation (RAG) is a technique introduced to address the problem of hallucination in LLMs, where the models confidently generate wrong answers. RAG involves encoding textual documents into vector embeddings and storing them in a vector store. The encoded chunks are then retrieved and used to augment the generation process.
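The encode-store-retrieve flow can be sketched as follows. To stay self-contained, a toy bag-of-words counter stands in for a real embedding model (a real RAG system would use a learned embedder such as a sentence-transformers model and a proper vector database); only the overall flow is faithful to the description above:

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": word-count vector. Placeholder for a real model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Encode document chunks and keep them in an in-memory "vector store".
chunks = [
    "Paris is the capital of France",
    "The mitochondria is the powerhouse of the cell",
]
store = [(chunk, embed(chunk)) for chunk in chunks]

# 2. At query time, retrieve the most similar chunk to ground the LLM's answer.
query = embed("what is the capital of France")
best = max(store, key=lambda item: cosine(query, item[1]))[0]
```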

[03] What are the different chunking methods discussed?

  • Fixed Size Chunking: A straightforward approach where the text is divided into chunks of a fixed number of tokens, with optional overlap between chunks.
  • Recursive Chunking: An iterative approach that divides the input text into smaller chunks using a set of separators, recursively calling the process until the desired chunk size or structure is achieved.
  • Document Specific Chunking: An approach that considers the structure of the document and creates chunks that align with the logical sections, such as paragraphs or subsections.
  • Semantic Chunking: An approach that divides the text into meaningful, semantically complete chunks by considering the relationships within the text.
  • Agentic Chunking: An approach that processes documents in a way that mimics how humans would, starting at the top and deciding whether a new sentence or piece of information belongs to the current chunk or should start a new one.
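The recursive method can be sketched as below: try separators in order of coarseness and re-split only the pieces that are still too large. This mirrors the idea behind LangChain's RecursiveCharacterTextSplitter, but the function and its defaults here are illustrative assumptions, not the library's implementation (note that splitting on `". "` drops the separator itself, which a real splitter would preserve):

```python
def recursive_chunk(text, max_len=50, separators=("\n\n", "\n", ". ", " ")):
    # Base case: the piece fits, or we have no separators left to try.
    if len(text) <= max_len or not separators:
        return [text]
    sep, rest = separators[0], separators[1:]
    if sep not in text:
        # Current separator is absent; fall through to the next, finer one.
        return recursive_chunk(text, max_len, rest)
    chunks = []
    for part in text.split(sep):
        if len(part) <= max_len:
            chunks.append(part)
        else:
            # Recurse with finer separators on oversized pieces only.
            chunks.extend(recursive_chunk(part, max_len, rest))
    return chunks
```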

[04] How is Semantic Chunking implemented?

  • Semantic chunking involves:
    • Splitting the document into sentences based on separators (., ?, !)
    • Indexing each sentence based on position
    • Grouping sentences based on similarity of their embeddings
    • Merging groups of similar sentences and splitting dissimilar sentences
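The steps above can be sketched in a few lines. This simplified version only compares each sentence with its immediate predecessor and again substitutes a toy word-count vector for a real sentence-embedding model; the threshold value is an arbitrary assumption for illustration:

```python
import math
import re
from collections import Counter

def embed(sentence):
    # Toy word-count "embedding"; a real pipeline would call an embedding model.
    return Counter(re.findall(r"\w+", sentence.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunk(text, threshold=0.3):
    # 1./2. Split into sentences on ., ?, ! -- position in the list is the index.
    sentences = [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]
    # 3./4. Keep merging a sentence into the current group while it is
    # similar enough to the previous one; otherwise start a new group.
    groups = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(cur)) >= threshold:
            groups[-1].append(cur)
        else:
            groups.append([cur])
    return [" ".join(g) for g in groups]
```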

[05] What is the comparison between Semantic Chunking and Naive Chunking?

  • The article compares the performance of Semantic Chunking and Naive Chunking (using RecursiveCharacterTextSplitter) using the RAGAS evaluation framework.
  • The results show that Semantic Chunking and Naive Chunking have similar performance, with Naive Chunking having a slightly better score for factual representation of the answer.

[06] What is the purpose of Ablation Studies?

  • Ablation studies are used to understand the impact of different components or settings of a machine learning model on its performance. In the context of the article, ablation studies are used to evaluate the effect of the number of training steps and masking procedures on the performance of the BERT model.
Shared by Daniel Chen ·
© 2024 NewMotor Inc.