Summarize by Aili

What Nobody Tells You About RAGs

https://towardsdatascience.com/what-nobody-tells-you-about-rags-b35f017e1570

🌈 Abstract

A deep dive into the challenges and best practices of building a Retrieval Augmented Generation (RAG) system for real-world business scenarios, covering the business value, data handling, and technical optimizations required.

🙋 Q&A

[01] Clarify the business value from the start: the context, the users, and the data

1. What are some key business requirements to consider before starting a RAG/LLM-based project?

Clarify the context and understand the users' main business issues that the RAG system can help address
Educate non-technical users on the capabilities and limitations of generative AI
Understand the user journey and how the RAG system will integrate into existing workflows
Anticipate the data to be indexed, qualify it, and map it to the users' needs
Define clear success criteria and metrics to evaluate the project's ROI

[02] Understand what you're indexing

1. What are the different data modalities that can be indexed in a RAG system?

Text data
Images and diagrams
Tables
Code snippets

2. How can these multimodal data sources be combined in a RAG system?

Text data is chunked and embedded using a text embedding model
Tables are summarized with an LLM, and their descriptions are embedded and used for indexing
Code snippets are chunked and embedded using a text embedding model
Images are converted into embeddings using a multimodal vision and language model

[03] Improve chunk quality — garbage in, garbage out

1. What are some tips for improving the quality of text chunks in a RAG system?

Leverage document metadata like table of contents, titles, or headers to provide contextually relevant chunks
Adjust chunk size based on the characteristics of the data (e.g., longer chunks for wordy documents, shorter chunks for bullet-point style)
Explore semantic chunking techniques to generate chunks that are semantically relevant

[04] Improve pre-retrieval

1. What are some key pre-retrieval techniques to consider?

Query rewriting: Use an LLM to rephrase the user's query to improve clarity and specificity
Query expansion with Hypothetical Document Embedding (HyDE): Generate a hypothetical answer and use it to retrieve more relevant documents
Query augmentation: Combine the original query with the preliminary generated outputs to retrieve more relevant information

[05] Improve retrieval

1. What are some techniques for improving the retrieval step in a RAG system?

Hybrid search: Combine vector search and keyword search to leverage the advantages of both
Filter on metadata: Use additional metadata properties to pre-filter the vector space and improve relevance
Test multiple embedding models and fine-tune them for domain-specific data

[06] Improve post-retrieval

1. What are some post-retrieval techniques to increase the relevancy of the retrieved documents?

Reranking: Re-order the retrieved documents based on their alignment with the query
Remove irrelevant chunks: Use an LLM to filter out unimportant sections or chunks from the retrieved documents

[07] An overlooked part: Generation

1. What are some tips for enhancing the answer generation step in a RAG system?

Define a system prompt to guide the LLM's behavior and writing style
Include few-shot examples in the system prompt to provide context for complex tasks
Force the LLM to generate structured outputs when appropriate
Leverage techniques like Chain of Thought to improve reasoning and summarization in the generated answers

Shared by Daniel Chen ·

Install fromChrome Web Store