
RAG vs. GAR: A primer on Generation Augmented Retrieval

🌈 Abstract

The article discusses the drawbacks of Retrieval Augmented Generation (RAG) and proposes an alternative approach, Generation Augmented Retrieval (GAR), for a domain-specific search engine project.

🙋 Q&A

[01] Retrieval Augmented Generation (RAG)

1. What are the disadvantages of using RAG that prevented the author from using it in a recent project?

  • The same texts get regenerated over and over, even though the existing texts in the dataset are already well written
  • Conversational interfaces do not always offer the best user experience, and a list of well-designed, feature-rich, and interactive results is often more effective
  • There is a risk of "prompt hacking" and PR nightmares when launching a chatbot under your name or brand, as the LLM may generate problematic responses
  • RAG is only as good as the content fed into the chosen LLM's context, and retrieving content based on semantic similarity may not always be sufficient, especially for complex objects with many conditions and edge cases
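The retrieval step that RAG depends on is typically a similarity ranking over embeddings, which is exactly what the last point says can fall short. A minimal sketch using toy 3-dimensional vectors (a real system would use an embedding model; the `retrieve` function and document IDs are illustrative, not from the article):

```python
# Minimal sketch of similarity-based retrieval, the step RAG relies on.
# Embeddings here are toy 3-d vectors; in practice they would come from
# an embedding model.
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, docs, top_k=2):
    """Rank documents by cosine similarity to the query embedding."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked[:top_k]]

docs = [
    {"id": "doc-a", "vec": [0.9, 0.1, 0.0]},
    {"id": "doc-b", "vec": [0.1, 0.9, 0.0]},
    {"id": "doc-c", "vec": [0.8, 0.2, 0.1]},
]
print(retrieve([1.0, 0.0, 0.0], docs))  # → ['doc-a', 'doc-c']
```

Ranking by vector closeness alone has no notion of the "conditions and edge cases" the article mentions, which is the gap GAR tries to close by letting the LLM reason over the candidates.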

[02] Generation Augmented Retrieval (GAR)

1. How does GAR differ from RAG?

  • In GAR, the LLM is used primarily for its reasoning capabilities rather than for text generation: it acts as a smart, context-aware query engine that selects the content to be forwarded to the application

2. What are the key steps in the GAR approach described in the article?

  • The user creates a query (plain text plus two drop-downs in the case described)
  • The LLM receives the plain text query together with the most essential properties (ID, summary, detailed edge cases and conditions) of up to 100 database entries (pre-filtered by the drop-downs)
  • The LLM uses its reasoning capabilities to select which entries best match the provided information
  • The LLM returns the IDs of the selected entries as JSON, along with a small quote per entry explaining why the entry is a good fit
  • The application resolves the IDs to a well-designed, feature-rich, and interactive list of results
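The steps above can be sketched roughly as follows, with the LLM call stubbed out. All names (`build_prompt`, `call_llm`, `select_entries`) and the sample entry are illustrative assumptions, not from the article:

```python
# Sketch of the GAR flow: pack query + candidate entries into a prompt,
# ask the LLM to pick matches as JSON, then resolve IDs to full records.
import json

def build_prompt(query, entries):
    """Combine the query with each entry's essential properties."""
    lines = [f"Query: {query}", "Entries:"]
    for e in entries:
        lines.append(f"- id={e['id']} summary={e['summary']} conditions={e['conditions']}")
    lines.append('Return JSON: [{"id": "...", "quote": "..."}] for the best matches.')
    return "\n".join(lines)

def call_llm(prompt):
    # Stub standing in for a real LLM call; a real system would send
    # `prompt` to a model and parse its JSON reply.
    return '[{"id": "42", "quote": "covers rentals under 30 days"}]'

def select_entries(query, entries, db):
    """Ask the LLM which entries fit, then resolve IDs to full records."""
    selected = json.loads(call_llm(build_prompt(query, entries)))
    # Merge each resolved record with the LLM's explanatory quote,
    # ready for rendering as an interactive result list.
    return [{**db[s["id"]], "quote": s["quote"]} for s in selected]

db = {"42": {"id": "42", "summary": "Short-term rental policy"}}
entries = [{"id": "42", "summary": "Short-term rental policy",
            "conditions": "under 30 days"}]
print(select_entries("rental rules", entries, db))
```

Because the LLM only returns IDs and short quotes, the application keeps full control over what the user actually sees, which is the tight-output-control property the article highlights.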

3. What are the downsides of the GAR approach?

  • Speed: Especially when using more complex Chain-of-Thought prompting patterns, getting an answer might take a while, though this can be mitigated by using faster models
  • Limited context size: While contexts are getting larger, you still might need to filter the content you pass to the LLM
  • Less chattiness: In some situations, a bubbly chatbot is exactly what you want, though you can also show rich, interactive objects inside conversational interfaces

4. How does the author see the future of GAR?

  • With increasing inference speeds and larger context sizes, GAR will become more practical in the future
  • GAR is not a replacement for RAG but could be a good alternative when you need tight control over the outputs of an LLM without missing out on its reasoning abilities
Shared by Daniel Chen
© 2024 NewMotor Inc.