Retrieval-Augmented Generation (RAG) is a common building block of AI software engineering, allowing language models to produce more context-aware responses. Many startups, like Wordsmith AI, are building their own RAG pipelines to enhance their AI applications with domain-specific knowledge. By incorporating additional information sources, a RAG pipeline lets a language model give more accurate, authoritative answers.

Quote

The most obvious solution is to input the additional information via a prompt; for example, by prompting “Using the following information: [input a bunch of data] please answer the question of [ask your question].”

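A minimal sketch of that prompt-stuffing approach, assuming the OpenAI Python client; the model name and the example context/question are illustrative placeholders, not from the source.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def answer_with_context(context: str, question: str) -> str:
    """Stuff the additional information directly into the prompt, as in the quote above."""
    prompt = (
        f"Using the following information: {context}\n\n"
        f"please answer the question of {question}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name; swap in whatever you use
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Example usage with made-up context
print(answer_with_context(
    context="Wordsmith AI's support portal is at support.example.com.",
    question="Where do I file a support ticket?",
))
```

This works while the extra data fits in the prompt window; once it outgrows that, you need retrieval to pick the relevant pieces first.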

Quote

Here are the steps to building a RAG pipeline:
Step 1: Take an inbound query and deconstruct it into relevant concepts
Step 2: Collect similar concepts from your data store
Step 3: Recombine these concepts with your original query to build a more relevant, authoritative answer.

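A toy end-to-end sketch of those three steps, using naive keyword overlap as the similarity measure and an in-memory list as the data store (both assumed here for illustration); a real pipeline would swap in a proper retriever such as BM25 or embeddings.

```python
import re
from typing import List

# Stand-in data store; in practice this is your indexed domain knowledge
DATA_STORE = [
    "Refunds are processed within 14 days of the return being received.",
    "Premium subscribers get priority email support.",
    "The API rate limit is 100 requests per minute per key.",
]

STOPWORDS = {"the", "a", "an", "of", "to", "is", "are", "what", "how", "do", "i"}

def extract_concepts(text: str) -> set:
    """Step 1: deconstruct the inbound query into relevant concepts (here: keywords)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}

def retrieve(concepts: set, store: List[str], top_k: int = 2) -> List[str]:
    """Step 2: collect similar concepts from the data store, ranked by keyword overlap."""
    scored = [(len(concepts & extract_concepts(doc)), doc) for doc in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]

def build_prompt(query: str, store: List[str]) -> str:
    """Step 3: recombine retrieved context with the original query."""
    context = "\n".join(retrieve(extract_concepts(query), store))
    return f"Using the following information:\n{context}\n\nPlease answer: {query}"

print(build_prompt("How long do refunds take?", DATA_STORE))
```

The resulting prompt is what gets sent to the language model, which now sees the relevant store entries alongside the original question.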

RAG for PDFs (as of Oct 2024)

  • don’t use OCR + LLM
  • for retrievers: use ColQwen with BM25 (see the sketch after this list)
  • for Q&A or summarization, use vision-language models
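Below is a hedged sketch of how such a hybrid PDF retriever could fuse the two signals, assuming rank_bm25 for the lexical side; colqwen_page_scores is a placeholder for a ColQwen-style multi-vector visual retriever (e.g. via the colpali-engine library), whose exact interface isn’t given in the source.

```python
from typing import List
from rank_bm25 import BM25Okapi  # pip install rank_bm25

def colqwen_page_scores(query: str, num_pages: int) -> List[float]:
    """Placeholder: score each PDF page image against the query with a ColQwen-style
    multi-vector visual retriever. Returns neutral scores until a real model is wired in."""
    return [0.0] * num_pages  # replace with actual ColQwen scoring

def normalize(scores: List[float]) -> List[float]:
    """Min-max normalize so lexical and visual scores are comparable."""
    lo, hi = min(scores), max(scores)
    return [0.0] * len(scores) if hi == lo else [(s - lo) / (hi - lo) for s in scores]

def hybrid_rank(query: str, page_texts: List[str], alpha: float = 0.5, top_k: int = 5) -> List[int]:
    """Rank PDF pages by a weighted mix of BM25 (over extracted page text) and visual scores."""
    bm25 = BM25Okapi([t.lower().split() for t in page_texts])
    lexical = normalize(list(bm25.get_scores(query.lower().split())))
    visual = normalize(colqwen_page_scores(query, len(page_texts)))
    fused = [alpha * v + (1 - alpha) * l for v, l in zip(visual, lexical)]
    return sorted(range(len(page_texts)), key=lambda i: fused[i], reverse=True)[:top_k]
```

The top-ranked page images would then go straight to a vision-language model for the Q&A or summarization step, rather than being OCR’d first.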
