Awesome Retrieval‑Augmented Generation (RAG)
Proudly sponsored by CustomGPT.ai • Join the Slack community
CustomGPT.ai is a no-code platform for building enterprise-grade RAG applications with citation-backed answers and no hallucinations. It offers SOC-2 Type II security, GDPR compliance, and support for over 1400 document formats and 92 languages.
Retrieval‑Augmented Generation (RAG) equips language models with fresh, domain‑specific knowledge by fetching external context at inference time. This list is a one‑stop catalogue of every major RAG‑related resource—tools, papers, benchmarks, tutorials, and more.
Descriptions are kept very short and are included only when essential for clarity. PRs welcome!
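For readers new to the idea, the retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration only: the word-overlap scorer stands in for a real embedding search, and the helper names (`tokenize`, `retrieve`, `build_prompt`) and the sample documents are hypothetical, not taken from any listed tool.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query (toy stand-in for vector search)."""
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, context_docs):
    """Place the retrieved context ahead of the question, as a RAG prompt would."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical mini knowledge base.
documents = [
    "RAG fetches external context at inference time.",
    "Vector databases store embeddings for similarity search.",
    "Chunking splits long documents into smaller passages.",
]
query = "What does RAG fetch at inference time?"
top_docs = retrieve(query, documents, k=1)
prompt = build_prompt(query, top_docs)
print(prompt)
```

In a real pipeline the prompt would then be sent to a language model, and the scorer would be replaced by one of the embedding models and vector stores listed below.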
Table of Contents
- Open Source Tools
- Embedding Models & Libraries
- Proprietary Tools
- Vendor Examples
- Vector DBs & Search Engines
- Research Papers and Surveys
- RAG Approaches and Architectures
- Frameworks
- RAG Techniques and Methodologies
- Retrieval Methods
- Prompting Strategies
- Chunking & Pre‑processing
- Embeddings Models
- Instruction Tuning & Optimization
- Finetuning and Training
- Response Quality and Hallucination
- Security and Privacy Considerations
- Evaluation Metrics and Benchmarks
- Advantages and Disadvantages
- Performance, Cost & Observability
- RAG Fine-tuning
- Knowledge‑Graph / Structured RAG
- Libraries and SDKs
- Key Concepts
- Educational Content
- Influential Researchers and Influencers
- Latest Trends 2024-2025
- Community Resources
Open Source Tools
- CustomGPT.ai - Open-source SDK for building custom RAG applications with enterprise-grade features
- TrustGraph - Open-source enterprise-grade complete AI solution stack for data sovereignty
- RAGFlow - Open-source RAG engine based on deep document understanding
- R2R (RAG to Riches) - Advanced AI retrieval system with production-ready features
- FastRAG - Research framework for efficient retrieval augmented generation
- FlashRAG - Python toolkit for RAG research with 36+ datasets and 17+ algorithms
- Verba - Open-source RAG application out of the box
- Kotaemon - Clean, customizable RAG UI for document-based Q&A
- Cognita - Open-source RAG framework for modular applications
- GraphRAG - Microsoft's approach to RAG using knowledge graphs
- Nano-GraphRAG - Compact GraphRAG solution with core capabilities
- LangChain — Python/JS agents & chains
- LangChain4j — JVM
- LlamaIndex — Data loaders & indices
- Haystack — Modular pipelines
- Semantic Kernel — .NET & Python
- DSPy — Declarative pipelines
- Guidance — Prompt DSL
- Flowise — No‑code builder
- reag — Reasoning Augmented Generation
- Danswer — Internal Q&A search
- Neum — Creation and synchronization of vector embeddings at large scale
- GPTCache — Embedding‑aware cache
- Mastra - TypeScript AI agent framework with assistants, RAG, and observability; supports any LLM (GPT-4, Claude, Gemini, Llama)
- Letta (MemGPT) — Stateful apps
- Swiftide - Fast, streaming indexing, query, and agentic LLM applications in Rust
- LangGraph — Agentic DAGs
- Ragna — RAG orchestration framework
- SimplyRetrieve - Lightweight chat AI platform featuring custom knowledge.
Embedding Models & Libraries
- OpenAI text‑embedding‑3
- Cohere Embed v3
- FlagEmbedding
- Jina Embeddings v4
- E5/GTE (MTEB)
- SentenceTransformers
- MiniLM
- ColBERT v2
- Voyage AI Embeddings
- BGE family
- Nomic Embed Text
- fastText — char n‑gram baseline
Proprietary Tools
- CustomGPT.ai RAG API — Enterprise agents, hallucination free.
- Pinecone - Fully managed vector database service
- LangSmith - Platform for building and evaluating LLM applications
- OpenAI Assistants & Retrieval
- Vectara — GenAI API
- Cohere RAG
- AWS Knowledge Bases for Bedrock
- Azure AI Search + RAG
- Google Vertex AI Search & RAG
- IBM watsonx.ai Retrieval
- NVIDIA NeMo Retriever
- Anthropic Claude Retrieval
- Databricks DBRX RAG
- Elastic Search Labs RAG blueprints
Vendor Examples
- Amazon Kendra - Intelligent enterprise search with RAG
- Amazon Bedrock Knowledge Bases
- Azure AI Search
- Google Vertex AI Search
- LangChain × OpenAI Quickstart
- LangChain × Elasticsearch Blueprint
- LlamaIndex × Vespa Guide
- Qdrant Hybrid Search miniCOIL
- AWS Bedrock RAG Sample
- Azure RAG Jumpstart
- GCP Vertex RAG Agent Builder
Other Tools
- LangFuse: Open-source tool for tracking LLM metrics, observability, and prompt management.
- Ragas: Framework that helps evaluate RAG pipelines.
- LangSmith: Platform for building production-grade LLM applications that lets you closely monitor and evaluate them.
- Hugging Face Evaluate: Tool for computing metrics like BLEU and ROUGE to assess text quality.
- Weights & Biases: Tracks experiments, logs metrics, and visualizes performance.
Vector DBs & Search Engines
Pick a vector db - GUIDE
- Weaviate - Open-source vector database with GraphQL interface
- Qdrant - High-performance vector similarity search engine
- Milvus - Open-source vector database for scalable similarity search
- Chroma - Open-source embedding database for LLM applications
- Pinecone - The vector database
- Elasticsearch (vector) - distributed search and analytics engine
- OpenSearch - Open source distributed and RESTful search engine
- Vespa - AI + Data, online
- PGVector - PostgreSQL extension for vector similarity search
- Redis Stack Search - Searching and querying Redis data using the Redis Query Engine
- ClickHouse Vectors
- Oracle AI Vector Search
- TiDB Vector - semantic similarity searches across various data types
- ScaNN (Scalable Nearest Neighbors) - Method for efficient vector similarity search at scale
- Lantern.dev - open-source Postgres vector database
- Azure Cosmos DB: Globally distributed, multi-model database service with integrated vector search.
- Couchbase: A distributed NoSQL cloud database.
- LlamaIndex: Employs a straightforward in-memory vector store for rapid experimentation.
- Neo4j: Graph database management system.
- Redis Stack: An in-memory data structure store used as a database, cache, and message broker.
- SurrealDB: A scalable multi-model database optimized for time-series data.
Research Papers and Surveys
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks - Original RAG paper by Patrick Lewis et al.
- REALM: Retrieval-Augmented Language Model Pre-Training - Google's foundational retrieval-augmented language model
- Dense Passage Retrieval for Open-Domain Question Answering - Facebook's DPR system for dense retrieval
- Retrieval-Augmented Generation for Large Language Models: A Survey - Comprehensive survey covering Naive RAG, Advanced RAG, and Modular RAG
- A Comprehensive Survey of Retrieval-Augmented Generation (RAG) - 2024 survey tracing RAG evolution from foundational concepts to current state
- Retrieval-Augmented Generation for AI-Generated Content: A Survey - Comprehensive review of RAG techniques for AIGC scenarios
- Evaluation of Retrieval-Augmented Generation: A Survey - Comprehensive overview of RAG evaluation methodologies
- (2020) Retrieval‑Augmented Generation for Knowledge‑Intensive NLP Tasks - Lewis et al. — RAG baseline
- (2020) REALM - Guu et al. — Retriever‑augmented pre‑training
- (2022) Atlas - Izacard & Grave — Few‑shot RAG
- (2022) RETRO - Borgeaud et al. — Large‑scale retrieval cache
- (2024) Benchmarking LLMs in RAG - Chen et al.
- (2024) Reliable, Adaptable & Attributable LMs with Retrieval - Asai et al.
- (2024) GraphRAG - Microsoft Research
- (2024) RAG‑Fusion - Rackauckas
- (2025) Look‑ahead Retrieval - OpenAI
More - RAG Research Papers Collection - Curated list from ICML, ICLR, ACL
RAG Survey 2022
RAG Survey 2023
- Retrieving Multimodal Information for Augmented Generation: A Survey
- Retrieval-Augmented Generation for Large Language Models: A Survey
RAG Survey 2024
- Retrieval-Augmented Generation for AI-Generated Content: A Survey
- A Survey on Retrieval-Augmented Text Generation for Large Language Models
- RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing
- A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models
- Evaluation of Retrieval-Augmented Generation: A Survey
- Retrieval-Augmented Generation for Natural Language Processing: A Survey
- Graph Retrieval-Augmented Generation: A Survey
- Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely
- A Comprehensive Survey of Retrieval-Augmented Generation (RAG): Evolution, Current Landscape and Future Directions
RAG Approaches and Architectures
- Fusion-in-Decoder (FiD)
- RETRO (Retrieval-Enhanced Transformer) - DeepMind's approach with trillions of tokens
- Atlas: Few-shot Learning with Retrieval Augmented Language Models - Meta's Atlas model for few-shot learning
- ColBERT: Efficient Late Interaction Retrieval - Multi-vector dense retrieval with late interaction
- Cache-Augmented Generation (CAG) – Pre-loads pertinent documents into the model’s context and retains the key-value (KV) cache from earlier inferences.
- Agentic RAG – “Retrieval agents” that autonomously decide how and when to retrieve information.
- Corrective RAG (CRAG) – Adds a refinement step to fix or polish retrieved content before it is woven into the LLM’s answer.
- Retrieval-Augmented Fine-Tuning (RAFT) – Fine-tunes language models specifically to boost both retrieval quality and generation performance.
- Self-Reflective RAG – Systems that monitor their own outputs and dynamically adjust retrieval strategies based on feedback.
- RAG Fusion – Generates multiple query variants and merges their ranked results (for example with reciprocal rank fusion) to supply richer, more relevant context.
- Temporal Augmented Retrieval (TAR) – Incorporates time-aware signals so retrieval favors the most temporally relevant data.
- Plan-then-RAG (PlanRAG) – Creates a high-level plan first, then executes retrieval-augmented generation for complex tasks.
- GraphRAG – Leverages knowledge graphs to structure context and enhance reasoning.
- FLARE – Uses active, iterative retrieval to progressively improve answer quality.
- Contextual Retrieval – Enriches document chunks with added context before retrieval, improving relevance from large knowledge bases.
- GNN-RAG – Applies graph neural networks to retrieval for better reasoning in large-language-model workflows.
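Several approaches in this list, RAG Fusion among them, need to merge more than one ranked result list into a single ranking. One widely used way to do that is reciprocal rank fusion (RRF), sketched below. The function name and the sample document IDs are hypothetical, and `k=60` is only a commonly used default, not a value prescribed by any entry above.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken alphabetically by ID.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for three rewrites of the same user question.
rankings = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

RRF only looks at ranks, so it can combine retrievers whose raw scores are not comparable, such as BM25 and cosine similarity.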
Frameworks
- LangChain - Framework for building LLM applications with chaining capabilities
- LlamaIndex - Framework for connecting custom data sources to LLMs
- Haystack - End-to-end framework for building production-ready LLM applications
- DSPy - Framework for programming language models with automatic optimization
- Dify - Open-source LLM app development platform with RAG pipeline
- Semantic Kernel - Microsoft's SDK for developing Generative AI applications
- Flowise - Drag & drop UI to build customized LLM flows
- Cognita: Open-source RAG framework for building modular and production ready applications.
- Verba: Open-source application for RAG out of the box.
- Mastra: Typescript framework for building AI applications.
- Letta: Open source framework for building stateful LLM applications.
- Swiftide: Rust framework for building modular, streaming LLM applications.
- CocoIndex: ETL framework to index data for AI, such as RAG; with realtime incremental updates.
RAG Techniques and Methodologies
- HyDE (Hypothetical Document Embeddings) - Uses LLMs to generate hypothetical documents for queries
- FLARE (Forward-Looking Active REtrieval) - Iteratively retrieves relevant documents based on prediction confidence
- Self-RAG - Trains LLMs to adaptively retrieve passages and self-critique
- CRAG (Corrective Retrieval Augmented Generation) - Improves generation robustness with retrieval evaluator
- RAG Techniques Repository - Curated collection of 30+ advanced RAG techniques with implementations
- Design and Evaluation of RAG Solutions - Comprehensive guide following best practices
- LangChain RAG Best Practices - Evaluation and comparison of different RAG architectures
- RAG Triad Methodology - Context relevance, groundedness, and answer relevance framework
- Agentic RAG
- Corrective RAG (CRAG)
- Cache‑Augmented Generation
- Temporal‑Aware RAG
- Plan‑then‑RAG - A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers
- RePlug — Retriever‑aware generation
- RETRO — Retro‑fitted retrieval
- Streaming RAG — Low latency
Multimodal RAG
- Multimodal RAG with CLIP - Text-Image retrieval using CLIP
- SAM-RAG - Self-adaptive multimodal RAG framework
- ColPali - Efficient document retrieval with vision language models
- Building Multimodal RAG Systems
Graph-based RAG
- Microsoft GraphRAG - Knowledge graph approach to RAG (research: GraphRAG Paper)
- Knowledge Graph Integration for RAG
- Neo4j GraphRAG - Building knowledge graphs for RAG
Retrieval Methods
Dense Retrieval
Sparse Retrieval
- SPLADE: Sparse Lexical and Expansion Model - Neural sparse retrieval with term expansion
- HNSW vs DiskANN
Hybrid Search
- Hybrid Search: Combining Dense and Sparse Retrieval - Implementation guide for hybrid search systems
- Dense‑Sparse‑Dense (DSD)
- Advanced Reranking Techniques - Guide to implementing cross-encoder reranking.
More here: All RAG Reranking (GitHub)
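As a rough illustration of what combining dense and sparse retrieval can look like, the sketch below min-max normalizes each retriever's scores and blends them with a weight `alpha`. This is one simple convention among several; the function names, the `alpha` parameter, and the sample scores are assumptions made for the example and do not mirror any specific engine's API.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores are equal, so no document is preferred over another.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def hybrid_scores(dense, sparse, alpha=0.5):
    """Blend normalized dense and sparse scores.

    alpha=1.0 uses only the dense scores and alpha=0.0 only the sparse ones.
    A document missing from one retriever contributes 0 for that side.
    """
    dense_n = min_max_normalize(dense)
    sparse_n = min_max_normalize(sparse)
    doc_ids = set(dense_n) | set(sparse_n)
    return {
        doc_id: alpha * dense_n.get(doc_id, 0.0)
        + (1 - alpha) * sparse_n.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


# Hypothetical scores: cosine similarity (dense) and BM25 (sparse).
dense = {"a": 0.9, "b": 0.5, "c": 0.1}
sparse = {"b": 12.0, "c": 3.0, "d": 6.0}
combined = hybrid_scores(dense, sparse, alpha=0.5)
ranked = sorted(combined, key=lambda doc_id: (-combined[doc_id], doc_id))
print(ranked)
```

The top candidates from a blend like this are typically what gets passed on to a cross-encoder reranker.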
Other Techniques
- RAG Fusion
- Sentence Window Retrieval
- Cross‑Encoder Re‑Ranking
- Gemini Small‑to‑Big Retriever
- Multi‑Vector Retrieval
- Negative PRF (Pseudo‑Relevance)
Prompting Strategies
- RAG Prompt Engineering Guide (DAIR.AI) - Comprehensive guide to prompt engineering for RAG systems
- LangChain RAG Prompt Hub - Collection of tested RAG prompt templates
- Efficient Prompt Engineering for RAG - Strategies for optimizing prompts in RAG systems
- Secure RAG applications using prompt engineering on Amazon Bedrock - Best practices for RAG prompts with security considerations
- Zero‑Shot / Few‑Shot
- Chain‑of‑Thought (CoT)
- Meta Prompting
- Generated Knowledge Prompting
- ReAct
- Reflexion
- Automatic Prompt Engineer (APE)
- Directional Stimulus Prompting (DSP)
- Chain‑of‑Verification (CoVe)
- Self‑Consistency
- Prompt Compression
- Dynamic / Adaptive Prompts
- System → Retrieval → User triple‑prompt
- GraphPrompt
- Emerging RAG & Prompt Engineering Architectures for LLMs
- How to Cut RAG Costs by 80% Using Prompt Compression
Chunking & Pre‑processing
- 11 Chunking Strategies for RAG — Simplified & Visualized - Comprehensive guide covering 11 chunking methods with visual comparisons
- 5 Levels of Text Splitting - Hierarchical approach to chunking from basic to advanced
- Semantic Chunking with LlamaIndex - Implementation guide for semantic-based document splitting
- Optimizing Retrieval-Augmented Generation with Advanced Chunking Techniques - Research on optimal chunk sizes for different use cases
- CharacterTextSplitter — fixed‑size
- RecursiveTextSplitter
- SentenceSplitter (LlamaIndex)
- Unstructured‑IO loaders
- LoRA Chunking - Fused kernel chunk loss to include LoRA to reduce memory, support DeepSpeed ZeRO3
- Semantic chunking video
- Agentic chunking demo - The 5 Levels Of Text Splitting For Retrieval
- Chunking Strategies for LLM Applications
- Evaluating the Ideal Chunk Size for a RAG System using LlamaIndex
- How to Chunk Text Data — A Comparative Analysis
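The simplest strategy covered by the resources above is fixed-size chunking with overlap. The sketch below is a plain-Python approximation of that idea, not the implementation used by any of the listed splitters; the function name and the default sizes are hypothetical.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap.

    Consecutive chunks share `overlap` characters, so a sentence cut at a
    chunk boundary still appears whole in at least one neighbouring chunk
    when it is shorter than the overlap.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(chunks)
```

Character-based splitting ignores sentence and section boundaries, which is the gap the recursive and semantic strategies in this section aim to close.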
Comparison Guides
- Vector Database Comparison: Pinecone vs Weaviate vs Chroma - Comprehensive enterprise-focused comparison with performance metrics
- Top Vector Database for RAG: Qdrant vs Weaviate vs Pinecone - Performance comparison of 6 vector databases for RAG workloads
Embeddings Models
- Embedding Model Comparison: OpenAI vs Cohere vs Open Source - Comprehensive evaluation of commercial and open-source embedding models
- Best Embedding Model — OpenAI / Cohere / Google / E5 / BGE - Detailed comparison of top embedding models with performance metrics
- Matryoshka Embeddings for RAG - Implementing variable-size embeddings for efficiency
- BGE M3 and SPLADE Implementation Guide - Guide to implementing sparse and dense embeddings
Instruction Tuning & Optimization
- RA‑DIT
- InstructRetro
- FLARE / Active RAG
- UltraFeedback — RLHF on RAG
- DSI‑T — Decoder‑only retrieval
Finetuning and Training
- Fine-Tuning Llama 2.0 with Single GPU Magic
- Practitioners guide to fine-tune LLMs for domain-specific use case
- Are You Pre-training your RAG Models on Your Raw Text?
- Combine Multiple LoRA Adapters for Llama 2
- RAG vs Finetuning — Which Is the Best Tool to Boost Your LLM Application?
Response Quality and Hallucination
- RAGTruth: A Hallucination Corpus - Dataset with 18,000 RAG responses and hallucination annotations
- Reducing Hallucination in Structured Outputs via RAG
- WhyLabs AI Control Center - Platform for real-time guardrails and monitoring
- Vectara Hallucination Score
- Prompt‑Injection Defense
- OpenAI Function Calling JSON Schema
- Harmless RLHF pipelines
- Chain-of-Verification Reduces Hallucination in LLMs
- How to Detect Hallucinations in LLMs
- Measuring Hallucinations in RAG Systems
Security and Privacy Considerations
- OWASP Top 10 for LLM Applications - Comprehensive security framework covering RAG vulnerabilities
- CSA RAG Security Best Practices - Enterprise-grade security controls for RAG
- Microsoft Presidio for PII Protection - Framework for detecting and anonymizing sensitive information
- LLM Guard - Security toolkit for protecting LLM applications
- Masking PII Data in RAG Pipeline
- Hijacking Chatbots: Dangerous Methods Manipulating GPTs
- Guardrails AI - Framework for implementing security guardrails
- NVIDIA NeMo Guardrails - Comprehensive toolkit for building programmable guardrails
- NeMo Guardrails: The Missing Manual
- Safeguarding LLMs with Guardrails
Evaluation Metrics and Benchmarks
- RAGAS (Retrieval-Augmented Generation Assessment) - Reference-free evaluation framework with component-level metrics
- TruLens - Comprehensive evaluation and tracking for LLM applications
- DeepEval - Open-source evaluation framework for LLMs
- Arize Phoenix - Open-source observability platform
- RAGBench - 100k examples across 5 industry domains
- BeIR - Benchmark for zero-shot evaluation of information retrieval
- MTEB - Massive Text Embedding Benchmark
- ARES - Automated Evaluation of RAG Systems
- RGB Benchmark - Implementation of "Benchmarking Large Language Models in Retrieval-Augmented Generation"
- LlamaIndex RAG eval - Evaluation and benchmarking are crucial in developing LLM applications
Blogs
- RAG Evaluation
- Evaluating RAG: A journey through metrics
- Exploring End-to-End Evaluation of RAG Pipelines
- Evaluation Driven Development, the Swiss Army Knife for RAG Pipelines
- Evaluating the Ideal Chunk Size for a RAG System using LlamaIndex
RAG Benchmark 2023
- Benchmarking Large Language Models in Retrieval-Augmented Generation
- RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge
- ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems
- RAGAS: Automated Evaluation of Retrieval Augmented Generation
RAG Benchmark 2024
- CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models
- FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation
- CodeRAG-Bench: Can Retrieval Augment Code Generation?
- Long2RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall
Advantages and Disadvantages
- Advantages overview
- Disadvantages & pitfalls
- RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study
Performance, Cost & Observability
- Vector Database Optimization - Techniques for efficient vector storage and retrieval
- Hybrid Retrieval Strategies - Combining multiple retrieval methods for better performance
- Chunking Optimization - Strategies for optimal text segmentation
- LangFuse
- LangSmith
- Helicone — telemetry
- WandB RAG guide
- OpenLLMetry - Open-source observability for your LLM application, based on OpenTelemetry
- Cost optimisation tips
Cost Calculators
- RAG Cost Calculator - Tool for estimating and optimizing RAG pipeline costs
- RAG Savings Calculator
- RAG Cost Calculator
RAG Fine-tuning
- RAFT (Retrieval Augmented Fine-Tuning) - Adapting Language Model to Domain Specific RAG
- Fine-tuning vs RAG Guide - Comprehensive comparison and guidance
- Direct Preference Optimization (DPO) for RAG - Alternative to RLHF for aligning RAG outputs
Knowledge‑Graph / Structured RAG
- DBpedia - Structured knowledge from Wikipedia
- Wikidata - Community-maintained knowledge base
- ConceptNet - Large-scale commonsense knowledge graph
- YAGO - High-quality knowledge base
- Neo4j LLM Knowledge Graph Builder
- Neo4j RAG blog
- GraphRAG site
- NebulaGraph Graph‑RAG article
Libraries and SDKs
- Sentence Transformers - Python framework for sentence, text and image embeddings
- LiteLLM - Python SDK for 100+ LLM APIs in OpenAI format
- AI SDK - TypeScript toolkit for building AI applications
- Hugging Face Transformers - State-of-the-art ML for PyTorch, TensorFlow, and JAX
Key Concepts
- Hugging Face Transformers - RAG Documentation
- RAG-Survey GitHub Repository - Curated collection of RAG papers with taxonomy
Educational Content
Courses and Tutorials
- Building Advanced RAG (DeepLearning.AI)
- RAG from Scratch (FreeCodeCamp)
- IBM Generative AI and RAG Course (Coursera)
- MAGMaR 2024 — Multimodal Augmented Generation (NeurIPS)
- ACL 2024 Knowledgeable LMs Tutorial
- SIGIR 2023 Generative IR Workshop
- Modular RAG and RAG Flow (Yunfan Gao, 2024) - Tutorial: Blog I and Blog II
- Stanford CS25: V3 I Retrieval Augmented Language Models (Douwe Kiela, 2023) - Lecture: Video
- Building RAG-based LLM Applications for Production (Anyscale, 2023) - Tutorial: Blog
- Multi-Vector Retriever for RAG on tables, text, and images (LangChain, 2023) - Tutorial: Blog
- Retrieval-based Language Models and Applications (Asai et al., 2023) - Tutorial: ACL Website and Video
- Advanced RAG Techniques: an Illustrated Overview (Ivan Ilin, 2023) - Tutorial: Blog
- Retrieval Augmented Language Modeling (Melissa Dell, 2023) - Lecture: Video
- RAG from scratch
- Building RAG Applications for Production
- Stanford CS25: V3 I Retrieval Augmented Language Models
- Ray Summit 2024: Production RAG Pipelines
- Haystack RAG Workshop
- Azure Cognitive Search RAG Tutorial
Blogs and Articles
- RAG Implementation with LangChain and Weaviate - From theory to Python implementation
- Advanced RAG Techniques: An Illustrated Overview
- The RAGOps Stack: Critical Components
- Knowledge Graphs for RAG
- RAG Intuitively & Exhaustively Explained
- RAG in Production: 9 Lessons
- Reranking vs Embeddings on Cursor
- Forget RAG, Think RAG‑Fusion
- Hidden Costs of RAG
Newsletters & Forums
- ragaboutit - A blog and newsletter focused specifically on RAG news, tutorials, and insights, making it a dedicated resource for staying up-to-date.
- r/LangChain and r/rag - Reddit communities for practical discussions, troubleshooting, and sharing projects. These are valuable for seeing what challenges other developers are facing in real time.
Talks and Conferences
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (NeurIPS 2020)
- Self-RAG: Learning to Retrieve, Generate, and Critique (ICLR 2024)
- RAG Research Papers Collection - Curated list from ICML, ICLR, ACL
Influential Researchers and Influencers
- Patrick Lewis - Lead author of original RAG paper, AI Research Scientist at Cohere
- Sebastian Riedel - Co-author of RAG paper, Professor at UCL and DeepMind
- Douwe Kiela - Co-author of RAG paper, CEO of Contextual AI
- Gautier Izacard - Author of FiD and Atlas papers, Meta AI
- Kelvin Guu - Lead author of REALM paper, Google Research
- Douwe Kiela — Modular RAG, Stanford
- Matei Zaharia — DSPy, Databricks
- Akari Asai — Dense retrieval research
- Jerry Liu — LlamaIndex
- Harrison Chase — LangChain
- Andrej Karpathy — LLM systems
- Jeff Dean — Google Research
- Artem Yankov — Qdrant
- Alden Do Rosario - RAG influencer, CEO of CustomGPT.ai
Latest Trends 2024-2025
- RAG-as-a-Service market at $1.2B (2024)
- Projected 49.1% CAGR through 2030
- On-device RAG for privacy
Community Resources
- r/LocalLLaMA - 493k members
- r/MachineLearning - Active RAG discussions
- r/RAG - Dedicated RAG subreddit
Discord
- RAG TAGG Discord - 2,492 members
- Vectara RAGTime Bot
- RAGHub Community
GitHub Communities
Existing Collections
- Awesome-RAG (awesome-rag)
- RAG Resources (mrdbourke)
- RAG Techniques (NirDiamant)
- Danielskry / Awesome‑RAG
- frutik / Awesome‑RAG
- coree / awesome‑rag
- jxzhangjhu / Awesome‑LLM‑RAG
- SJTU‑DMTai / awesome‑rag
- lucifertrj / Awesome‑RAG
Contributing
Contributions are welcome! Please read the contribution guidelines before submitting a pull request.
License
This collection is licensed under MIT.