gomate-community/awesome-papers-for-rag


awesome-papers-for-rag

A curated list of resources dedicated to retrieval-augmented generation (RAG).

Retrieval-augmented generation (RAG) combines the strengths of a retrieval system and a large language model (LLM) to generate high-quality answers for users.

RAG Framework

The Framework for RAG System

Typically, a RAG system consists of a set of modules. The input, output, and tasks of each module are described as follows:

| Components | Input | Output | Tasks |
| --- | --- | --- | --- |
| Intent Clarify | question | search queries | Query performance prediction, query (intent) classification, query expansion, etc. |
| Retrieval | question/queries | documents/passages | Ad-hoc retrieval, document retrieval, passage retrieval, etc. |
| Mediation | questions+documents | contexts | Re-ranking, context compression, post-retrieval, etc. |
| Generation | question+contexts | answer | Question answering, summarization, etc. |
| Result Enhancement | question+answer+contexts | answer | Claim verification, attribution, etc. |
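
To make the data flow between the five components concrete, here is a minimal sketch in Python. It is not taken from any of the listed papers or systems: every function name, the `Passage` type, and the toy corpus are hypothetical, retrieval is plain token overlap rather than a real ranker, and the generation step is a placeholder where an LLM call would normally go. Only the input and output of each stage follow the table above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Passage:
    doc_id: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip simple punctuation."""
    stripped = (tok.strip(".,?!").lower() for tok in text.split())
    return {tok for tok in stripped if tok}


def clarify_intent(question: str) -> list[str]:
    """Intent Clarify: question -> search queries (assumption: pass-through)."""
    return [question]


def retrieve(queries: list[str], corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Retrieval: queries -> passages, ranked by token overlap (toy scorer)."""
    query_tokens = set().union(*(tokenize(q) for q in queries))
    scored = [(len(query_tokens & tokenize(p.text)), p) for p in corpus]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable for ties
    return [p for _, p in scored[:k]]


def mediate(question: str, passages: list[Passage], max_chars: int = 200) -> list[str]:
    """Mediation: question + documents -> contexts (truncation stands in for compression)."""
    return [p.text[:max_chars] for p in passages]


def generate(question: str, contexts: list[str]) -> str:
    """Generation: placeholder for an LLM call; echoes the top-ranked context."""
    return contexts[0] if contexts else "I don't know."


def enhance(question: str, answer: str, contexts: list[str]) -> str:
    """Result Enhancement: append a citation marker for the supporting context."""
    for index, context in enumerate(contexts, start=1):
        if answer in context:
            return f"{answer} [{index}]"
    return answer


def answer_question(question: str, corpus: list[Passage]) -> str:
    """Run the five stages in the order given in the table."""
    queries = clarify_intent(question)
    passages = retrieve(queries, corpus)
    contexts = mediate(question, passages)
    answer = generate(question, contexts)
    return enhance(question, answer, contexts)


# Hypothetical toy corpus, for illustration only.
CORPUS = [
    Passage("d1", "RAG combines a retriever with a language model."),
    Passage("d2", "Bananas are rich in potassium."),
    Passage("d3", "A retriever returns passages relevant to a query."),
]

if __name__ == "__main__":
    print(answer_question("What does RAG combine?", CORPUS))
```

Each stage is a plain function, so any one of them can be swapped for a real implementation (for example a dense retriever or a re-ranker) without touching the others.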

Healthcheck

pip3 install -r requirements.txt
python3 healthcheck.py

Surveys for RAG

  • The Organization column records only the first author's organization.

| Date | Title | Organization | Code |
| --- | --- | --- | --- |
| 2024/09/16 | Trustworthiness in Retrieval-Augmented Generation Systems: A Survey | Tsinghua University | Code |
| 2024/09/10 | Graph Retrieval-Augmented Generation: A Survey | Peking University | Code |
| 2024/02/29 | Retrieval-Augmented Generation for AI-Generated Content: A Survey | Peking University | Code |
| 2024/01/03 | Retrieval-Augmented Generation for Large Language Models: A Survey | Tongji University | Code |
| 2024/01/03 | A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models | Islamic University of Technology | No |
| 2023/12/07 | Trends in Integration of Knowledge and Large Language Models: A Survey and Taxonomy of Methods, Benchmarks, and Applications | Harbin Institute of Technology | No |
| 2023/09/19 | The Rise and Potential of Large Language Model Based Agents: A Survey | Fudan NLP Group | Code |
| 2023/08/14 | Large Language Models for Information Retrieval: A Survey | Renmin University | Code |
| 2022/02/02 | A Survey on Retrieval-Augmented Text Generation | Nara Institute of Science and Technology | No |

Systems for RAG

  • The Organization column records only the first author's organization.

| Date | Title | Organization | Code |
| --- | --- | --- | --- |
| 2024/11/07 | LightRAG: Simple and Fast Retrieval-Augmented Generation | BUPT | Code |
| 2024/10/25 | StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization | ISCAS | Code |
| 2024/08/21 | RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation | Nanjing University | Code |
| 2024/07/11 | Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting | University of California, San Diego | No |
| 2024/06/19 | InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales | University of Virginia | Code |
| 2024/05/22 | FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research | Renmin University of China | Code |
| 2024/04/24 | From Local to Global: A Graph RAG Approach to Query-Focused Summarization | Microsoft | Code |
| 2023/11/22 | FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation | Google | Code |
| 2023/11/08 | PDFTriage: Question Answering over Long, Structured Documents | Stanford | Code |
| 2023/10/27 | WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia | Stanford | Code |
| 2023/10/27 | LeanDojo: Theorem Proving with Retrieval-Augmented Language Models | Caltech | Code |
| 2023/06/13 | WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences | Tsinghua University | Code |
| 2023/05/23 | WebCPM: Interactive Web Search for Chinese Long-form Question Answering | Tsinghua University | Code |
| 2022/06/01 | WebGPT: Browser-assisted question-answering with human feedback | OpenAI | No |

Evaluations for RAG

  • The Organization column records only the first author's organization.

| Date | Title | Organization | Code |
| --- | --- | --- | --- |
| 2024/10/10 | HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly | Princeton | Code |
| 2024/08/16 | RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models | NewsBreak | Code |
| 2024/08/16 | RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation | Amazon | Code |
| 2024/05/13 | Evaluation of Retrieval-Augmented Generation: A Survey | Tencent | Code |
| 2024/04/21 | Evaluating Retrieval Quality in Retrieval-Augmented Generation | UMASS | No |
| 2024/04/08 | FaaF: Facts as a Function for the evaluation of generated text | IMMO Capital | Code |
| 2024/02/19 | CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models | University of Science and Technology of China | Code |
| 2024/01/30 | RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture | Microsoft | No |
| 2024/01/11 | Seven Failure Points When Engineering a Retrieval Augmented Generation System | Applied Artificial Intelligence Institute | No |
| 2023/12/20 | Benchmarking Large Language Models in Retrieval-Augmented Generation | Chinese Information Processing Laboratory | Code |
| 2023/11/16 | ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems | Stanford | Code |
| 2023/11/14 | RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge | Peking University | No |
| 2023/10/31 | Enabling Large Language Models to Generate Text with Citations | Princeton University | Code |
| 2023/09/26 | RAGAS: Automated Evaluation of Retrieval Augmented Generation | Exploding Gradients | Code |
| 2021/08/05 | TruLens: Evaluation and Tracking for LLM Experiments | TruEra | Code |
