RAG

Retrieval-Augmented Generation (RAG) is a technique that enhances LLMs by retrieving relevant information from external knowledge bases before generating an answer.

Evaluating a RAG system is a two-step process:

1. Retrieval: Did the system find the right document?
2. Generation: Did the system answer the question correctly using that document?
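The two questions above can be checked separately for each test case. The snippet below is only a minimal sketch of that idea, not how any particular platform scores results: the helper names (`normalize`, `retrieval_hit`, `generation_correct`) are hypothetical, and it assumes a simple substring match against the expected answer, which is far cruder than an LLM-based or semantic grader.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def retrieval_hit(context: str, ground_truth: str) -> bool:
    """Step 1 (retrieval): does the retrieved text contain the expected answer?"""
    return normalize(ground_truth) in normalize(context)


def generation_correct(output: str, ground_truth: str) -> bool:
    """Step 2 (generation): does the model's response contain the expected answer?"""
    return normalize(ground_truth) in normalize(output)


# Illustrative test case (made-up example data).
context = "Paris is the capital and largest city of France."
ground_truth = "Paris"
output = "The capital of France is Paris."

retrieval_ok = retrieval_hit(context, ground_truth)
generation_ok = generation_correct(output, ground_truth)
print(f"retrieval: {retrieval_ok}, generation: {generation_ok}")
```

Scoring the two steps separately makes it easier to tell whether a wrong answer came from a bad retrieval or from the model misusing a good document.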

Datasets

To evaluate a RAG system, you'll need to build a dataset. You can create this in a spreadsheet (Google Sheets, Excel) or using Pandas.

Your dataset should include the following columns:

| Column | Type | Description |
| input | Required | The user message or question. |
| context | Required | The retrieved text or documents used to answer the question. |
| ground_truth | Required | The ideal or correct answer. |
| system_prompt | Optional | The instructions given to the model. |
| output | Optional | The actual response from the model. If not provided, you can select a model in the platform to generate answers for you. |
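If you go the Pandas route, a dataset with these columns might be assembled roughly as follows. This is a sketch under assumptions: the example rows are made up for illustration, the `output` column is deliberately left empty so that answers can be generated later, and exporting to CSV is assumed to be an acceptable format rather than a documented requirement.

```python
import pandas as pd

REQUIRED_COLUMNS = ["input", "context", "ground_truth"]
OPTIONAL_COLUMNS = ["system_prompt", "output"]

# Illustrative rows only; replace with your own questions and documents.
rows = [
    {
        "input": "What is the capital of France?",
        "context": "Paris is the capital and largest city of France.",
        "ground_truth": "Paris",
        "system_prompt": "Answer using only the provided context.",
    },
    {
        "input": "Who wrote Pride and Prejudice?",
        "context": "Pride and Prejudice is an 1813 novel by Jane Austen.",
        "ground_truth": "Jane Austen",
        "system_prompt": "Answer using only the provided context.",
    },
]

# Columns missing from the rows (here: output) are created and left empty.
df = pd.DataFrame(rows, columns=REQUIRED_COLUMNS + OPTIONAL_COLUMNS)

# Fail early if any required field is missing in any row.
incomplete = df[REQUIRED_COLUMNS].isna().any(axis=1)
if incomplete.any():
    raise ValueError(f"Rows with missing required fields: {list(df.index[incomplete])}")

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

To write a file instead, pass a path of your choosing to `df.to_csv(...)`.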