Simple RAG From Scratch#

In this tutorial, we will use BGE, Faiss, and OpenAI’s GPT-4o-mini to build a simple RAG system from scratch.

0. Preparation#

Install the required packages in the environment:

%pip install -U numpy faiss-cpu FlagEmbedding openai

1. Data#

Suppose I’m a resident of New York Manhattan, and I want the AI bot to provide suggestion on where should I go for dinner. It’s not reliable to let it recommend some random restaurant. So let’s provide a bunch of our favorate restaurants.

corpus = [
    "Cheli: A downtown Chinese restaurant presents a distinctive dining experience with authentic and sophisticated flavors of Shanghai cuisine. Avg cost: $40-50",
    "Masa: Midtown Japanese restaurant with exquisite sushi and omakase experiences crafted by renowned chef Masayoshi Takayama. The restaurant offers a luxurious dining atmosphere with a focus on the freshest ingredients and exceptional culinary artistry. Avg cost: $500-600",
    "Per Se: A midtown restaurant features daily nine-course tasting menu and a nine-course vegetable tasting menu using classic French technique and the finest quality ingredients available. Avg cost: $300-400",
    "Ortomare: A casual, earthy Italian restaurant locates uptown, offering wood-fired pizza, delicious pasta, wine & spirits & outdoor seating. Avg cost: $30-50",
    "Banh: Relaxed, narrow restaurant in uptown, offering Vietnamese cuisine & sandwiches, famous for its pho and Vietnam sandwich. Avg cost: $20-30",
    "Living Thai: An uptown typical Thai cuisine with different kinds of curry, Tom Yum, fried rice, Thai ice tea, etc. Avg cost: $20-30",
    "Chick-fil-A: A Fast food restaurant with great chicken sandwich, fried chicken, fries, and salad, which can be found everywhere in New York. Avg cost: 10-20",
    "Joe's Pizza: Most famous New York pizza locates midtown, serving different flavors including classic pepperoni, cheese, spinach, and also innovative pizza. Avg cost: $15-25",
    "Red Lobster: In midtown, Red Lobster is a lively chain restaurant serving American seafood standards amid New England-themed decor, with fair price lobsters, shrips and crabs. Avg cost: $30-50",
    "Bourbon Steak: It accomplishes all the traditions expected from a steakhouse, offering the finest cuts of premium beef and seafood complimented by wine and spirits program. Avg cost: $100-150",
    "Da Long Yi: Locates in downtown, Da Long Yi is a Chinese Szechuan spicy hotpot restaurant that serves good quality meats. Avg cost: $30-50",
    "Mitr Thai: An exquisite midtown Thai restaurant with traditional dishes as well as creative dishes, with a wonderful bar serving cocktails. Avg cost: $40-60",
    "Yichiran Ramen: Famous Japenese ramen restaurant in both midtown and downtown, serving ramen that can be designed by customers themselves. Avg cost: $20-40",
    "BCD Tofu House: Located in midtown, it's famous for its comforting and flavorful soondubu jjigae (soft tofu stew) and a variety of authentic Korean dishes. Avg cost: $30-50",
]

user_input = "I want some Chinese food"

2. Indexing#

Now we need to figure out a fast but powerful enough method to retrieve docs in the corpus that are most closely related to our questions. Indexing is a good choice for us.

The first step is embed each document into a vector. We use bge-base-en-v1.5 as our embedding model.

from FlagEmbedding import FlagModel

model = FlagModel('BAAI/bge-base-en-v1.5',
                  query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
                  use_fp16=True)

embeddings = model.encode(corpus, convert_to_numpy=True)
embeddings.shape
(14, 768)

Then, let’s create a Faiss index and add all the vectors into it.

If you want to know more about Faiss, refer to the tutorial of Faiss and indexing.

import faiss
import numpy as np

index = faiss.IndexFlatIP(embeddings.shape[1])

index.add(embeddings)
index.ntotal
14

3. Retrieve and Generate#

Now we come to the most exciting part. Let’s first embed our query and retrieve 3 most relevant document from it:

q_embedding = model.encode_queries([user_input], convert_to_numpy=True)

D, I = index.search(q_embedding, 3)
res = np.array(corpus)[I]

res
array([['Cheli: A downtown Chinese restaurant presents a distinctive dining experience with authentic and sophisticated flavors of Shanghai cuisine. Avg cost: $40-50',
        'Da Long Yi: Locates in downtown, Da Long Yi is a Chinese Szechuan spicy hotpot restaurant that serves good quality meats. Avg cost: $30-50',
        'Yichiran Ramen: Famous Japenese ramen restaurant in both midtown and downtown, serving ramen that can be designed by customers themselves. Avg cost: $20-40']],
      dtype='<U270')

Then set up the prompt for the chatbot:

prompt="""
You are a bot that makes recommendations for restaurants. 
Please be brief, answer in short sentences without extra information.

These are the restaurants list:
{recommended_activities}

The user's preference is: {user_input}
Provide the user with 2 recommended restaurants based on the user's preference.
"""

Fill in your OpenAI API key below:

import os

os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY"

Finally let’s see how the chatbot give us the answer!

from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": prompt.format(user_input=user_input, recommended_activities=res)
        }
    ]
).choices[0].message
print(response.content)
1. Cheli - Authentic Shanghai cuisine with sophisticated flavors.  
2. Da Long Yi - Szechuan spicy hotpot with good quality meats.