RAG for a beginner by ChatGPT

RAG stands for Retrieval Augmented Generation. It is an AI technique that retrieves relevant information from an integrated knowledge base; the LLM then uses the retrieved information as context to generate a response to the user's input.

Image generated by AI (Grok AI)

There are no tricks behind the title of this story; it is just what I did. As a beginner, I did not know where to start, so I began exploring some options to learn about RAG. I came across LangChain, FAISS, LangFlow, and other such libraries. Yet I wasn't able to tie up all the loose ends, so I sought the knowledge of ChatGPT :D
ChatGPT responded with the complete code to implement RAG using open-source libraries and then pass the retrieved context to the LLM (via the Groq API for llama3-70b). Let us go through the code and understand it.

Folder Structure

rag_groq/
│
├── main.py                  # Entry point
├── ingest.py                # Load & embed PDF content
├── query.py                 # Ask questions using Groq API
├── config.py                # Configuration (Groq key, model, etc.)
├── sample.pdf               # Your test document
└── requirements.txt         # Required Python libraries
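
For reference, requirements.txt would list the libraries this setup depends on. The article does not reproduce it in this part, but based on the imports used below, a plausible set (my assumption, not ChatGPT's original output) is:

# requirements.txt (sketch; exact packages/pins are an assumption)
langchain
langchain-community
sentence-transformers
faiss-cpu
pypdf
groq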

config.py

# config.py
GROQ_API_KEY = "gsk_*"
GROQ_MODEL = "llama3-70b-8192"  # or another model hosted on Groq
EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
VECTOR_DB_PATH = "vector_store"

A Groq API key is required to run inference with the llama3 model. We also define the embedding model that converts the document content, split into chunks, into embeddings. The chunks are indexed with FAISS, and the index is saved to the vector DB path defined here.
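
ingest.py itself is not reproduced in this part of the article, but going by the description above, a minimal sketch of build_vector_store using LangChain's PyPDFLoader, HuggingFace embeddings, and FAISS could look like this (the loader choice and chunk sizes are my assumptions):

# ingest.py (sketch, not the article's original file)
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from config import EMBEDDING_MODEL, VECTOR_DB_PATH

def build_vector_store(pdf_path):
    # Load the PDF and split it into overlapping chunks.
    docs = PyPDFLoader(pdf_path).load()
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(docs)
    # Embed each chunk, build a FAISS index, and persist it to disk.
    embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL)
    db = FAISS.from_documents(chunks, embeddings)
    db.save_local(VECTOR_DB_PATH)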

main.py

from ingest import build_vector_store
from query import query_pdf

if __name__ == "__main__":
    # The original listing was truncated after print("; the following
    # completion is an assumption: build the index once, then answer
    # a single question from the terminal.
    print("Building vector store from sample.pdf ...")
    build_vector_store("sample.pdf")
    question = input("Ask a question about the PDF: ")
    print(query_pdf(question))
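
query.py is also not shown in this part of the article. A minimal sketch of query_pdf, assuming the official groq Python client and LangChain's FAISS loader (the prompt wording and the choice of k=3 retrieved chunks are mine), might look like this:

# query.py (sketch, not the article's original file)
from groq import Groq
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from config import GROQ_API_KEY, GROQ_MODEL, EMBEDDING_MODEL, VECTOR_DB_PATH

def query_pdf(question):
    # Load the persisted FAISS index and retrieve the top matching chunks.
    embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL)
    db = FAISS.load_local(VECTOR_DB_PATH, embeddings,
                          allow_dangerous_deserialization=True)  # newer LangChain versions require this flag
    docs = db.similarity_search(question, k=3)
    context = "\n\n".join(d.page_content for d in docs)
    # Pass the retrieved context to llama3 on Groq as grounding material.
    client = Groq(api_key=GROQ_API_KEY)
    response = client.chat.completions.create(
        model=GROQ_MODEL,
        messages=[
            {"role": "system", "content": "Answer using only the given context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

This is the whole RAG loop in one function: embed the question, find the nearest chunks in the FAISS index, and hand those chunks to the LLM as context so the answer is grounded in the PDF rather than the model's training data.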