This project implements a Retrieval-Augmented Generation (RAG) pipeline that answers queries over a corpus of books. It uses LangChain for data loading and preprocessing, OpenAI's embedding models to map text into vector space, and Pinecone for vector database operations. At query time, the most relevant passages are retrieved and passed to GPT-4, which generates contextually grounded answers.
- Data Loading and Preprocessing:
- The text data from books is loaded using PyPDFDirectoryLoader from LangChain, which handles the extraction of text from PDF files stored in a directory structure.
- Text data is then segmented into manageable chunks using LangChain's RecursiveCharacterTextSplitter, which recursively splits on paragraph, sentence, and word boundaries until each chunk fits a target size, optionally with overlap between consecutive chunks (see the sketch below).
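A minimal sketch of this step, assuming the PDFs live in a local `books/` directory and using illustrative chunking parameters (exact import paths vary across LangChain versions):

```python
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load every PDF in the directory; each page becomes one Document.
loader = PyPDFDirectoryLoader("books/")  # "books/" is an assumed path
documents = loader.load()

# Split pages into overlapping chunks; the sizes here are illustrative.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)
```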
- Embedding Text into Vector Space:
- Each text chunk is embedded into a high-dimensional vector space using OpenAI's text-embedding-3-small model. This transformation enables semantic search over the text data (sketched below).
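Continuing the sketch above, the chunks can be embedded as follows. This assumes an `OPENAI_API_KEY` environment variable is set; by default, text-embedding-3-small produces 1536-dimensional vectors.

```python
from langchain_openai import OpenAIEmbeddings

# Requires the OPENAI_API_KEY environment variable to be set.
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Embed the raw text of each chunk into a list of float vectors.
vectors = embeddings.embed_documents([chunk.page_content for chunk in chunks])
```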
- Storing and Retrieving Data:
- The embedded chunks are upserted into a Pinecone index. At query time, the user's query is embedded with the same model and the most similar chunks are retrieved via vector similarity search (see the sketch below).
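A sketch of this step using LangChain's Pinecone integration. The index name `books` is an assumption; the index must already exist with a dimension matching the embedding model (1536 for text-embedding-3-small by default), and `PINECONE_API_KEY` must be set in the environment.

```python
from langchain_pinecone import PineconeVectorStore

# Embed and upsert the chunks into an existing Pinecone index.
# "books" is an assumed index name; PINECONE_API_KEY must be set.
vectorstore = PineconeVectorStore.from_documents(
    chunks,
    embedding=embeddings,
    index_name="books",
)

# Retrieve the k chunks most similar to the query.
retrieved = vectorstore.similarity_search("What themes recur across the books?", k=4)
```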
- Generating Responses:
- The retrieved chunks are supplied as context to GPT-4, which generates an answer grounded in the retrieved passages rather than in the model's parametric knowledge alone (sketched below).
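One way to wire this up, continuing the sketches above with a hand-rolled prompt; the prompt wording and temperature are illustrative assumptions, not the project's exact implementation.

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4", temperature=0)  # temperature is illustrative

question = "What themes recur across the books?"
context = "\n\n".join(doc.page_content for doc in retrieved)

# Ask GPT-4 to answer strictly from the retrieved context.
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {question}"
)
response = llm.invoke(prompt)
print(response.content)
```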
- LangChain: For loading and preprocessing text data from books.
- OpenAI text-embedding-3-small: For converting text chunks into embeddings.
- Pinecone: For storing and retrieving vector data efficiently.
- OpenAI GPT-4: For generating text based on retrieved context.