GPT-3 based Question Answering System that reads text from PDF, DOCX, or TXT files and answers questions based on the content.

obaskly/Docai

Docai

Docai is a GPT-3 based question answering system that answers questions about the content of PDF, DOCX, and TXT files.

Key Features • How To Use • Requirements • Copyright

Key Features

  • File handling
    • The script supports PDF, DOCX, and TXT files
    • File content is read using pdfplumber (PDF), docx (DOCX), and the built-in open() function (TXT)
  • GPT-3 integration
    • The script uses the OpenAI GPT-3 model, specifically the text-davinci-003 engine, to generate answers to questions.
  • Confidence scoring
    • The script calculates confidence scores for the generated answers using log probabilities returned by the GPT-3 API.
  • Concurrency
    • The script uses concurrent.futures.ThreadPoolExecutor to process questions concurrently, which can reduce the total waiting time.
  • Text preprocessing
    • The script splits the input document into chunks to fit within GPT-3's token limit, and post-processes the answers to remove duplicate sentences.
  • Saving conversation history
    • The script allows users to save the conversation history to a text file.
  • Caching
    • The script uses the functools.lru_cache decorator to cache the answers generated by GPT-3. If a user asks the same question again, the cached answer is returned instead of making another API call.
  • GUI
    • The script provides a graphical user interface built with the tkinter library and ttkthemes. Users can select a file, enter a question, view the answer, and save the conversation history.
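
The features above can be sketched in a few small functions. This is a minimal illustration, not the code from the Docai script: every function name here is hypothetical, the word-based chunk size is an arbitrary stand-in for a real token count, and ask_model is a stub that replaces the actual GPT-3 request so the example runs offline. How the script combines answers from several chunks is not stated, so joining them and averaging token log probabilities is an assumption.

```python
import math
import re
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from typing import List, Tuple


def read_text(path: str) -> str:
    """Return the plain text of a PDF, DOCX, or TXT file."""
    lower = path.lower()
    if lower.endswith(".pdf"):
        import pdfplumber  # third-party, imported only when needed
        with pdfplumber.open(path) as pdf:
            # extract_text() can return None for pages without text
            return "\n".join(page.extract_text() or "" for page in pdf.pages)
    if lower.endswith(".docx"):
        import docx  # provided by the python-docx package
        return "\n".join(p.text for p in docx.Document(path).paragraphs)
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def split_into_chunks(text: str, max_words: int = 1500) -> List[str]:
    """Split text into chunks of at most max_words words.

    Word count is only a rough proxy for tokens; 1500 is an assumed value.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def remove_duplicate_sentences(answer: str) -> str:
    """Keep the first occurrence of each sentence, ignoring case."""
    seen = set()
    kept = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        key = sentence.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence.strip())
    return " ".join(kept)


def confidence_from_logprobs(token_logprobs: List[float]) -> float:
    """Map token log probabilities to a 0..1 score: exp(mean log prob)."""
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def ask_model(question: str, chunk: str) -> Tuple[str, List[float]]:
    """Hypothetical stub for the GPT-3 request.

    A real version would send the chunk and question to the model and
    return the answer text with its token log probabilities.
    """
    return "Stub answer to: " + question, [-0.1, -0.3]


@lru_cache(maxsize=128)
def answer_question(question: str, document: str) -> Tuple[str, float]:
    """Answer one question over all chunks; repeated calls hit the cache."""
    answers = []
    logprobs = []
    for chunk in split_into_chunks(document):
        text, token_logprobs = ask_model(question, chunk)
        answers.append(text)
        logprobs.extend(token_logprobs)
    combined = remove_duplicate_sentences(" ".join(answers))
    return combined, confidence_from_logprobs(logprobs)


def answer_all(questions: List[str], document: str) -> List[Tuple[str, float]]:
    """Process several questions concurrently, preserving their order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda q: answer_question(q, document), questions))
```

Calling answer_all(["What is the topic?"], read_text("notes.txt")) would return a list of (answer, confidence) pairs, where notes.txt is a placeholder file name. Because answer_question is wrapped in lru_cache, asking the same question about the same document a second time returns the stored result without calling ask_model again.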

How To Use

  • Put your OpenAI API key on line 45 of the script
  • Run the script
  • Select your file
  • Enter your question and click submit

It's as simple as that.
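
The first step stores the API key directly in the script. As an optional alternative that is not part of the Docai script, the key could be read from an environment variable so that it never ends up in the source file. The helper name load_api_key is hypothetical, and the default variable name OPENAI_API_KEY is an assumption about how your environment is set up.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of a failed API request
        raise RuntimeError("Set the " + env_var + " environment variable first")
    return key
```

The returned string could then replace the hardcoded key on line 45.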

Note: We will provide an executable version soon.

Requirements

pip install openai pdfplumber python-docx ttkthemes

Copyright

All rights reserved to Bropocalypse Team.
