PIPELINE:
1. Pick a language
2. Provide the input as an image, text, or audio
3. Pass the input to the chatbot, which explains the words and structures
4. Parse the input into words and create a flashcard for each word
5. Show the flashcards and offer the option to add them to Anki
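
A rough sketch of how the steps above could fit together, in Python. Everything here is an assumption for illustration: the names `run_pipeline`, `extract_words`, `Flashcard`, and `flashcards_to_tsv` are made up, the `explain`, `ocr`, and `transcribe` callables are placeholders for whatever chatbot, OCR, and speech models end up being used, and the regex tokenizer is naive (it will not segment languages written without spaces). For step 5 it only writes tab-separated text, which Anki can import as a text file; showing the cards in a UI is left out.

```python
import csv
import io
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Flashcard:
    front: str  # the word
    back: str   # the explanation returned by the chatbot


def extract_words(text: str) -> list[str]:
    """Naive tokenizer (assumption): lowercase, split on non-word
    characters, drop duplicates while keeping first-seen order."""
    seen: dict[str, None] = {}
    for word in re.findall(r"\w+", text.lower()):
        seen.setdefault(word, None)
    return list(seen)


def run_pipeline(
    language: str,                              # step 1: chosen language
    kind: str,                                  # "text", "image", or "audio"
    payload,                                    # the raw input for that kind
    explain: Callable[[str, str], str],         # placeholder chatbot call
    ocr: Optional[Callable] = None,             # placeholder image-to-text call
    transcribe: Optional[Callable] = None,      # placeholder audio-to-text call
):
    # Step 2: turn whatever came in into plain text.
    if kind == "text":
        text = payload
    elif kind == "image" and ocr is not None:
        text = ocr(payload)
    elif kind == "audio" and transcribe is not None:
        text = transcribe(payload)
    else:
        raise ValueError(f"no handler available for input kind {kind!r}")

    # Step 3: ask the chatbot to explain the whole input.
    explanation = explain(text, language)

    # Step 4: one flashcard per distinct word.
    cards = [Flashcard(w, explain(w, language)) for w in extract_words(text)]
    return explanation, cards


def flashcards_to_tsv(cards: list[Flashcard]) -> str:
    """Step 5 (partial): serialize cards as tab-separated front/back rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for card in cards:
        writer.writerow([card.front, card.back])
    return buffer.getvalue()


if __name__ == "__main__":
    # Stand-in for the chatbot so the sketch runs without any model.
    fake_explain = lambda text, lang: f"[{lang}] {text}"
    explanation, cards = run_pipeline("Spanish", "text", "Hola mundo, hola!", fake_explain)
    print(explanation)
    print(flashcards_to_tsv(cards), end="")
```

Passing the models in as callables keeps the pipeline itself independent of which OCR, speech, or chatbot backend is chosen.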

TASKS NEEDED:
1. image to text (OCR)
  - GOT-OCR (716M parameters)
2. text to text (chatbot)
  - ChatGPT 4o
3. audio to text
  - Whisper
4. chatbot to explain
  - ChatGPT 4o
5. text to speech
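
The task list above could be kept as a small lookup table in code, so that tasks still missing a model are easy to spot. This is only a sketch: `TASK_BACKENDS` and `unassigned_tasks` are hypothetical names, the values are just the model names from the list, and text to speech is left as `None` because no model is named for it.

```python
from typing import Optional

# Task -> model name as listed in the notes; None means not chosen yet.
TASK_BACKENDS: dict[str, Optional[str]] = {
    "image to text (OCR)": "GOT-OCR",
    "text to text (chatbot)": "ChatGPT 4o",
    "audio to text": "Whisper",
    "chatbot to explain": "ChatGPT 4o",
    "text to speech": None,
}


def unassigned_tasks(backends: dict[str, Optional[str]]) -> list[str]:
    """Return the tasks that do not have a model assigned yet."""
    return [task for task, model in backends.items() if model is None]


if __name__ == "__main__":
    print(unassigned_tasks(TASK_BACKENDS))
```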