Podcast Summarizer


Disclaimer
- This talk and project are for educational purposes only.
- This project is not related to the company I work for.
MYSELF

Sr Data Scientist @

Contributed 61 PRs to Rust ecosystem
Highest starred repository: 2022-rustlings-solutions
Short Stories Writer

Python, Rust, ML, Data Engineering
Agenda
- Architecture
- Application Walk-through
- Rust elements
- Demo
Architecture

Why I Chose Rust?


[1] Async Runtime
Job Pusher

Worker Process
- Extract Content
- Parse Transcript
- Store it in Postgres
- Summarize the Transcript
- Store the Summary in Postgres
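
The job pusher and worker steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it swaps the async runtime for a plain `std::sync::mpsc` channel and one thread, and it keeps results in an in-memory `Vec` where the application uses Postgres. The names `Job`, `Record`, `fetch_transcript`, `summarize`, and `run_pipeline` are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical job: identifies the episode to summarize.
#[derive(Debug, Clone, PartialEq)]
struct Job {
    episode_id: String,
}

/// Hypothetical result row; a stand-in for what would be stored in Postgres.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    episode_id: String,
    transcript: String,
    summary: String,
}

/// Placeholder for the "extract content" and "parse transcript" steps.
fn fetch_transcript(job: &Job) -> String {
    format!("transcript of {}", job.episode_id)
}

/// Placeholder for the "summarize the transcript" step.
fn summarize(transcript: &str) -> String {
    format!("summary of [{}]", transcript)
}

/// Pushes one job per episode into a channel and lets a single worker
/// thread process them in the order they were sent.
fn run_pipeline(episode_ids: &[&str]) -> Vec<Record> {
    let (tx, rx) = mpsc::channel::<Job>();

    // Worker: receives jobs until every sender has been dropped.
    let worker = thread::spawn(move || {
        let mut store = Vec::new(); // in-memory stand-in for Postgres
        for job in rx {
            let transcript = fetch_transcript(&job);
            let summary = summarize(&transcript);
            store.push(Record {
                episode_id: job.episode_id,
                transcript,
                summary,
            });
        }
        store
    });

    // Job pusher: enqueue the work.
    for id in episode_ids {
        tx.send(Job { episode_id: id.to_string() })
            .expect("worker hung up");
    }
    drop(tx); // closing the channel lets the worker loop finish

    worker.join().expect("worker panicked")
}
```

Calling `run_pipeline(&["ep1", "ep2"])` returns one `Record` per episode. In an async setup the channel and thread would likely be replaced by the runtime's own task and channel types.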

Extract Transcript
- Transcript parsed using the XML library roxmltree
Demo on Transcript Extractor

- Split the text into subtexts based on tokens.
- Each subtext has a maximum of 2000 tokens.
- https://platform.openai.com/tokenizer
Split the Transcript into Subtexts
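
A rough sketch of the splitting step, assuming for illustration that a token is simply a whitespace-separated word. The real token counts come from the OpenAI tokenizer linked above, so the actual boundaries will differ. The function name `split_by_tokens` is hypothetical.

```rust
/// Splits `text` into subtexts of at most `max_tokens` tokens each.
/// Assumption: a "token" here is a whitespace-separated word, which only
/// approximates how the OpenAI tokenizer counts tokens.
fn split_by_tokens(text: &str, max_tokens: usize) -> Vec<String> {
    assert!(max_tokens > 0, "max_tokens must be positive");
    let words: Vec<&str> = text.split_whitespace().collect();
    words
        .chunks(max_tokens) // consecutive groups of up to `max_tokens` words
        .map(|chunk| chunk.join(" "))
        .collect()
}
```

With the limit from the slide, the call would be `split_by_tokens(&transcript, 2000)`. Each returned subtext can then be summarized on its own.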

ChatGPT - prompt + message
- Prompt message: You will summarize the text, that can be readable in one to two minutes. Treat every input I type as a big text and help to summarize
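
One way to pair the prompt with each subtext is sketched below. It assumes the prompt is sent as a `system` message and the subtext as a `user` message, which the slides do not confirm. `ChatMessage` and `build_messages` are hypothetical names, and the HTTP call to the API is left out.

```rust
/// The prompt message from the slide, copied verbatim.
const SUMMARY_PROMPT: &str = "You will summarize the text, that can be readable in one to two minutes. Treat every input I type as a big text and help to summarize";

/// Hypothetical in-memory form of one chat message (role plus content).
#[derive(Debug, Clone, PartialEq)]
struct ChatMessage {
    role: &'static str,
    content: String,
}

/// Builds the message list for one subtext: prompt first, then the text.
/// Assumption: the prompt uses the "system" role and the subtext the "user" role.
fn build_messages(subtext: &str) -> Vec<ChatMessage> {
    vec![
        ChatMessage {
            role: "system",
            content: SUMMARY_PROMPT.to_string(),
        },
        ChatMessage {
            role: "user",
            content: subtext.to_string(),
        },
    ]
}
```

Each subtext gets its own message list, so every request stays within the token limit chosen in the splitting step.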
Insert the transcript summaries into Postgres

- An API Server for interacting with the application
- UI for the application
- Generate binaries for raspberrypi
- Github CI/CD
More Features...
Demo on Summarizer

Any guesses on the size of the server?

Thank You!
Questions?


https://github.com/akhildevelops
Swecha - IIT Hyd
By Akhil G