Speech Condenser: A tool for summarizing dialogues from videos or audio
github.com
Here's how it works:
* Audio Extraction: First, it extracts the audio from the video.
* Speaker Diarization: It then identifies the different speakers in the audio.
* Split Audio: The audio is split into smaller chunks based on the identified speakers.
* Speech to Text: Each chunk is transcribed into text.
* Combine ASR and Diarization: The transcriptions (from Automatic Speech Recognition) are combined with the diarization results to provide a structured, text-based dialogue for each identified speaker.
* Summarization: Finally, the dialogue is condensed into a summary for a quick overview.
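To make the "Combine ASR and Diarization" step concrete, here is a minimal sketch of one common way to do it: give each transcribed segment the speaker whose diarization turn overlaps it the most, then merge consecutive segments from the same speaker. This is not taken from the project's code. The names `SpeakerTurn`, `AsrSegment`, and `build_dialogue` are hypothetical, and the sketch assumes both stages report start and end times in seconds.

```python
from dataclasses import dataclass


@dataclass
class SpeakerTurn:
    """One diarization result: who spoke and when (hypothetical shape)."""
    speaker: str
    start: float
    end: float


@dataclass
class AsrSegment:
    """One transcribed chunk with its timestamps (hypothetical shape)."""
    text: str
    start: float
    end: float


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(turns, segments):
    """Label each ASR segment with the speaker whose turn overlaps it most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((best_speaker, seg.text))
    return labeled


def merge_consecutive(labeled):
    """Join adjacent segments from the same speaker into one dialogue line."""
    dialogue = []
    for speaker, text in labeled:
        if dialogue and dialogue[-1][0] == speaker:
            dialogue[-1] = (speaker, dialogue[-1][1] + " " + text)
        else:
            dialogue.append((speaker, text))
    return dialogue


def build_dialogue(turns, segments):
    """Combine diarization turns and ASR segments into (speaker, text) lines."""
    return merge_consecutive(assign_speakers(turns, segments))
```

The resulting list of `(speaker, text)` pairs can be formatted as a transcript and handed to whatever summarizer is used. Segments that overlap no speaker turn are labeled `UNKNOWN` rather than dropped, so nothing transcribed is silently lost.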
The entire pipeline is containerized, so it can be run without installing each dependency by hand. I'd love to get feedback or suggestions.