Meeting Agent

A script to record, transcribe, and generate structured meeting notes using FFmpeg, PulseAudio and the Mistral AI API.

Overview

The Meeting Agent automates the path from a live meeting to written notes. It records audio from both the microphone and the speaker output, transcribes the recording with Mistral's speech-to-text API, and generates structured meeting notes in Markdown format.

Features

Audio Recording: Captures audio from both microphone and speaker outputs using PulseAudio
Transcription: Uses Mistral's voxtral-mini-latest model to transcribe audio
Meeting Notes: Generates structured meeting notes using Mistral's mistral-medium-latest model
Configuration: Saves API key and settings between runs
Flexible Input: Record live audio or process existing audio/video files
Terminal UI: Interactive TUI mode for easy configuration and workflow selection
Breeze Dark Theme: TUI styled with dark background, light text, blue accents, matching KDE Breeze Dark
Context Biasing: Improve transcription accuracy by providing domain-specific terminology

Requirements

System Requirements

Linux with PulseAudio (for audio recording)
Bash shell (for TUI input handling)
FFmpeg (for audio recording and processing)
Standard terminal emulator with 256-color support

R Runtime

R (version 4.0 or later)

Required R Packages

httr - HTTP requests for Mistral API
jsonlite - JSON configuration handling
rmarkdown - Markdown processing
processx - Process management for audio recording
argparse - Command line argument parsing
cli - Terminal UI styling and formatting (for TUI mode)

Installation

1. Clone or Download

Clone this repository or download the script files.

2. Install Required R Packages

Rscript -e "install.packages(c('httr', 'jsonlite', 'rmarkdown', 'processx', 'argparse', 'cli'), repos = 'https://cloud.r-project.org')"

3. Install System Dependencies

Ubuntu/Debian:

sudo apt-get install pulseaudio ffmpeg

Fedora/RHEL:

sudo dnf install pulseaudio ffmpeg

Arch Linux:

sudo pacman -S pulseaudio ffmpeg
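
After installing, it can help to confirm that the required tools are actually on the PATH before the first run. The following is a minimal sketch; the helper name `missing_tools` is illustrative and not part of the script, and checking for `pactl` assumes the PulseAudio command-line client is what you want to verify.

```bash
# Print the name of every listed command that is not found on PATH.
# Returns 0 when all commands are present, 1 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Example: missing_tools ffmpeg pactl Rscript
```

An empty result means every tool was found; any printed name still needs to be installed.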

Usage

Terminal User Interface (Recommended)

Launch the interactive TUI for guided setup and workflow selection:

Rscript meeting_agent.R

The TUI provides:

Main menu with workflow options (Record Audio, Use Existing File, Settings, Exit)
Settings menu for API key, template path, and output directory configuration
Real-time feedback with styled alerts (info, success, warning, danger)
Breeze Dark compatible color scheme

Command Line Mode

First Run (API key required)

Rscript meeting_agent.R --api-key YOUR_API_KEY --record

Subsequent Runs (uses saved config)

Rscript meeting_agent.R --record

Use Existing Audio/Video File

Rscript meeting_agent.R --file /path/to/audio.mp4

Custom Template and Output Directory

Rscript meeting_agent.R --template-path /path/to/meeting_blueprint.md --output-dir /path/to/output --record

Command Line Options

| Option | Description |
| --- | --- |
| `--api-key` | Mistral API key (required on first run) |
| `--tui` | Launch interactive Terminal User Interface |
| `--no-tui` | Force command-line mode (disables TUI even if stdin is a terminal) |
| `--record` | Record audio from microphone and speakers |
| `--file` | Use specified audio/video file instead of recording |
| `--template-path` | Path to markdown template file |
| `--output-dir` | Base path for output directories |
| `--context` | Path to context bias text file for improved transcription accuracy |
| `--no-diarize` | Disable speaker diarization (enabled by default) |
| `--timestamp-granularities` | Timestamp granularity: segment, word, or none (default: segment) |

TUI Configuration

In TUI mode, use the Settings menu to configure:

API Key: Your Mistral AI API key (required for all API calls)
Template Path: Path to your meeting notes template (Markdown format)
Output Directory: Base directory for output files (timestamped subdirectories are created per workflow)
Context Bias File: Path to a text file containing domain-specific terms to improve transcription accuracy

Configuration is saved to config.json and loaded automatically on subsequent runs.

Context Bias File

The context bias feature improves transcription accuracy by providing the Mistral transcription API with domain-specific terminology that is likely to appear in your meetings.

Creating a Context Bias File

Create a plain text file containing words and phrases that are specific to your domain. Put each entry on its own line, or separate entries with commas. Join the words of a multi-word entry with underscores (e.g., project_manager instead of project manager).

Example context bias file (bias.txt):

meeting,minutes,action_item,project_manager,scrum,sprint,retrospective,standup,dikaletus
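
If your terminology list already exists as ordinary phrases, one per line, it can be converted to the underscore-joined, comma-separated form shown above. This is a small sketch using standard tools; the function name `normalize_bias_terms` and the file name `terms.txt` are hypothetical.

```bash
# Read phrases (one per line) on stdin and print a single comma-separated
# line, with the whitespace inside each phrase replaced by underscores.
# Leading/trailing whitespace is trimmed and blank lines are dropped.
normalize_bias_terms() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' \
        -e 's/[[:space:]][[:space:]]*/_/g' |
        paste -sd, -
}

# Example: normalize_bias_terms < terms.txt > bias.txt
```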

Setting the Context Bias File

In TUI Mode:

Select "Settings" from the main menu
Select "Set Context Bias File"
Enter the path to your context bias file
The file path will be saved and used for all subsequent transcriptions

In Command Line Mode:

Rscript meeting_agent.R --context /path/to/bias.txt --record

What Context Bias Does

When a context bias file is configured, its contents are sent to the Mistral /audio/transcriptions API endpoint as the context_bias parameter. This helps the transcription model recognize and accurately transcribe domain-specific terms, proper nouns, technical jargon, and other terminology that might be uncommon in general speech but important for your use case. The model uses this information to bias its vocabulary selection during transcription.
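
For orientation, the request could look roughly like the sketch below. Only the endpoint path, the `context_bias` parameter name, and the model name come from this README. The base URL, the Bearer authorization header, the `MISTRAL_API_KEY` variable, and sending the terms as one comma-separated form field are assumptions, and the script itself issues the request from R with httr rather than curl.

```bash
# Sketch: send an audio file plus bias terms to the transcription endpoint.
# Assumptions (not confirmed by this README): base URL, Bearer auth, and
# context_bias passed as a single comma-separated form field.
# Set CURL=echo to print the request arguments instead of sending them.
transcribe_with_bias() {
    audio_file=$1
    bias_file=$2
    # Turn a one-term-per-line file into a comma-separated string.
    bias_terms=$(tr '\n' ',' < "$bias_file" | sed 's/,*$//')
    "${CURL:-curl}" -sS https://api.mistral.ai/v1/audio/transcriptions \
        -H "Authorization: Bearer ${MISTRAL_API_KEY}" \
        -F "model=voxtral-mini-latest" \
        -F "file=@${audio_file}" \
        --form-string "context_bias=${bias_terms}"
}

# Example (dry run): CURL=echo transcribe_with_bias recording.wav bias.txt
```

`--form-string` is used for the bias terms so that curl sends the value literally instead of interpreting special characters in it.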

Notes

The context bias file is optional. Without it, transcriptions will still work using the standard model vocabulary.
The file must exist at the specified path; if not found, a warning will be displayed but transcription will continue without context biasing.
Context bias phrases are joined with underscores internally, so multi-word entries should use underscores (e.g., action_item).

Diarization and Timestamp Granularities

The Mistral transcription API supports speaker diarization and configurable timestamp granularities to enhance your transcriptions.

Diarization

Speaker diarization identifies different speakers in the audio and attributes speech segments to each speaker in the output.

Default: Enabled (diarize = TRUE)
Purpose: Useful for meetings with multiple participants, interviews, or any multi-speaker audio
Output: When enabled, transcription output includes speaker labels

Timestamp Granularities

Controls the granularity of timestamps in the transcription output.

Default: segment - provides timestamps for each speech segment
Options:
- segment - timestamps for each detected speech segment
- word - timestamps for each word (most granular)
- none - no timestamps (plain text output)

Important Note on Output Format

When timestamp granularity is set to segment or word, the transcription API returns JSON instead of plain text. The JSON includes the timestamps and, if diarization is enabled, the speaker information. The script writes this JSON directly to the transcription file. If you need plain text output, set timestamp granularities to none.

Setting Diarization and Timestamp Granularities

In TUI Mode:

Select "Settings" from the main menu
Select "Set Diarization" to enable/disable speaker diarization
Select "Set Timestamp Granularities" to choose between segment, word, or none
Settings are saved and applied to all subsequent transcriptions

In Command Line Mode:

# Enable diarization (default)
Rscript meeting_agent.R --record

# Disable diarization
Rscript meeting_agent.R --no-diarize --record

# Set timestamp granularities
Rscript meeting_agent.R --timestamp-granularities word --record

# Disable timestamps (plain text output)
Rscript meeting_agent.R --timestamp-granularities none --record

Workflows

Record Audio Workflow

Select "Record Audio" from main menu
Audio capture is set up using PulseAudio
Recording starts automatically
Press Ctrl+C when finished recording
Audio is transcribed using Mistral API
Meeting notes are generated and saved
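
The capture step above can be approximated by hand with FFmpeg's PulseAudio input. The sketch below is not the command the script runs, which this README does not show. It assumes the microphone is the `default` PulseAudio source, that the speaker output is reachable as the default sink's `.monitor` source, and that `pactl get-default-sink` is available (PulseAudio 15 or later). The function name `record_meeting` is hypothetical.

```bash
# Sketch: record the default microphone and the monitor of the default
# output sink, mixed into one audio file. Press Ctrl+C to stop; FFmpeg
# finalizes the file on interrupt.
# Set FFMPEG=echo to print the arguments instead of recording.
record_meeting() {
    out_file=$1
    # Optional second argument overrides the monitor source name.
    monitor=${2:-$(pactl get-default-sink).monitor}
    "${FFMPEG:-ffmpeg}" -f pulse -i default \
        -f pulse -i "$monitor" \
        -filter_complex "amix=inputs=2:duration=longest" \
        "$out_file"
}

# Example: record_meeting recording.wav
```

Mixing both inputs with `amix` keeps remote participants (speaker output) and the local speaker (microphone) in a single track, which is what a transcription service needs.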

Use Existing File Workflow

Select "Use Existing File" from main menu
Enter path to your audio/video file
File is transcribed using Mistral API
Meeting notes are generated and saved

Output

Each workflow run creates a new timestamped directory (format: YYYY-MM-DD_HH-MM-SS) under the configured output directory, or under the current working directory if none is configured. Output files are also named with the date and timestamp for easy identification:

recording.wav: Audio recording (if recorded live)
transcription.txt: Raw transcription
meeting_notes.md: Structured meeting notes in Markdown format
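
To illustrate the directory naming, the sketch below creates a directory in the same YYYY-MM-DD_HH-MM-SS format. It is an illustration only; the script creates its own directories, and the function name `make_run_dir` is hypothetical.

```bash
# Create a run directory named YYYY-MM-DD_HH-MM-SS under the given base
# directory (default: current working directory) and print its path.
make_run_dir() {
    base_dir=${1:-.}
    run_dir="$base_dir/$(date +%Y-%m-%d_%H-%M-%S)"
    mkdir -p "$run_dir" && printf '%s\n' "$run_dir"
}

# Example: make_run_dir /path/to/output
```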

Configuration

On first run, the script creates a config.json file to store your API key and settings. This file is automatically loaded on subsequent runs.

Template Format

The meeting notes template should be a Markdown file with section headers. The AI will fill in content following the template structure. A default template meeting_blueprint.md is expected in the current directory.
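
If you do not have a template yet, a starting point can be generated as sketched below. The section names are placeholders chosen for illustration, not the contents of the project's own meeting_blueprint.md, and the function name `write_meeting_template` is hypothetical.

```bash
# Write a skeleton meeting notes template to the given path
# (default: meeting_blueprint.md in the current directory).
# The section headers are illustrative placeholders.
write_meeting_template() {
    cat > "${1:-meeting_blueprint.md}" <<'EOF'
# Meeting Notes

## Attendees

## Agenda

## Discussion

## Decisions

## Action Items
EOF
}

# Example: write_meeting_template /path/to/meeting_blueprint.md
```

Pass the resulting file with `--template-path`, or edit the headers to match the structure you want the notes to follow.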

License

This project is open-source and available for use under the GPLv3 License.