Go client library for Gradium AI Text-to-Speech (TTS) and Speech-to-Text (STT) WebSocket APIs.
Installation
go get github.com/cydanix/go-gradium
Usage
Text-to-Speech (TTS)
Convert text to speech audio (PCM format, base64-encoded):

    package main

    import (
    	"os"

    	gradium "github.com/cydanix/go-gradium"
    	"github.com/cydanix/go-gradium/tts"
    	"go.uber.org/zap"
    )

    func main() {
    	log, _ := zap.NewProduction()

    	t := gradium.NewTTS(&tts.TTSConfig{
    		Endpoint: "wss://us.api.gradium.ai/api/speech/tts",
    		VoiceID:  "LFZvm12tW_z0xfGo",
    		ApiKey:   os.Getenv("GRADIUM_API_KEY"),
    		Log:      log,
    	})

    	if err := t.Start(); err != nil {
    		log.Fatal("failed to start TTS", zap.Error(err))
    	}
    	defer t.Shutdown()

    	// Send text to convert
    	t.Process("Hello, world!")

    	// Retrieve generated audio chunks (base64-encoded PCM)
    	audioChunks := t.GetSpeech(100)
    	for _, chunk := range audioChunks {
    		// Process audio chunk...
    		_ = chunk
    	}
    }

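The chunks returned above are base64 text, so they usually need decoding before playback or storage. The following is a minimal sketch, not part of go-gradium: it assumes `GetSpeech` yields a slice of standard-alphabet base64 strings and that the audio is mono (the channel count is not stated in this README). `decodeChunks` and `wavBytes` are hypothetical helper names; the WAV header layout is the standard 44-byte RIFF header for 16-bit PCM.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"log"
)

// decodeChunks base64-decodes each chunk and concatenates the raw PCM bytes.
// Assumption: chunks use the standard base64 alphabet with padding.
func decodeChunks(chunks []string) ([]byte, error) {
	var pcm []byte
	for i, c := range chunks {
		b, err := base64.StdEncoding.DecodeString(c)
		if err != nil {
			return nil, fmt.Errorf("chunk %d: %w", i, err)
		}
		pcm = append(pcm, b...)
	}
	return pcm, nil
}

// wavBytes prepends a 44-byte RIFF/WAVE header describing 16-bit PCM data.
func wavBytes(pcm []byte, sampleRate, channels int) []byte {
	const bitsPerSample = 16
	blockAlign := channels * bitsPerSample / 8
	byteRate := sampleRate * blockAlign

	var buf bytes.Buffer
	le := binary.LittleEndian
	buf.WriteString("RIFF")
	binary.Write(&buf, le, uint32(36+len(pcm))) // size of everything after this field
	buf.WriteString("WAVE")
	buf.WriteString("fmt ")
	binary.Write(&buf, le, uint32(16)) // fmt chunk size
	binary.Write(&buf, le, uint16(1))  // audio format 1 = PCM
	binary.Write(&buf, le, uint16(channels))
	binary.Write(&buf, le, uint32(sampleRate))
	binary.Write(&buf, le, uint32(byteRate))
	binary.Write(&buf, le, uint16(blockAlign))
	binary.Write(&buf, le, uint16(bitsPerSample))
	buf.WriteString("data")
	binary.Write(&buf, le, uint32(len(pcm)))
	buf.Write(pcm)
	return buf.Bytes()
}

func main() {
	// Stand-in for the result of t.GetSpeech(...): two tiny base64 chunks.
	chunks := []string{
		base64.StdEncoding.EncodeToString([]byte{0x00, 0x00, 0xff, 0x7f}),
		base64.StdEncoding.EncodeToString([]byte{0x00, 0x80}),
	}

	pcm, err := decodeChunks(chunks)
	if err != nil {
		log.Fatal(err)
	}

	// 48000 Hz matches the TTS output rate listed under Audio Format;
	// 1 channel (mono) is an assumption.
	wav := wavBytes(pcm, 48000, 1)
	fmt.Printf("%d PCM bytes, %d WAV bytes\n", len(pcm), len(wav))
}
```

Passing the result of `wavBytes` to `os.WriteFile` would produce a file that common audio players can open, provided the mono assumption holds.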
Speech-to-Text (STT)
Transcribe audio (PCM format, base64-encoded) to text:
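
    package main

    import (
    	"encoding/base64"
    	"os"

    	gradium "github.com/cydanix/go-gradium"
    	"github.com/cydanix/go-gradium/stt"
    	"go.uber.org/zap"
    )

    func main() {
    	log, _ := zap.NewProduction()

    	// Load raw PCM audio (24kHz, 16-bit signed little-endian) and base64-encode it.
    	// "audio.pcm" is a placeholder path; substitute your own audio source.
    	pcm, err := os.ReadFile("audio.pcm")
    	if err != nil {
    		log.Fatal("failed to read audio", zap.Error(err))
    	}
    	audioBase64 := base64.StdEncoding.EncodeToString(pcm)

    	s := gradium.NewSTT(&stt.STTConfig{
    		Endpoint: "wss://us.api.gradium.ai/api/speech/asr",
    		ApiKey:   os.Getenv("GRADIUM_API_KEY"),
    		Log:      log,
    	})

    	if err := s.Start(); err != nil {
    		log.Fatal("failed to start STT", zap.Error(err))
    	}
    	defer s.Shutdown()

    	// Send audio data (base64-encoded PCM, 24kHz sample rate)
    	s.Process(audioBase64)

    	// Retrieve transcribed text
    	texts := s.GetText(100)
    	for _, text := range texts {
    		// Process transcribed text...
    		_ = text
    	}
    }
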
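When the audio starts out as in-memory samples rather than a file, it has to be serialized into the format STT expects before it is sent. The sketch below shows one way to do that, assuming `Process` accepts a standard base64 string as the example above suggests. `encodePCM16` and `sine` are hypothetical helpers, and the 440 Hz test tone is only a placeholder signal, not meaningful speech.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"math"
)

// encodePCM16 serializes samples as 16-bit signed little-endian PCM
// and returns the bytes as a base64 string.
func encodePCM16(samples []int16) string {
	raw := make([]byte, 2*len(samples))
	for i, s := range samples {
		binary.LittleEndian.PutUint16(raw[2*i:], uint16(s))
	}
	return base64.StdEncoding.EncodeToString(raw)
}

// sine generates n samples of a half-amplitude sine tone at the given rate.
func sine(freqHz float64, sampleRate, n int) []int16 {
	out := make([]int16, n)
	for i := range out {
		phase := 2 * math.Pi * freqHz * float64(i) / float64(sampleRate)
		out[i] = int16(0.5 * math.MaxInt16 * math.Sin(phase))
	}
	return out
}

func main() {
	const sampleRate = 24000 // STT input rate listed under Audio Format

	// 100 ms of audio: 2400 samples at 24 kHz.
	tone := sine(440, sampleRate, sampleRate/10)
	audioBase64 := encodePCM16(tone)

	// audioBase64 is what would be handed to s.Process(...).
	fmt.Println("base64 length:", len(audioBase64))
}
```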
TTS + STT Pipeline
See tts_stt_test.go for a complete example of piping TTS output through STT for transcription.
Configuration
TTSConfig
| Field | Description |
|---|---|
| Endpoint | WebSocket endpoint URL |
| VoiceID | Voice identifier for speech synthesis |
| ApiKey | Gradium API key |
| Log | zap.Logger instance |
STTConfig
| Field | Description |
|---|---|
| Endpoint | WebSocket endpoint URL |
| ApiKey | Gradium API key |
| Log | zap.Logger instance |
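An empty API key or a mistyped endpoint would otherwise only surface when `Start` fails, so a small pre-flight check can give a clearer error. The sketch below is an assumption-laden illustration: `checkConfig` is a hypothetical helper that is not part of go-gradium, and it works on plain strings rather than the library's config structs.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"os"
)

// checkConfig is a hypothetical pre-flight check for the Endpoint and ApiKey
// fields. It verifies only that the key is non-empty and that the endpoint
// parses as a ws:// or wss:// URL with a host.
func checkConfig(endpoint, apiKey string) error {
	if apiKey == "" {
		return errors.New("API key is empty")
	}
	u, err := url.Parse(endpoint)
	if err != nil {
		return fmt.Errorf("invalid endpoint: %w", err)
	}
	if u.Scheme != "ws" && u.Scheme != "wss" {
		return fmt.Errorf("endpoint scheme %q is not ws or wss", u.Scheme)
	}
	if u.Host == "" {
		return errors.New("endpoint has no host")
	}
	return nil
}

func main() {
	endpoint := "wss://us.api.gradium.ai/api/speech/tts"
	if err := checkConfig(endpoint, os.Getenv("GRADIUM_API_KEY")); err != nil {
		fmt.Println("config problem:", err)
		return
	}
	fmt.Println("config looks usable")
}
```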
Audio Format
- TTS output: PCM audio, 48kHz sample rate, 16-bit signed little-endian, base64-encoded
- STT input: PCM audio, 24kHz sample rate, 16-bit signed little-endian, base64-encoded
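The two rates above differ by a factor of two, so TTS output fed into STT would presumably need a rate conversion unless the library or service already handles it (this README does not say). The sketch below is a deliberately crude 2:1 downsampler that averages adjacent sample pairs. It is not taken from go-gradium, it assumes mono audio, and a production resampler would use a proper low-pass filter. `decodePCM16`, `encodePCM16`, and `halveRate` are hypothetical helper names.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"log"
)

// decodePCM16 turns base64 text into 16-bit signed little-endian samples.
func decodePCM16(b64 string) ([]int16, error) {
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return nil, err
	}
	if len(raw)%2 != 0 {
		return nil, fmt.Errorf("odd byte count %d, expected 16-bit samples", len(raw))
	}
	samples := make([]int16, len(raw)/2)
	for i := range samples {
		samples[i] = int16(binary.LittleEndian.Uint16(raw[2*i:]))
	}
	return samples, nil
}

// encodePCM16 is the inverse of decodePCM16.
func encodePCM16(samples []int16) string {
	raw := make([]byte, 2*len(samples))
	for i, s := range samples {
		binary.LittleEndian.PutUint16(raw[2*i:], uint16(s))
	}
	return base64.StdEncoding.EncodeToString(raw)
}

// halveRate converts e.g. 48 kHz audio to 24 kHz by averaging each pair of
// adjacent samples. A trailing unpaired sample is dropped.
func halveRate(in []int16) []int16 {
	out := make([]int16, len(in)/2)
	for i := range out {
		sum := int32(in[2*i]) + int32(in[2*i+1])
		out[i] = int16(sum / 2)
	}
	return out
}

func main() {
	// Stand-in for one base64 chunk of 48 kHz TTS output.
	ttsChunk := encodePCM16([]int16{100, 200, -100, -300, 1000, 3000})

	samples48k, err := decodePCM16(ttsChunk)
	if err != nil {
		log.Fatal(err)
	}

	samples24k := halveRate(samples48k)
	sttInput := encodePCM16(samples24k)

	fmt.Println(len(samples48k), "samples in,", len(samples24k), "samples out")
	_ = sttInput // this string would be passed to the STT side
}
```

If chunks can end on an odd sample or byte boundary, decoding and concatenating all chunks before downsampling avoids losing the unpaired remainder at each boundary.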
Running Tests
Set your Gradium API key and run tests:

    export GRADIUM_API_KEY="your-api-key"
    go test -v ./...

Run a specific test:

    export GRADIUM_API_KEY="your-api-key"
    go test -v -run TestTTS
    go test -v -run TestSTT
    go test -v -run TestTTSAndSTT

License
MIT