GitHub - cydanix/go-gradium: Golang client library for Gradium.AI TTS/STT API

Go client library for Gradium AI Text-to-Speech (TTS) and Speech-to-Text (STT) WebSocket APIs.

Installation

go get github.com/cydanix/go-gradium

Usage

Text-to-Speech (TTS)

Convert text to speech audio (PCM format, base64-encoded):

package main

import (
	"os"

	gradium "github.com/cydanix/go-gradium"
	"github.com/cydanix/go-gradium/tts"
	"go.uber.org/zap"
)

func main() {
	log, err := zap.NewProduction()
	if err != nil {
		panic(err)
	}

	t := gradium.NewTTS(&tts.TTSConfig{
		Endpoint: "wss://us.api.gradium.ai/api/speech/tts",
		VoiceID:  "LFZvm12tW_z0xfGo",
		ApiKey:   os.Getenv("GRADIUM_API_KEY"),
		Log:      log,
	})

	if err := t.Start(); err != nil {
		log.Fatal("failed to start TTS", zap.Error(err))
	}
	defer t.Shutdown()

	// Send text to convert
	t.Process("Hello, world!")

	// Retrieve generated audio chunks (base64-encoded PCM)
	audioChunks := t.GetSpeech(100)
	for _, chunk := range audioChunks {
		// Process audio chunk...
		_ = chunk
	}
}
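
The chunks returned by GetSpeech are described above as base64-encoded PCM. As a minimal sketch of turning one chunk into samples, assuming each chunk is a string in standard (padded) base64 and holds whole 16-bit little-endian samples, something like the following could work. The helper name decodePCM16 is hypothetical and not part of go-gradium.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
)

// decodePCM16 is a hypothetical helper (not part of go-gradium).
// It converts one base64-encoded chunk of 16-bit signed
// little-endian PCM into a slice of samples.
func decodePCM16(chunk string) ([]int16, error) {
	raw, err := base64.StdEncoding.DecodeString(chunk)
	if err != nil {
		return nil, fmt.Errorf("decode base64: %w", err)
	}
	if len(raw)%2 != 0 {
		return nil, fmt.Errorf("odd byte count %d: not whole 16-bit samples", len(raw))
	}
	samples := make([]int16, len(raw)/2)
	for i := range samples {
		// Each sample is two bytes, low byte first.
		samples[i] = int16(binary.LittleEndian.Uint16(raw[2*i:]))
	}
	return samples, nil
}

func main() {
	// "AQD//w==" is the base64 form of the bytes 01 00 ff ff.
	samples, err := decodePCM16("AQD//w==")
	if err != nil {
		panic(err)
	}
	fmt.Println(samples) // prints [1 -1]
}
```

If chunks can end in the middle of a sample, a real consumer would need to carry the leftover byte over to the next chunk instead of returning an error.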

Speech-to-Text (STT)

Transcribe audio (PCM format, base64-encoded) to text:

package main

import (
	"os"

	gradium "github.com/cydanix/go-gradium"
	"github.com/cydanix/go-gradium/stt"
	"go.uber.org/zap"
)

func main() {
	log, err := zap.NewProduction()
	if err != nil {
		panic(err)
	}

	s := gradium.NewSTT(&stt.STTConfig{
		Endpoint: "wss://us.api.gradium.ai/api/speech/asr",
		ApiKey:   os.Getenv("GRADIUM_API_KEY"),
		Log:      log,
	})

	if err := s.Start(); err != nil {
		log.Fatal("failed to start STT", zap.Error(err))
	}
	defer s.Shutdown()

	// Send audio data (base64-encoded PCM, 24kHz sample rate).
	// audioBase64 is a placeholder: fill it with your own encoded audio.
	var audioBase64 string
	s.Process(audioBase64)

	// Retrieve transcribed text
	texts := s.GetText(100)
	for _, text := range texts {
		// Process transcribed text...
		_ = text
	}
}
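
The example above expects the audio already in base64 form. As a rough sketch of producing that string from raw samples, assuming the input is 16-bit signed PCM already at 24kHz and that standard (padded) base64 is what the API accepts, one could write the following. The helper name encodePCM16 is hypothetical and not part of go-gradium.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
)

// encodePCM16 is a hypothetical helper (not part of go-gradium).
// It packs samples as 16-bit signed little-endian bytes and
// returns them base64-encoded.
func encodePCM16(samples []int16) string {
	raw := make([]byte, 2*len(samples))
	for i, s := range samples {
		// Low byte first, matching the little-endian format.
		binary.LittleEndian.PutUint16(raw[2*i:], uint16(s))
	}
	return base64.StdEncoding.EncodeToString(raw)
}

func main() {
	fmt.Println(encodePCM16([]int16{1, -1})) // prints AQD//w==
}
```

The resulting string is the kind of value that would be passed to s.Process in the example above.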

TTS + STT Pipeline

See tts_stt_test.go for a complete example of piping TTS output through STT for transcription.
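
The Audio Format section lists 48kHz for TTS output and 24kHz for STT input, so piping one into the other implies halving the sample rate somewhere. How tts_stt_test.go handles this is not shown here. The sketch below is one simple possibility, assuming single-channel audio and standard base64: it averages each pair of adjacent samples. The function name halveSampleRate is hypothetical.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
)

// halveSampleRate is a hypothetical helper (not part of go-gradium).
// It takes base64-encoded 16-bit little-endian PCM, averages each
// pair of adjacent samples, and returns the result base64-encoded.
// Assumes mono audio; a trailing unpaired sample is dropped.
func halveSampleRate(chunk string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(chunk)
	if err != nil {
		return "", fmt.Errorf("decode base64: %w", err)
	}
	pairs := len(raw) / 4 // two 2-byte samples per output sample
	out := make([]byte, 2*pairs)
	for i := 0; i < pairs; i++ {
		a := int16(binary.LittleEndian.Uint16(raw[4*i:]))
		b := int16(binary.LittleEndian.Uint16(raw[4*i+2:]))
		// Sum in int32 so the addition cannot overflow int16.
		avg := int16((int32(a) + int32(b)) / 2)
		binary.LittleEndian.PutUint16(out[2*i:], uint16(avg))
	}
	return base64.StdEncoding.EncodeToString(out), nil
}

func main() {
	// "AAACAAoAFAA=" encodes the samples 0, 2, 10, 20.
	out, err := halveSampleRate("AAACAAoAFAA=")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints AQAPAA== (samples 1, 15)
}
```

Pair averaging is only a crude low-pass filter. For better quality, a proper resampler would be the safer choice, and a streaming version would need to carry an unpaired sample over to the next chunk.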

Configuration

TTSConfig

| Field    | Description                           |
|----------|---------------------------------------|
| Endpoint | WebSocket endpoint URL                |
| VoiceID  | Voice identifier for speech synthesis |
| ApiKey   | Gradium API key                       |
| Log      | zap.Logger instance                   |

STTConfig

| Field    | Description            |
|----------|------------------------|
| Endpoint | WebSocket endpoint URL |
| ApiKey   | Gradium API key        |
| Log      | zap.Logger instance    |
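
Because both configs take a WebSocket URL and an API key as plain strings, it can help to check them before calling Start. The library may well perform its own validation. The sketch below is an independent, hypothetical pre-flight check (checkConfig is not part of go-gradium) that only uses the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// checkConfig is a hypothetical pre-flight check (not part of
// go-gradium). It verifies that the endpoint is a ws:// or wss://
// URL with a host, and that the API key is not empty.
func checkConfig(endpoint, apiKey string) error {
	if apiKey == "" {
		return errors.New("API key is empty")
	}
	u, err := url.Parse(endpoint)
	if err != nil {
		return fmt.Errorf("parse endpoint: %w", err)
	}
	if u.Scheme != "ws" && u.Scheme != "wss" {
		return fmt.Errorf("endpoint scheme %q is not ws or wss", u.Scheme)
	}
	if u.Host == "" {
		return fmt.Errorf("endpoint %q has no host", endpoint)
	}
	return nil
}

func main() {
	// A well-formed endpoint with a non-empty key passes.
	fmt.Println(checkConfig("wss://us.api.gradium.ai/api/speech/tts", "example-key"))
	// An https:// URL is rejected because it is not a WebSocket scheme.
	fmt.Println(checkConfig("https://us.api.gradium.ai/api/speech/tts", "example-key"))
}
```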

Audio Format

  • TTS output: PCM audio, 48kHz sample rate, 16-bit signed little-endian, base64-encoded
  • STT input: PCM audio, 24kHz sample rate, 16-bit signed little-endian, base64-encoded
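
Raw PCM carries no header, so most players cannot open it directly. As an illustration of the format listed above, the sketch below builds the standard 44-byte WAV header for 16-bit PCM. The channel count is not stated in this README, so the example passes 1 (mono) as an assumption. The function name wavHeader is hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// wavHeader is a hypothetical helper (not part of go-gradium).
// It builds the 44-byte RIFF/WAVE header for uncompressed 16-bit
// PCM. dataLen is the size of the raw PCM payload in bytes.
func wavHeader(sampleRate, channels, dataLen uint32) []byte {
	const bitsPerSample = 16
	blockAlign := channels * bitsPerSample / 8 // bytes per sample frame
	byteRate := sampleRate * blockAlign        // bytes per second

	var buf bytes.Buffer
	le := binary.LittleEndian
	buf.WriteString("RIFF")
	binary.Write(&buf, le, uint32(36+dataLen)) // file size minus 8
	buf.WriteString("WAVEfmt ")
	binary.Write(&buf, le, uint32(16)) // size of the fmt chunk
	binary.Write(&buf, le, uint16(1))  // format tag 1 = PCM
	binary.Write(&buf, le, uint16(channels))
	binary.Write(&buf, le, sampleRate)
	binary.Write(&buf, le, byteRate)
	binary.Write(&buf, le, uint16(blockAlign))
	binary.Write(&buf, le, uint16(bitsPerSample))
	buf.WriteString("data")
	binary.Write(&buf, le, dataLen)
	return buf.Bytes()
}

func main() {
	// One second of 48kHz mono 16-bit audio is 96000 bytes.
	h := wavHeader(48000, 1, 96000)
	fmt.Println(len(h)) // prints 44
}
```

Writing this header followed by the decoded (not base64) PCM bytes to a file should give a playable .wav, provided the mono assumption holds.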

Running Tests

Set your Gradium API key and run tests:

export GRADIUM_API_KEY="your-api-key"
go test -v ./...

Run a specific test:

export GRADIUM_API_KEY="your-api-key"
go test -v -run TestTTS
go test -v -run TestSTT
go test -v -run TestTTSAndSTT

License

MIT