translator-ai
Fast and efficient JSON i18n translator supporting multiple AI providers (Google Gemini, OpenAI & Ollama/DeepSeek) with intelligent caching, multi-file deduplication, and MCP integration.
Features
- Multiple AI Providers: Choose between Google Gemini, OpenAI (cloud) or Ollama/DeepSeek (local) for translations
- Multi-File Support: Process multiple files with automatic deduplication to save API calls
- Incremental Caching: Only translates new or modified strings, dramatically reducing API calls
- Batch Processing: Intelligently batches translations for optimal performance
- Path Preservation: Maintains exact JSON structure including nested objects and arrays
- Cross-Platform: Works on Windows, macOS, and Linux with automatic cache directory detection
- Developer Friendly: Built-in performance statistics and progress indicators
- Cost Effective: Minimizes API usage through smart caching and deduplication
- Language Detection: Automatically detect source language instead of assuming English
- Multiple Target Languages: Translate to multiple languages in a single command
- Translation Metadata: Optionally include translation details in output files for tracking
- Dry Run Mode: Preview what would be translated without making API calls
- Format Preservation: Maintains URLs, emails, dates, numbers, and template variables unchanged
Installation
Global Installation (Recommended)
npm install -g translator-ai
Local Installation
npm install translator-ai
Configuration
Option 1: Google Gemini API (Cloud)
Create a .env file in your project root or set the environment variable:
GEMINI_API_KEY=your_gemini_api_key_here
Get your API key from Google AI Studio.
Option 2: OpenAI API (Cloud)
Create a .env file in your project root or set the environment variable:
OPENAI_API_KEY=your_openai_api_key_here
Get your API key from OpenAI Platform.
Option 3: Ollama with DeepSeek-R1 (Local)
For completely local translation without API costs:
- Install Ollama
- Pull the DeepSeek-R1 model:
ollama pull deepseek-r1:latest
- Use the
--provider ollamaflag:translator-ai source.json -l es -o spanish.json --provider ollama
Usage
Basic Usage
# Translate a single file translator-ai source.json -l es -o spanish.json # Translate multiple files with deduplication translator-ai src/locales/en/*.json -l es -o "{dir}/{name}.{lang}.json" # Use glob patterns translator-ai "src/**/*.en.json" -l fr -o "{dir}/{name}.fr.json"
Command Line Options
translator-ai <inputFiles...> [options]
Arguments:
inputFiles Path(s) to source JSON file(s) or glob patterns
Options:
-l, --lang <langCodes> Target language code(s), comma-separated for multiple
-o, --output <pattern> Output file path or pattern
--stdout Output to stdout instead of file
--stats Show detailed performance statistics
--no-cache Disable incremental translation cache
--cache-file <path> Custom cache file path
--provider <type> Translation provider: gemini, openai, or ollama (default: gemini)
--ollama-url <url> Ollama API URL (default: http://localhost:11434)
--ollama-model <model> Ollama model name (default: deepseek-r1:latest)
--gemini-model <model> Gemini model name (default: gemini-2.0-flash-lite)
--openai-model <model> OpenAI model name (default: gpt-4o-mini)
--list-providers List available translation providers
--verbose Enable verbose output for debugging
--detect-source Auto-detect source language instead of assuming English
--dry-run Preview what would be translated without making API calls
--preserve-formats Preserve URLs, emails, numbers, dates, and other formats
--metadata Add translation metadata to output files (may break some i18n parsers)
--sort-keys Sort output JSON keys alphabetically
--check-keys Verify all source keys exist in output (exit with error if keys are missing)
-h, --help Display help
-V, --version Display version
Output Pattern Variables (for multiple files):
{dir} - Original directory path
{name} - Original filename without extension
{lang} - Target language code
Examples
Translate a single file
translator-ai en.json -l es -o es.json
Translate multiple files with pattern
# All JSON files in a directory translator-ai locales/en/*.json -l es -o "locales/es/{name}.json" # Recursive glob pattern translator-ai "src/**/en.json" -l fr -o "{dir}/fr.json" # Multiple specific files translator-ai file1.json file2.json file3.json -l de -o "{name}.de.json"
Translate with deduplication savings
# Shows statistics including how many API calls were saved translator-ai src/i18n/*.json -l ja -o "{dir}/{name}.{lang}.json" --stats
Output to stdout (useful for piping)
translator-ai en.json -l de --stdout > de.jsonParse output with jq
translator-ai en.json -l de --stdout | jqDisable caching for fresh translation
translator-ai en.json -l ja -o ja.json --no-cache
Use custom cache location
translator-ai en.json -l ko -o ko.json --cache-file /path/to/cache.json
Use Ollama for local translation
# Basic usage with Ollama translator-ai en.json -l es -o es.json --provider ollama # Use a different Ollama model translator-ai en.json -l fr -o fr.json --provider ollama --ollama-model llama2:latest # Connect to remote Ollama instance translator-ai en.json -l de -o de.json --provider ollama --ollama-url http://192.168.1.100:11434 # Check available providers translator-ai --list-providers
Advanced Features
# Detect source language automatically translator-ai content.json -l es -o spanish.json --detect-source # Translate to multiple languages at once translator-ai en.json -l es,fr,de,ja -o translations/{lang}.json # Dry run - see what would be translated without making API calls translator-ai en.json -l es -o es.json --dry-run # Preserve formats (URLs, emails, dates, numbers, template variables) translator-ai app.json -l fr -o app-fr.json --preserve-formats # Include translation metadata (disabled by default to ensure compatibility) translator-ai en.json -l fr -o fr.json --metadata # Sort keys alphabetically for consistent output translator-ai en.json -l fr -o fr.json --sort-keys # Verify all keys are present in the translation translator-ai en.json -l fr -o fr.json --check-keys # Use a different Gemini model translator-ai en.json -l es -o es.json --gemini-model gemini-2.5-flash # Combine features translator-ai src/**/*.json -l es,fr,de -o "{dir}/{name}.{lang}.json" \ --detect-source --preserve-formats --stats --check-keys
Available Gemini Models
The --gemini-model option allows you to choose from various Gemini models. Popular options include:
-
gemini-2.0-flash-lite(default) - Fast and efficient for most translations -
gemini-2.5-flash- Enhanced performance with newer capabilities -
gemini-pro- More sophisticated understanding for complex translations -
gemini-1.5-pro- Previous generation pro model -
gemini-1.5-flash- Previous generation fast model
Example usage:
# Use the latest flash model translator-ai en.json -l es -o es.json --gemini-model gemini-2.5-flash # Use the default lightweight model translator-ai en.json -l fr -o fr.json --gemini-model gemini-2.0-flash-lite
Available OpenAI Models
The --openai-model option allows you to choose from various OpenAI models. Popular options include:
-
gpt-4o-mini(default) - Cost-effective and fast for most translations -
gpt-4o- Most capable model with advanced understanding -
gpt-4-turbo- Previous generation flagship model -
gpt-3.5-turbo- Fast and efficient for simpler translations
Example usage:
# Use OpenAI with the default model translator-ai en.json -l es -o es.json --provider openai # Use GPT-4o for complex translations translator-ai en.json -l ja -o ja.json --provider openai --openai-model gpt-4o # Use GPT-3.5-turbo for faster, simpler translations translator-ai en.json -l fr -o fr.json --provider openai --openai-model gpt-3.5-turbo
Translation Metadata
When enabled with the --metadata flag, translator-ai adds metadata to help track translations:
{
"_translator_metadata": {
"tool": "translator-ai v1.1.0",
"repository": "https://github.com/DatanoiseTV/translator-ai",
"provider": "Google Gemini",
"source_language": "English",
"target_language": "fr",
"timestamp": "2025-06-20T12:34:56.789Z",
"total_strings": 42,
"source_file": "en.json"
},
"greeting": "Bonjour",
"farewell": "Au revoir"
}Metadata is disabled by default to ensure compatibility with i18n parsers. Use --metadata to enable it.
Key Sorting
Use the --sort-keys flag to sort all JSON keys alphabetically in the output:
translator-ai en.json -l es -o es.json --sort-keys
This ensures consistent ordering across translations and makes diffs cleaner. Keys are sorted:
- Case-insensitively (a, B, c, not B, a, c)
- Recursively through all nested objects
- Arrays maintain their element order
Key Verification
Use the --check-keys flag to ensure translation completeness:
translator-ai en.json -l es -o es.json --check-keys
This feature:
- Verifies all source keys exist in the translated output
- Reports any missing keys with their full paths
- Exits with error code 1 if any keys are missing
- Helps catch translation API failures or formatting issues
- Ignores metadata keys when checking
Supported Language Codes
It should support any standardized language codes.
How It Works
- Parsing: Reads and flattens your JSON structure into paths
- Deduplication: When processing multiple files, identifies shared strings
- Caching: Checks cache for previously translated strings
- Diffing: Identifies new or modified strings needing translation
- Batching: Groups unique strings into optimal batch sizes for API efficiency
- Translation: Sends batches to selected provider (Gemini API or local Ollama)
- Reconstruction: Rebuilds the exact JSON structure with translations
- Caching: Updates cache with new translations for future use
Multi-File Deduplication
When translating multiple files, translator-ai automatically:
- Identifies duplicate strings across files
- Translates each unique string only once
- Applies the same translation consistently across all files
- Saves significant API calls and ensures consistency
Example: If 10 files share 50% of their strings, you save ~50% on API calls!
Cache Management
Default Cache Locations
-
Windows:
%APPDATA%\translator-ai\translation-cache.json -
macOS:
~/Library/Caches/translator-ai/translation-cache.json -
Linux:
~/.cache/translator-ai/translation-cache.json
The cache file stores translations indexed by:
- Source file path
- Target language
- SHA-256 hash of source string
This ensures that:
- Modified strings are retranslated
- Removed strings are pruned from cache
- Multiple projects can share the same cache without conflicts
Provider Comparison
Google Gemini
- Pros: Fast, accurate, handles large batches efficiently
- Cons: Requires API key, has usage costs
-
Available Models:
-
gemini-2.0-flash-lite(default) - Fastest, most cost-effective -
gemini-pro- Balanced performance -
gemini-1.5-pro- Advanced capabilities -
gemini-1.5-flash- Fast with good quality
-
- Best for: Production use, large projects, when accuracy is critical
Ollama (Local)
- Pros: Free, runs locally, no API limits, privacy-friendly
- Cons: Slower, requires local resources, model download needed
- Best for: Development, privacy-sensitive data, cost-conscious projects
Performance Tips
- Use caching (enabled by default) to minimize API calls
- Batch multiple files in the same session to leverage warm cache
-
Use
--statsflag to monitor performance and optimization opportunities - Keep source files consistent to maximize cache hits
- For Ollama: Use a powerful machine for better performance
API Limits and Costs
Gemini API
- Uses Gemini 2.0 Flash Lite model for optimal speed and cost
- Chooses best batch size dynamically depending on input key count
- Batches up to 100 strings per API call
- Check Google's pricing for current rates
Ollama
- No API costs - runs entirely on your hardware
- Performance depends on your machine's capabilities
- Supports various models with different speed/quality tradeoffs
Using with Model Context Protocol (MCP)
translator-ai can be used as an MCP server, allowing AI assistants like Claude Desktop to translate files directly.
MCP Configuration
Add to your Claude Desktop configuration:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"translator-ai": {
"command": "npx",
"args": [
"-y",
"translator-ai-mcp"
],
"env": {
"GEMINI_API_KEY": "your-gemini-api-key-here"
// Or for Ollama:
// "TRANSLATOR_PROVIDER": "ollama"
}
}
}
}MCP Usage Examples
Once configured, you can ask Claude to translate files:
Human: Can you translate my English locale file to Spanish?
Claude: I'll translate your English locale file to Spanish using translator-ai.
<use_tool name="translate_json">
{
"inputFile": "locales/en.json",
"targetLanguage": "es",
"outputFile": "locales/es.json"
}
</use_tool>
Successfully translated! The file has been saved to locales/es.json.
For multiple files with deduplication:
Human: Translate all my English JSON files in the locales folder to German.
Claude: I'll translate all your English JSON files to German with deduplication.
<use_tool name="translate_multiple">
{
"pattern": "locales/en/*.json",
"targetLanguage": "de",
"outputPattern": "locales/de/{name}.json",
"showStats": true
}
</use_tool>
Translation complete! Processed 5 files with 23% deduplication savings.
MCP Tools Available
-
translate_json: Translate a single JSON file
-
inputFile: Path to source file -
targetLanguage: Target language code -
outputFile: Output file path
-
-
translate_multiple: Translate multiple files with deduplication
-
pattern: File pattern or paths -
targetLanguage: Target language code -
outputPattern: Output pattern with {dir}, {name}, {lang} variables -
showStats: Show deduplication statistics (optional)
-
Integration with Static Site Generators
Working with YAML Files (Hugo, Jekyll, etc.)
Since translator-ai works with JSON files, you'll need to convert YAML to JSON and back. Here's a practical workflow:
Setup YAML conversion tools
# Install yaml conversion tools npm install -g js-yaml # or pip install pyyaml
Hugo Example with YAML Conversion
-
Create a translation script (
translate-hugo.sh):
#!/bin/bash # translate-hugo.sh - Translate Hugo YAML i18n files # Function to translate YAML file translate_yaml() { local input_file=$1 local lang=$2 local output_file=$3 echo "Translating $input_file to $lang..." # Convert YAML to JSON npx js-yaml $input_file > temp_input.json # Translate JSON translator-ai temp_input.json -l $lang -o temp_output.json # Convert back to YAML npx js-yaml temp_output.json > $output_file # Cleanup rm temp_input.json temp_output.json } # Translate Hugo i18n files translate_yaml themes/your-theme/i18n/en.yaml es themes/your-theme/i18n/es.yaml translate_yaml themes/your-theme/i18n/en.yaml fr themes/your-theme/i18n/fr.yaml translate_yaml themes/your-theme/i18n/en.yaml de themes/your-theme/i18n/de.yaml
- Python-based converter for more complex scenarios:
#!/usr/bin/env python3 # hugo-translate.py import yaml import json import subprocess import sys import os def yaml_to_json(yaml_file): """Convert YAML to JSON""" with open(yaml_file, 'r', encoding='utf-8') as f: data = yaml.safe_load(f) return json.dumps(data, ensure_ascii=False, indent=2) def json_to_yaml(json_str): """Convert JSON back to YAML""" data = json.loads(json_str) return yaml.dump(data, allow_unicode=True, default_flow_style=False) def translate_yaml_file(input_yaml, target_lang, output_yaml): """Translate a YAML file using translator-ai""" # Create temp JSON file temp_json_in = 'temp_in.json' temp_json_out = f'temp_out_{target_lang}.json' try: # Convert YAML to JSON json_content = yaml_to_json(input_yaml) with open(temp_json_in, 'w', encoding='utf-8') as f: f.write(json_content) # Run translator-ai cmd = [ 'translator-ai', temp_json_in, '-l', target_lang, '-o', temp_json_out ] subprocess.run(cmd, check=True) # Read translated JSON and convert back to YAML with open(temp_json_out, 'r', encoding='utf-8') as f: translated_json = f.read() yaml_content = json_to_yaml(translated_json) # Write YAML output with open(output_yaml, 'w', encoding='utf-8') as f: f.write(yaml_content) print(f"✓ Translated {input_yaml} to {output_yaml}") finally: # Cleanup temp files for f in [temp_json_in, temp_json_out]: if os.path.exists(f): os.remove(f) # Usage if __name__ == "__main__": languages = ['es', 'fr', 'de', 'ja'] for lang in languages: translate_yaml_file( 'i18n/en.yaml', lang, f'i18n/{lang}.yaml' )
Node.js Solution with Proper YAML Handling
Create translate-yaml.js:
#!/usr/bin/env node const fs = require('fs'); const yaml = require('js-yaml'); const { execSync } = require('child_process'); const path = require('path'); function translateYamlFile(inputPath, targetLang, outputPath) { console.log(`Translating ${inputPath} to ${targetLang}...`); // Read and parse YAML const yamlContent = fs.readFileSync(inputPath, 'utf8'); const data = yaml.load(yamlContent); // Write temporary JSON const tempJsonIn = `temp_${path.basename(inputPath)}.json`; const tempJsonOut = `temp_${path.basename(inputPath)}_${targetLang}.json`; fs.writeFileSync(tempJsonIn, JSON.stringify(data, null, 2)); try { // Translate using translator-ai execSync(`translator-ai ${tempJsonIn} -l ${targetLang} -o ${tempJsonOut}`); // Read translated JSON const translatedData = JSON.parse(fs.readFileSync(tempJsonOut, 'utf8')); // Convert back to YAML const translatedYaml = yaml.dump(translatedData, { indent: 2, lineWidth: -1, noRefs: true }); // Write output YAML fs.writeFileSync(outputPath, translatedYaml); console.log(`✓ Created ${outputPath}`); } finally { // Cleanup [tempJsonIn, tempJsonOut].forEach(f => { if (fs.existsSync(f)) fs.unlinkSync(f); }); } } // Example usage const languages = ['es', 'fr', 'de']; languages.forEach(lang => { translateYamlFile( 'i18n/en.yaml', lang, `i18n/${lang}.yaml` ); });
Real-world Hugo Workflow
Hugo supports two translation methods: by filename (about.en.md, about.fr.md) or by content directory (content/en/, content/fr/). Here's how to automate both:
Method 1: Translation by Filename
Create hugo-translate-files.sh:
#!/bin/bash # Translate Hugo content files using filename convention SOURCE_LANG="en" TARGET_LANGS=("es" "fr" "de" "ja") # Find all English content files find content -name "*.${SOURCE_LANG}.md" | while read -r file; do # Extract base filename without language suffix base_name="${file%.${SOURCE_LANG}.md}" for lang in "${TARGET_LANGS[@]}"; do output_file="${base_name}.${lang}.md" # Skip if translation already exists if [ -f "$output_file" ]; then echo "Skipping $output_file (already exists)" continue fi # Extract front matter awk '/^---$/{p=1; next} p&&/^---$/{exit} p' "$file" > temp_frontmatter.yaml # Convert front matter to JSON npx js-yaml temp_frontmatter.yaml > temp_frontmatter.json # Translate front matter translator-ai temp_frontmatter.json -l "$lang" -o "temp_translated.json" # Convert back to YAML echo "---" > "$output_file" npx js-yaml temp_translated.json >> "$output_file" echo "---" >> "$output_file" # Copy content (you might want to translate this too) awk '/^---$/{p++} p==2{print}' "$file" | tail -n +2 >> "$output_file" echo "Created $output_file" done # Cleanup rm -f temp_frontmatter.yaml temp_frontmatter.json temp_translated.json done
Method 2: Translation by Content Directory
-
Setup Hugo config (
config.yaml):
defaultContentLanguage: en defaultContentLanguageInSubdir: false languages: en: contentDir: content/en languageName: English weight: 1 es: contentDir: content/es languageName: Español weight: 2 fr: contentDir: content/fr languageName: Français weight: 3 # Rest of your config...
-
Create translation script (
hugo-translate-dirs.js):
#!/usr/bin/env node const fs = require('fs-extra'); const path = require('path'); const yaml = require('js-yaml'); const { execSync } = require('child_process'); const glob = require('glob'); const SOURCE_LANG = 'en'; const TARGET_LANGS = ['es', 'fr', 'de']; async function translateHugoContent() { // Ensure target directories exist for (const lang of TARGET_LANGS) { await fs.ensureDir(`content/${lang}`); } // Find all content files in source language const files = glob.sync(`content/${SOURCE_LANG}/**/*.md`); for (const file of files) { const relativePath = path.relative(`content/${SOURCE_LANG}`, file); for (const lang of TARGET_LANGS) { const targetFile = path.join(`content/${lang}`, relativePath); // Skip if already translated if (await fs.pathExists(targetFile)) { console.log(`Skipping ${targetFile} (exists)`); continue; } await translateFile(file, targetFile, lang); } } } async function translateFile(sourceFile, targetFile, targetLang) { console.log(`Translating ${sourceFile} to ${targetLang}...`); const content = await fs.readFile(sourceFile, 'utf8'); const frontMatterMatch = content.match(/^---\n([\s\S]*?)\n---/); if (!frontMatterMatch) { // No front matter, just copy await fs.ensureDir(path.dirname(targetFile)); await fs.copyFile(sourceFile, targetFile); return; } // Parse front matter const frontMatter = yaml.load(frontMatterMatch[1]); const body = content.substring(frontMatterMatch[0].length); // Extract translatable fields const translatable = { title: frontMatter.title || '', description: frontMatter.description || '', summary: frontMatter.summary || '', keywords: frontMatter.keywords || [] }; // Save for translation await fs.writeJson('temp_meta.json', translatable); // Translate execSync(`translator-ai temp_meta.json -l ${targetLang} -o temp_translated.json`); // Read translations const translated = await fs.readJson('temp_translated.json'); // Update front matter Object.assign(frontMatter, translated); // Write translated file await fs.ensureDir(path.dirname(targetFile)); const newContent = `---\n${yaml.dump(frontMatter)}---${body}`; await fs.writeFile(targetFile, newContent); // Cleanup await fs.remove('temp_meta.json'); await fs.remove('temp_translated.json'); console.log(`✓ Created ${targetFile}`); } // Run translation translateHugoContent().catch(console.error);
Hugo i18n Files Translation
- Install dependencies:
npm install -g translator-ai js-yaml
- Create a Makefile for easy translation:
# Makefile for Hugo translations LANGUAGES := es fr de ja zh SOURCE_YAML := i18n/en.yaml THEME_DIR := themes/your-theme .PHONY: translate translate: $(foreach lang,$(LANGUAGES),translate-$(lang)) translate-%: @echo "Translating to $*..." @npx js-yaml $(SOURCE_YAML) > temp.json @translator-ai temp.json -l $* -o temp_$*.json @npx js-yaml temp_$*.json > i18n/$*.yaml @rm temp.json temp_$*.json @echo "✓ Created i18n/$*.yaml" .PHONY: translate-theme translate-theme: @for lang in $(LANGUAGES); do \ make translate-theme-$$lang; \ done translate-theme-%: @echo "Translating theme to $*..." @npx js-yaml $(THEME_DIR)/i18n/en.yaml > temp_theme.json @translator-ai temp_theme.json -l $* -o temp_theme_$*.json @npx js-yaml temp_theme_$*.json > $(THEME_DIR)/i18n/$*.yaml @rm temp_theme.json temp_theme_$*.json .PHONY: clean clean: @rm -f temp*.json # Translate everything .PHONY: all all: translate translate-theme
Usage:
# Translate to all languages make all # Translate to specific language make translate-es # Translate theme files make translate-theme
Complete Hugo Translation Workflow
Here's a comprehensive script that handles both content and i18n translations:
#!/usr/bin/env node // hugo-complete-translator.js const fs = require('fs-extra'); const path = require('path'); const yaml = require('js-yaml'); const { execSync } = require('child_process'); const glob = require('glob'); class HugoTranslator { constructor(targetLanguages = ['es', 'fr', 'de']) { this.targetLanguages = targetLanguages; this.tempFiles = []; } async translateSite() { console.log('Starting Hugo site translation...\n'); // 1. Translate i18n files await this.translateI18nFiles(); // 2. Translate content await this.translateContent(); // 3. Update config await this.updateConfig(); console.log('\nTranslation complete!'); } async translateI18nFiles() { console.log('Translating i18n files...'); const i18nFiles = glob.sync('i18n/en.{yaml,yml,toml}'); for (const file of i18nFiles) { const ext = path.extname(file); for (const lang of this.targetLanguages) { const outputFile = `i18n/${lang}${ext}`; if (await fs.pathExists(outputFile)) { console.log(` Skipping ${outputFile} (exists)`); continue; } // Convert to JSON const tempJson = `temp_i18n_${lang}.json`; await this.convertToJson(file, tempJson); // Translate const translatedJson = `temp_i18n_${lang}_translated.json`; execSync(`translator-ai ${tempJson} -l ${lang} -o ${translatedJson}`); // Convert back await this.convertFromJson(translatedJson, outputFile, ext); // Cleanup await fs.remove(tempJson); await fs.remove(translatedJson); console.log(` ✓ Created ${outputFile}`); } } } async translateContent() { console.log('\nTranslating content...'); // Detect translation method const useContentDirs = await fs.pathExists('content/en'); if (useContentDirs) { await this.translateContentByDirectory(); } else { await this.translateContentByFilename(); } } async translateContentByDirectory() { const files = glob.sync('content/en/**/*.md'); for (const file of files) { const relativePath = path.relative('content/en', file); for (const lang of this.targetLanguages) { const targetFile = path.join('content', lang, relativePath); if (await fs.pathExists(targetFile)) continue; await this.translateMarkdownFile(file, targetFile, lang); } } } async translateContentByFilename() { const files = glob.sync('content/**/*.en.md'); for (const file of files) { const baseName = file.replace('.en.md', ''); for (const lang of this.targetLanguages) { const targetFile = `${baseName}.${lang}.md`; if (await fs.pathExists(targetFile)) continue; await this.translateMarkdownFile(file, targetFile, lang); } } } async translateMarkdownFile(sourceFile, targetFile, targetLang) { const content = await fs.readFile(sourceFile, 'utf8'); const frontMatterMatch = content.match(/^---\n([\s\S]*?)\n---/); if (!frontMatterMatch) { await fs.copy(sourceFile, targetFile); return; } const frontMatter = yaml.load(frontMatterMatch[1]); const body = content.substring(frontMatterMatch[0].length); // Translate front matter const translatable = this.extractTranslatableFields(frontMatter); const tempJson = `temp_content_${path.basename(sourceFile)}.json`; const translatedJson = `${tempJson}.translated`; await fs.writeJson(tempJson, translatable); execSync(`translator-ai ${tempJson} -l ${targetLang} -o ${translatedJson}`); const translated = await fs.readJson(translatedJson); Object.assign(frontMatter, translated); // Write translated file await fs.ensureDir(path.dirname(targetFile)); const newContent = `---\n${yaml.dump(frontMatter)}---${body}`; await fs.writeFile(targetFile, newContent); // Cleanup await fs.remove(tempJson); await fs.remove(translatedJson); console.log(` ✓ ${targetFile}`); } extractTranslatableFields(frontMatter) { const fields = ['title', 'description', 'summary', 'keywords', 'tags']; const translatable = {}; fields.forEach(field => { if (frontMatter[field]) { translatable[field] = frontMatter[field]; } }); return translatable; } async convertToJson(inputFile, outputFile) { const ext = path.extname(inputFile); const content = await fs.readFile(inputFile, 'utf8'); let data; if (ext === '.yaml' || ext === '.yml') { data = yaml.load(content); } else if (ext === '.toml') { // You'd need a TOML parser here throw new Error('TOML support not implemented in this example'); } await fs.writeJson(outputFile, data, { spaces: 2 }); } async convertFromJson(inputFile, outputFile, format) { const data = await fs.readJson(inputFile); let content; if (format === '.yaml' || format === '.yml') { content = yaml.dump(data, { indent: 2, lineWidth: -1, noRefs: true }); } else if (format === '.toml') { throw new Error('TOML support not implemented in this example'); } await fs.writeFile(outputFile, content); } async updateConfig() { console.log('\nUpdating Hugo config...'); const configFile = glob.sync('config.{yaml,yml,toml,json}')[0]; if (!configFile) return; // This is a simplified example - you'd need to properly parse and update console.log(' ! Remember to update your config.yaml with language settings'); } } // Run the translator if (require.main === module) { const translator = new HugoTranslator(['es', 'fr', 'de']); translator.translateSite().catch(console.error); } module.exports = HugoTranslator;
Using with Hugo Modules
If you're using Hugo Modules, you can create a translation module:
// go.mod module github.com/yourusername/hugo-translator go 1.19 require ( github.com/yourusername/your-theme v1.0.0 )
Then in your package.json:
{
"scripts": {
"translate": "node hugo-complete-translator.js",
"translate:content": "node hugo-complete-translator.js --content-only",
"translate:i18n": "node hugo-complete-translator.js --i18n-only",
"build": "npm run translate && hugo"
}
}Jekyll with YAML Front Matter
For Jekyll posts with YAML front matter:
#!/usr/bin/env python3 # translate-jekyll-posts.py import os import yaml import json import subprocess import frontmatter def translate_jekyll_post(post_path, target_lang, output_dir): """Translate Jekyll post including front matter""" # Load post with front matter post = frontmatter.load(post_path) # Extract translatable front matter fields translatable = { 'title': post.metadata.get('title', ''), 'description': post.metadata.get('description', ''), 'excerpt': post.metadata.get('excerpt', '') } # Save as JSON for translation with open('temp_meta.json', 'w', encoding='utf-8') as f: json.dump(translatable, f, ensure_ascii=False, indent=2) # Translate subprocess.run([ 'translator-ai', 'temp_meta.json', '-l', target_lang, '-o', f'temp_meta_{target_lang}.json' ]) # Load translations with open(f'temp_meta_{target_lang}.json', 'r', encoding='utf-8') as f: translations = json.load(f) # Update post metadata for key, value in translations.items(): if value: # Only update if translation exists post.metadata[key] = value # Add language to metadata post.metadata['lang'] = target_lang # Save translated post output_path = os.path.join(output_dir, os.path.basename(post_path)) with open(output_path, 'w', encoding='utf-8') as f: f.write(frontmatter.dumps(post)) # Cleanup os.remove('temp_meta.json') os.remove(f'temp_meta_{target_lang}.json') # Translate all posts for lang in ['es', 'fr', 'de']: os.makedirs(f'_posts/{lang}', exist_ok=True) for post in os.listdir('_posts/en'): if post.endswith('.md'): translate_jekyll_post( f'_posts/en/{post}', lang, f'_posts/{lang}' )
Tips for YAML/JSON Conversion
-
Preserve formatting: Use
js-yamlwith proper options to maintain YAML structure - Handle special characters: Ensure proper encoding (UTF-8) throughout
- Validate output: Some YAML features (anchors, aliases) may need special handling
- Consider TOML: For Hugo, you might also need to handle TOML config files
Alternative: Direct YAML Support (Feature Request)
If you frequently work with YAML files, consider creating a wrapper script that handles conversion automatically, or request YAML support as a feature for translator-ai.
Development
Building from source
git clone https://github.com/DatanoiseTV/translator-ai.git
cd translator-ai
npm install
npm run buildTesting locally
npm start -- test.json -l es -o output.json
License
This project requires attribution for both commercial and non-commercial use. See LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Support
For issues, questions, or suggestions, please open an issue on GitHub.
If you find this tool useful, consider supporting the development: