Removing Gemini AI Watermarks from Images & Veo Videos: A Deep Dive into Reverse Alpha Blending


Allen Kuo (kwyshell)

An open-source toolkit to cleanly remove watermarks from your AI-generated images — and now Veo videos too


The Problem: Beautiful Images, Annoying Watermarks

If you’ve been using Google’s Gemini AI for image generation — whether it’s Gemini Nano, Gemini Flash, or Gemini Pro — you’ve probably noticed something: every generated image comes with a semi-transparent watermark in the bottom-right corner.

Don’t get me wrong. I understand why Google adds these watermarks. Transparency about AI-generated content is important. But there are legitimate scenarios where these watermarks become a real headache:

  • Presentations — You’re preparing slides for a business meeting, and that watermark screams “I used AI for this”
  • Design mockups — The watermark clashes with your carefully crafted layout
  • Personal creative projects — You just want a clean image for your mood board or concept art
  • Social media content — The watermark distracts from the visual story you’re trying to tell

I found myself manually editing these watermarks out in Photoshop, one image at a time. After the tenth image, I thought: there has to be a better way. And with Google’s Veo now adding similar visible watermarks to AI-generated videos — where manual frame-by-frame editing is practically impossible — the problem is only getting worse.

So I built a solution — first for images, and now for video.

Introducing Gemini Watermark Tool

Gemini Watermark Tool is a lightweight, standalone utility that removes Gemini watermarks from images accurately and efficiently. It ships as a single executable that works both as a graphical desktop application and as a command-line tool.

🖥️ GUI Mode — New!


New GUI Mode

The tool now includes a full graphical interface. No command line needed — just open, drag & drop, and process.

  • Single image editing — drag & drop an image, one-key process (X), instant before/after comparison (V)
  • Custom watermark mode — draw and resize the watermark region interactively for non-standard positions
  • Batch processing — drop multiple files or an entire folder, thumbnail preview with progress tracking
  • Smart detection — three-stage watermark detection with confidence scoring, automatically skips images without watermarks (threshold adjustable via slider)


Batch Mode

🚀 New Features:

AI Agent Integration (New in v0.2.5)

With the rise of AI-assisted workflows and autonomous agents, I wanted Gemini Watermark Tool to be more than just a standalone utility.

allenk/gwt-integrations: Claude Code Skill and MCP Server for GeminiWatermarkTool

In v0.2.5, the project introduces AI agent integration through:

  • Claude Code Skill
  • MCP (Model Context Protocol) server

This allows the tool to be used as a native processing component inside agent-driven workflows.

Instead of manually running commands or editing images one by one, an AI agent can now:

  • Detect images that contain Gemini watermarks
  • Call the tool automatically
  • Process the image using deterministic reverse alpha reconstruction
  • Optionally apply AI denoise cleanup

The result is a fully automated pipeline where AI handles orchestration, while the native engine performs precise image processing.

This hybrid model works particularly well because the watermark reconstruction itself is deterministic and mathematically accurate, while AI is used only where it adds value — residual artifact cleanup.

Below is a quick demo of the MCP integration running inside a Claude Code environment.


Example: Claude Code using MCP to automate watermark removal

AI agents orchestrate the workflow.
Gemini Watermark Tool performs the precise image reconstruction.

AI In-painting — Neural Network Cleanup (v0.2.5)

Reverse alpha blending is mathematically exact on unmodified images. But in practice, many images get resized, recompressed, or screenshot-captured before you try to remove the watermark. That post-processing destroys the pixel-perfect relationship between the alpha map and the image, leaving faint residual artifacts — especially along the sparkle edges and the four corner tips of the watermark shape.

Software Inpainting (NS, TELEA, Soft Inpaint) helped, but these methods operate on hand-crafted rules: fluid dynamics equations, distance-weighted interpolation, Gaussian blending. None of them actually understand what a clean image should look like.

AI Denoise changes this. It uses FDnCNN — a 20-layer convolutional neural network trained on hundreds of thousands of image pairs — to learn the difference between “normal image content” and “something that shouldn’t be there.” The model runs on your GPU via NCNN + Vulkan, processing each watermark region in under 10ms.

Only the edge pixels where the math broke down get replaced by the AI result. Everything else stays untouched. Two parameters control the behavior: Sigma (1–150) tells the network how much “noise” to expect, and Strength (0–300%) scales how much of the edge region the cleanup covers. The default (sigma=50, strength=120%) handles most resized watermarks cleanly.

The model is only 1.3 MB (FP16 quantized) and is embedded directly in the executable, so there are no external files to distribute. AI Denoise is now the recommended default when available. If your GPU doesn’t support Vulkan, the tool automatically falls back to CPU inference (~20 ms) or the NS method.


AI In-painting

🔧 CLI Mode


Gemini WaterMark Tool


Perfectly restored images. Even the most challenging text is no problem

Key Features

  • One-click removal — Drag and drop an image onto the executable, done
  • Batch processing — Process hundreds of images in seconds
  • Mathematically accurate — Uses reverse alpha blending, not crude inpainting
  • Zero dependencies — Single .exe file, no installation required
  • Open source — MIT licensed, free to use and modify

How to Use It

The Simplest Way: Drag & Drop

  1. Download GeminiWatermarkTool.exe from the GitHub releases page
  2. Drag your watermarked image onto the executable
  3. That’s it — the watermark is removed in-place

Command Line Usage

For more control, use the command line:

# Remove watermark, save to new file
GeminiWatermarkTool.exe -i watermarked.jpg -o clean.jpg
# Process an entire folder
GeminiWatermarkTool.exe -i ./input_folder/ -o ./output_folder/
# Force specific watermark size (if auto-detection fails)
GeminiWatermarkTool.exe -i image.jpg -o output.jpg --force-small

Watermark Size Detection

Gemini uses different watermark sizes based on image dimensions:

| Image Dimensions | Watermark Size | Margin |
|---|---|---|
| W ≤ 1024 or H ≤ 1024 | 48×48 pixels | 32px |
| W > 1024 and H > 1024 | 96×96 pixels | 64px |

The tool automatically detects which size to use, but you can override it with --force-small or --force-large if needed.
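
The size rule can be sketched in a few lines of Python. This is an illustration rather than the tool's actual code: the function name `watermark_box` is hypothetical, and it assumes the margin is the gap between the watermark and the right and bottom edges of the image.

```python
def watermark_box(width, height):
    """Return (x, y, size) of the expected watermark square, top-left origin."""
    # Rule from the size table: the large logo is used only when BOTH
    # dimensions exceed 1024 pixels.
    if width > 1024 and height > 1024:
        size, margin = 96, 64
    else:
        size, margin = 48, 32
    # Assumption: the margin separates the logo from the right/bottom edges.
    x = width - margin - size
    y = height - margin - size
    return x, y, size


print(watermark_box(1024, 1024))  # (944, 944, 48)
print(watermark_box(2048, 2048))  # (1888, 1888, 96)
```

Note that an image that is wide but short (for example 2048×1000) still gets the small watermark, because only one side exceeds 1024.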

The Technical Magic: Reverse Alpha Blending

Now for the interesting part — how does this actually work?

Understanding How Gemini Adds Watermarks

Most people assume watermarks are simply “stamped” onto images. But Gemini uses something more sophisticated: alpha blending.

The formula is:

watermarked = α × logo + (1 - α) × original

Where:

  • watermarked is the final pixel value you see
  • logo is the pixel value of the watermark logo at that position
  • original is the original pixel value (what we want to recover)
  • α (alpha) is the transparency factor (0 = fully transparent, 1 = fully opaque)

This creates that semi-transparent overlay effect you see in Gemini images.
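
As a quick illustration, here is the forward formula applied to a single 8-bit channel value. It is a sketch, not the code Gemini runs: the helper name `blend` is hypothetical, and the default `logo=255` assumes a white logo.

```python
def blend(original, alpha, logo=255):
    """Composite one 8-bit channel value: alpha * logo + (1 - alpha) * original."""
    value = alpha * logo + (1.0 - alpha) * original
    # Stored pixels are 8-bit integers, so round and clamp.
    return max(0, min(255, round(value)))


# A mid-gray pixel under a 20%-opaque white logo gets brighter.
print(blend(100, 0.2))  # 131
print(blend(100, 0.0))  # 100 (fully transparent: unchanged)
print(blend(100, 1.0))  # 255 (fully opaque: only the logo remains)
```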

Rebuild the Alpha Map

Here’s the clever part: α is not stored anywhere in the image, but it can be recovered. By statistically analyzing and comparing alpha-related values, we can reconstruct a per-pixel alpha map that is either correct or very close to it.

Reversing the Process

Now that we know α for every pixel, we can algebraically reverse the blending formula:

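original = (watermarked - α × logo) / (1 - α)

The logo is white (255 in every channel), so this becomes:

original = (watermarked - α × 255) / (1 - α)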

This is the key equation. For each pixel in the watermarked region:

  1. Get the alpha value from our pre-computed alpha map
  2. Apply the reverse formula
  3. Clamp the result to valid range [0, 255]

The result? Mathematically accurate reconstruction of the original pixel values.
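
The three steps can be sketched for a small grid of channel values. This is a minimal illustration under stated assumptions, not the tool's implementation: `remove_watermark` is a hypothetical name, the alpha values are made up, the logo is taken as white (255), and every alpha is assumed to be below 1 so the division is defined.

```python
def remove_watermark(region, alpha_map, logo=255):
    """Reverse alpha blending on a 2D grid of 8-bit channel values.

    region and alpha_map are lists of rows with the same shape.
    Assumption: every alpha is < 1, otherwise the division is undefined.
    """
    restored = []
    for pixel_row, alpha_row in zip(region, alpha_map):
        out_row = []
        for watermarked, alpha in zip(pixel_row, alpha_row):
            # Step 1: alpha comes from the precomputed map.
            # Step 2: apply the reverse formula.
            value = (watermarked - alpha * logo) / (1.0 - alpha)
            # Step 3: clamp to the valid 8-bit range.
            out_row.append(max(0, min(255, round(value))))
        restored.append(out_row)
    return restored


# Hypothetical 2x2 patch: the original value was 100 everywhere and was
# blended with a white logo using the alphas below.
alpha_map = [[0.0, 0.2], [0.4, 0.2]]
watermarked = [[100, 131], [162, 131]]
print(remove_watermark(watermarked, alpha_map))  # [[100, 100], [100, 100]]
```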

Why This Works Better Than Alternatives

Other approaches have clear drawbacks:

  • Inpainting — Guesses what pixels should be, often creates artifacts
  • Clone stamping — Manual and inconsistent
  • Content-aware fill — Can produce blurry or distorted results

The tool is also fast, simple, and tiny (about 4MB of disk space).

Our approach doesn’t guess. It calculates the exact original values using the known watermark pattern. The only error comes from 8-bit quantization (±1 in pixel value), which is imperceptible to the human eye.
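
The quantization claim is easy to check with a round trip over all 256 channel values. The sketch below assumes a white logo (255) and uses hypothetical helper names. Because the reverse formula divides by (1 - α), the rounding error from the stored pixel is scaled by 1 / (1 - α), so the bound is tightest where the watermark is most transparent.

```python
def blend(original, alpha, logo=255):
    # Forward compositing, rounded to an 8-bit value as stored in the file.
    return round(alpha * logo + (1.0 - alpha) * original)


def unblend(watermarked, alpha, logo=255):
    # Reverse alpha blending, rounded and clamped to [0, 255].
    value = (watermarked - alpha * logo) / (1.0 - alpha)
    return max(0, min(255, round(value)))


def max_round_trip_error(alpha):
    """Largest |recovered - original| over all 256 channel values."""
    return max(abs(unblend(blend(v, alpha), alpha) - v) for v in range(256))


for alpha in (0.1, 0.3, 0.5):
    print(alpha, max_round_trip_error(alpha))
```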

Edge Handling

You might wonder about the watermark edges, where pixels are semi-transparent. This is actually handled beautifully by the alpha map itself. Edge pixels have low alpha values (say, 0.1 or 0.2), and the reverse formula naturally accounts for this, producing smooth transitions with no visible seams.

Open Source

I’ve released this tool under the MIT License. The complete source code is available on GitHub:

🔗 https://github.com/allenk/GeminiWatermarkTool

Current Version: v0.2.6

From Images to Video: Removing Veo Watermarks

The same mathematical framework that makes GeminiWatermarkTool effective on images turns out to be directly applicable to video — specifically Google Veo’s visible text watermark that appears on every generated clip.

The challenge with video is different from images in several important ways. You’re no longer dealing with one static overlay on one frame — you’re dealing with a watermark composited onto every frame of a compressed video stream. JPEG artifacts are relatively simple; H.264/H.265 temporal compression introduces a whole new class of reconstruction challenges. The watermark interacts with motion estimation, quantization, and inter-frame prediction in ways that don’t occur in still images.

To address this, I built VeoWatermarkRemover — a standalone tool that processes MP4 files directly, applying per-frame reverse alpha blending with remastered alpha maps derived from multi-source frame differencing across dozens of Veo-generated videos.

Like GeminiWatermarkTool, VeoWatermarkRemover uses the same compact FDnCNN engine (~1.2 MB) for residual artifact cleanup, with Vulkan GPU acceleration and automatic CPU fallback. The entire processing pipeline runs locally: no cloud upload, no API calls, no privacy concerns.

Performance on tested hardware:
  • 720p: ~50 fps (faster than real-time)
  • 1080p: ~18 fps
  • Audio tracks are preserved untouched

The tool ships as a single executable with zero dependencies — the same philosophy as GWT. Drag a video file onto it, get a clean MP4 back.

Currently available as pre-built binaries for Windows (x64), macOS (Intel + Apple Silicon), and Linux (x64). Source code release is in progress — I’m working through video codec licensing details to ensure full compliance before opening the repository.
All builds are generated transparently via GitHub Actions, and the build logs are public.

Download and details: https://github.com/allenk/VeoWatermarkRemover

Update: AI-Powered Residual Cleanup

The Remaining Challenge

In my earlier discussion of Software Inpainting, I mentioned that reverse alpha blending is mathematically exact — but only on unmodified images. When images have been resized, recompressed, or screenshot-captured after watermarking, the pixel-perfect math breaks down. Integer rounding and interpolation leave faint residual artifacts, particularly along the watermark’s sparkle edges and four corner tips.

The three software inpainting methods (NS, TELEA, Soft Inpaint) improved this significantly, but they all share a fundamental limitation: they operate on hand-crafted mathematical rules — fluid dynamics equations, distance-weighted interpolation, or Gaussian blending. None of them understand what a clean image should look like.

Enter FDnCNN: A Neural Network That Learned “Normal”

v0.2.5 introduces AI Denoise — a GPU-accelerated neural network that was trained on hundreds of thousands of image pairs to learn what “clean” vs “noisy” image content looks like.

The model is FDnCNN (Flexible Denoising Convolutional Neural Network) from the KAIR research toolbox. It’s a remarkably simple architecture: 20 layers of 3×3 convolutions with ReLU activations. No fancy attention mechanisms, no U-Net skip connections — just a deep stack of learned filters.

How It Works

The key insight is residual learning with a tunable sigma parameter.

The network takes a 4-channel input: the 3 RGB channels of the image region, plus a sigma map — a uniform channel filled with σ/255.0 that tells the network "how much noise to expect." This allows a single trained model to handle noise levels from barely visible (σ=5) to aggressive (σ=75+), controlled by a slider in the UI.

Input:  [R, G, B, σ_map]    →    FDnCNN (20 Conv+ReLU layers)    →    Output: clean image
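
To make the input layout concrete, here is a small sketch that builds the four planes from a region of RGB pixels. The function name `build_network_input` is hypothetical, and scaling the color planes to [0, 1] is an assumption; the article only specifies that the sigma plane is filled with σ/255.0.

```python
def build_network_input(rgb_region, sigma):
    """Return four planes [R, G, B, sigma_map] for a region of (r, g, b) pixels.

    Assumption: color planes are scaled to [0, 1]. Only the sigma plane's
    value (sigma / 255.0) is taken from the article.
    """
    height, width = len(rgb_region), len(rgb_region[0])
    planes = [
        [[rgb_region[y][x][c] / 255.0 for x in range(width)] for y in range(height)]
        for c in range(3)
    ]
    # The sigma map is uniform: every pixel carries the same noise level.
    sigma_plane = [[sigma / 255.0] * width for _ in range(height)]
    return planes + [sigma_plane]


region = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
tensor = build_network_input(region, sigma=50)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # 4 2 2
```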

Running AI denoise on the entire watermark region would destroy background detail — the network doesn’t know which pixels are damaged and which are fine. This was a lesson learned the hard way during development (the first version simply blurred everything uniformly).

The solution borrows the same alpha map gradient mask technique from the software inpainting pipeline:

1. Compute Sobel gradient of the watermark alpha map
2. Normalize → sqrt (gamma) → dilate → blur → gradient weight mask
3. Run FDnCNN inference on the padded region
4. Per-pixel blend:
result = mask × AI_denoised + (1 − mask) × original

mask ≈ 0 (clean background) → pixel untouched
mask ≈ 1 (sparkle edge) → pixel replaced by AI result
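
The mask-and-blend steps can be sketched in pure Python as follows. This is a rough illustration, not the tool's code: the 3×3 kernel sizes for dilation and blur are assumptions, all function names are hypothetical, and step 3 (FDnCNN inference) is replaced by a stand-in `denoised` grid.

```python
import math


def neighbors(grid, y, x):
    """Yield the 3x3 neighborhood of (y, x), clamping at the borders."""
    h, w = len(grid), len(grid[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yield grid[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]


def map3x3(grid, fn):
    """Apply fn to every pixel's 3x3 neighborhood (a list of 9 values)."""
    return [[fn(list(neighbors(grid, y, x))) for x in range(len(grid[0]))]
            for y in range(len(grid))]


def sobel_magnitude(n):
    # n is row-major: n[0..2] is the top row, n[6..8] the bottom row.
    gx = (n[2] + 2 * n[5] + n[8]) - (n[0] + 2 * n[3] + n[6])
    gy = (n[6] + 2 * n[7] + n[8]) - (n[0] + 2 * n[1] + n[2])
    return math.hypot(gx, gy)


def gradient_mask(alpha_map):
    grad = map3x3(alpha_map, sobel_magnitude)        # 1. Sobel gradient
    peak = max(max(row) for row in grad) or 1.0      # avoid dividing by zero
    mask = [[math.sqrt(g / peak) for g in row] for row in grad]  # 2. normalize, sqrt
    mask = map3x3(mask, max)                         # dilate (3x3, assumed size)
    return map3x3(mask, lambda n: sum(n) / 9.0)      # blur (3x3 box, assumed)


def blend_with_mask(original, denoised, mask):
    # 4. result = mask * AI_denoised + (1 - mask) * original
    return [[m * d + (1.0 - m) * o for o, d, m in zip(orow, drow, mrow)]
            for orow, drow, mrow in zip(original, denoised, mask)]


# Hypothetical 5x8 alpha map with one vertical edge between columns 3 and 4.
alpha_map = [[0.0] * 4 + [0.5] * 4 for _ in range(5)]
original = [[10.0] * 8 for _ in range(5)]
denoised = [[20.0] * 8 for _ in range(5)]  # stand-in for step 3 (FDnCNN output)
mask = gradient_mask(alpha_map)
result = blend_with_mask(original, denoised, mask)
print([round(v, 2) for v in result[2]])
```

In this toy case the pixels next to the alpha edge take the denoised value, the outermost columns keep the original value, and the columns in between are a weighted mix.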

This gives us the best of both worlds: the neural network’s ability to reconstruct plausible image content at the edges, combined with pixel-perfect preservation of everything the math already got right.

Performance

The model is only 1.3 MB (FP16 quantized) and embedded directly in the executable via C header arrays — no external model files to distribute.

Inference runs on NCNN with Vulkan GPU acceleration. On a modern GPU, processing a single watermark region takes < 10 ms. CPU fallback (OpenMP multi-threaded) takes about 20 ms. Either way, it’s imperceptible in the UI workflow.

| Component | Detail |
|---|---|
| Model | FDnCNN Color, 20 layers, 64 channels |
| Size | 1.3 MB (FP16), ~668K parameters |
| Runtime | NCNN (Vulkan GPU + CPU fallback) |
| Inference | ~7 ms GPU / ~20 ms CPU per region |
| Parameters | Sigma (1–150), Strength (0–300%) |
| Default | Sigma=50, Strength=120% |

Results

The default parameters (strength=120%, sigma=50) produce clean results on most resized watermarks. Strength above 100% expands the gradient mask coverage to catch the faint corner artifacts that 100% misses. For particularly stubborn residuals, sigma can be pushed to 75+ at the cost of slightly more aggressive smoothing.

AI Denoise is now the recommended default cleanup method. When the tool starts, it automatically detects available Vulkan GPUs and initializes the neural network. If initialization fails (no GPU, no Vulkan drivers), it falls back to the NS method transparently.

Advanced CLI for Automation

v0.2.5 also adds new CLI options for handling non-standard watermark positions — a common scenario when images are resized or cropped after watermarking.

The Problem

The standard 3-stage detection assumes the watermark is at Gemini’s default position (bottom-right, with known margins). But social media platforms, screenshot tools, and image editors often resize or crop images, shifting the watermark to unpredictable positions.

Region + Snap + Denoise Pipeline

# Search bottom-right area with multi-scale snap, apply AI cleanup
GeminiWatermarkTool -i photo.jpg -o clean.jpg \
--fallback-region br:10,10,500,500 \
--snap --snap-max-size 320 \
--denoise ai

--fallback-region defines a search area using corner-relative coordinates (br: = bottom-right, margins from edge). When standard detection fails, --snap runs multi-scale NCC template matching within this region to find the watermark at any size from 32–320px.

A separate --snap-threshold (default 60%) prevents false positives — snap's single-NCC matching is more prone to false matches than the full 3-stage detection pipeline, so it needs a higher confidence bar.
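
For readers unfamiliar with NCC template matching, here is a single-scale sketch in pure Python. It is not the tool's implementation: the function names are hypothetical, mapping the 60% threshold to an NCC score of 0.60 is an assumption, and a multi-scale search would repeat the same scan with the template resized to each candidate size.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2D grids, in [-1, 1]."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # a flat patch carries no shape information
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(image, template, threshold=0.60):
    """Slide template over image; return (score, x, y), or None below threshold."""
    th, tw = len(template), len(template[0])
    best = None
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            score = ncc(patch, template)
            if best is None or score > best[0]:
                best = (score, x, y)
    return best if best and best[0] >= threshold else None


# Hypothetical data: a brighter copy of the template pattern sits at x=2, y=1.
template = [[0, 9], [9, 0]]
image = [
    [0, 0, 0, 0],
    [0, 0, 10, 19],
    [0, 0, 19, 10],
    [0, 0, 0, 0],
]
match = best_match(image, template)
print(match)
```

NCC subtracts the mean and divides by the norm of each patch, so the match survives the brightness offset between the template and the image, which is what makes it useful on a semi-transparent overlay.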

# Batch: standard detection first, fallback + snap for missed images
GeminiWatermarkTool -i ./photos/ -o ./clean/ \
--fallback-region br:10,10,500,500 --snap --denoise ai

# Force a known region for all images (same watermark position)
GeminiWatermarkTool -i ./slides/ -o ./clean/ \
--force --region 500,800,160,160 --denoise ai --sigma 75

All new options are fully backward-compatible — existing scripts and agent integrations work unchanged without any new flags.
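
For scripted pipelines, the commands above can be assembled from Python. This sketch uses only flags that appear in this article. The helper names are hypothetical, and it assumes the executable is on your PATH as `GeminiWatermarkTool` and signals failure through a non-zero exit status.

```python
import subprocess


def build_command(src, dst, fallback_region=None, snap=False, denoise=None,
                  exe="GeminiWatermarkTool"):
    """Assemble an argv list using only the flags shown in this article."""
    cmd = [exe, "-i", src, "-o", dst]
    if fallback_region:
        cmd += ["--fallback-region", fallback_region]
    if snap:
        cmd.append("--snap")
    if denoise:
        cmd += ["--denoise", denoise]
    return cmd


def run(cmd):
    # check=True raises CalledProcessError on a non-zero exit status.
    # Assumption: the tool reports failure through its exit status.
    return subprocess.run(cmd, check=True)


cmd = build_command("./photos/", "./clean/",
                    fallback_region="br:10,10,500,500", snap=True, denoise="ai")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.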

Third-Party Components

| Component | License | Purpose |
|---|---|---|
| NCNN | BSD-3-Clause | Neural network inference (Vulkan GPU + CPU) |
| FDnCNN weights | MIT | Pre-trained denoising model |
| volk | MIT | Vulkan dynamic loader |

Building from Source

# Clone the repository
git clone https://github.com/allenk/GeminiWatermarkTool.git
cd GeminiWatermarkTool

# Configure with vcpkg
cmake -B build -S . \
-DCMAKE_TOOLCHAIN_FILE=[vcpkg-root]/scripts/buildsystems/vcpkg.cmake \
-DVCPKG_TARGET_TRIPLET=x64-windows-static \
-DSTANDALONE_MODE=ON

# Build
cmake --build build --config Release

Important Disclaimer

⚠️ This tool is provided for personal and educational use.

A few things to keep in mind:

  1. Always backup your original images before processing. The tool can modify files in-place.
  2. Respect content policies. The removal of watermarks may have legal or ethical implications depending on how you use the resulting images.
  3. This tool removes visible watermarks only. It does not affect any invisible/steganographic watermarks that may be embedded in the image data.

If you have feature requests or find bugs, please open an issue on GitHub.

Conclusion

What started as a personal annoyance turned into an interesting technical challenge — one that kept growing. The mathematics behind alpha blending are elegant, and it was deeply satisfying to discover that the same core approach extends from still images all the way to compressed video streams.

If you’re working with Gemini-generated images, give GeminiWatermarkTool a try. If you’re working with Veo-generated videos, VeoWatermarkRemover applies the same precision-first philosophy to video. And if you’re a developer interested in image or video processing, I hope the source code and this write-up provide some useful insights.

GeminiWatermarkTool (images): https://github.com/allenk/GeminiWatermarkTool
VeoWatermarkRemover (video): https://github.com/allenk/VeoWatermarkRemover

Allen Kuo
Graphics / Systems engineer focusing on Android, GPU computing, and deterministic image processing.

GitHub: https://github.com/allenk
LinkedIn: Allen Kuo

If you found this useful, consider giving the repo a ⭐ on GitHub. Questions or feedback? Leave a comment below or reach out on GitHub.

Tags: #AI #Gemini #ImageProcessing #OpenSource #Tools #Programming #CPlusPlus