How to Build Watermark Detection for AI-Generated Images

Some AI image generators embed an invisible watermark in the images they produce. Stable Diffusion’s reference scripts, for example, do this with the open-source invisible-watermark library. Not every generator or third-party interface does so, and these watermarks tolerate only mild processing, so a heavily edited, cropped, or resized image may no longer carry one. When the watermark is intact, though, you can detect it programmatically.

Here’s the fastest way to check a single image for an embedded watermark using the invisible-watermark library:

pip install invisible-watermark opencv-python-headless Pillow numpy scipy

from imwatermark import WatermarkDecoder
import cv2
import numpy as np

img = cv2.imread("suspect_image.png")
if img is None:
    raise FileNotFoundError("Could not read suspect_image.png")

# "SDV2" is the 4-byte (32-bit) payload embedded by the Stable Diffusion 2 scripts.
# The decode method must match the embedding method, so try both frequency methods.
decoder = WatermarkDecoder("bytes", 32)
found = False
for method in ("dwtDct", "dwtDctSvd"):
    watermark_bytes = decoder.decode(img, method)
    decoded_text = watermark_bytes.decode("utf-8", errors="replace")
    print(f"{method}: decoded watermark {decoded_text!r}")
    if "SDV2" in decoded_text:
        found = True

if found:
    print("This image was likely generated by Stable Diffusion.")
else:
    print("No known AI watermark found (or watermark was stripped).")

That covers the quick detection path. Now let’s understand what’s actually happening under the hood and build something more thorough.

How Invisible Watermarking Works

Invisible watermarks don’t modify pixels in ways you can see. Instead, they embed information in the frequency domain of the image. Here’s the basic idea:

1. The image gets transformed from spatial domain (pixels) to frequency domain using a transform like DCT (Discrete Cosine Transform) or DWT (Discrete Wavelet Transform).
2. The watermark data gets encoded into specific frequency coefficients, typically mid-frequency bands where changes are hard to perceive but survive compression.
3. The modified frequency data gets transformed back into pixel space.
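
The three steps above can be sketched on a single 8×8 block. This is a simplified illustration, not the scheme any particular generator uses: the helper names embed_bit and extract_bit, the coefficient position (2, 3), and the quantization step of 24 are all assumptions chosen for the demo. It hides one bit by snapping a mid-frequency DCT coefficient to an even or odd multiple of the step.

```python
import numpy as np
from scipy.fft import dctn, idctn

STEP = 24.0   # quantization step (assumed): larger survives more noise but is more visible
POS = (2, 3)  # a mid-frequency coefficient in an 8x8 block (illustrative choice)


def embed_bit(block: np.ndarray, bit: int) -> np.ndarray:
    """Return a copy of an 8x8 block with one bit hidden in a DCT coefficient."""
    # Step 1: pixels -> frequency domain
    coeffs = dctn(block.astype(np.float64), type=2, norm="ortho")
    # Step 2: snap the coefficient to a multiple of STEP whose parity equals the bit
    q = int(np.round(coeffs[POS] / STEP))
    if q % 2 != bit:
        q += 1
    coeffs[POS] = q * STEP
    # Step 3: frequency domain -> pixels
    return idctn(coeffs, type=2, norm="ortho")


def extract_bit(block: np.ndarray) -> int:
    """Read the bit back from the parity of the quantized coefficient."""
    coeffs = dctn(block.astype(np.float64), type=2, norm="ortho")
    return int(np.round(coeffs[POS] / STEP)) % 2


rng = np.random.default_rng(0)
block = rng.integers(50, 200, size=(8, 8)).astype(np.float64)
for bit in (0, 1):
    marked = np.round(embed_bit(block, bit))  # round as if saving 8-bit pixels
    max_change = float(np.abs(marked - block).max())
    print(f"embedded {bit}, recovered {extract_bit(marked)}, max pixel change {max_change}")
```

A real scheme spreads many bits over many blocks and repeats them for redundancy, but the trade-off is the same: a larger step is more robust and more visible.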

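The invisible-watermark library used by Stable Diffusion offers two such frequency-domain methods. dwtDct applies a Discrete Wavelet Transform and then a DCT on blocks of the wavelet coefficients. dwtDctSvd adds a Singular Value Decomposition step and embeds the payload in the singular values, and this extra layer makes the watermark more tolerant of JPEG compression and light editing. Stable Diffusion’s reference scripts call the library with dwtDct, so it’s worth trying both methods when decoding.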
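
To make the layering concrete, here’s a toy version of the DWT, then DCT, then SVD sequence that hides one bit in a single block. It’s a sketch under simplifying assumptions, not the library’s implementation: it uses a hand-written one-level Haar wavelet, touches only the first 4×4 block of the low-pass band, and the function names and the scale of 36 are made up for the example.

```python
import numpy as np
from scipy.fft import dctn, idctn

SCALE = 36.0  # quantization scale for the singular value (assumed for the demo)


def haar_ll(img: np.ndarray) -> np.ndarray:
    """One-level orthonormal Haar DWT, returning only the low-pass (LL) band."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return (a + b + c + d) / 2.0


def add_ll_delta(img: np.ndarray, delta: np.ndarray) -> np.ndarray:
    """Inverse Haar for a change confined to LL: each pixel in a 2x2 cell gets delta/2."""
    out = img.astype(np.float64).copy()
    for dy in (0, 1):
        for dx in (0, 1):
            out[dy::2, dx::2] += delta / 2.0
    return out


def embed_bit_svd(img: np.ndarray, bit: int) -> np.ndarray:
    img = img.astype(np.float64)
    ll = haar_ll(img)                                   # DWT
    block = ll[:4, :4].copy()
    u, s, vt = np.linalg.svd(dctn(block, norm="ortho")) # DCT, then SVD
    # Move the largest singular value to the lower or upper half of its quantization cell
    s[0] = (s[0] // SCALE + 0.25 + 0.5 * bit) * SCALE
    new_block = idctn(u @ np.diag(s) @ vt, norm="ortho")
    delta = np.zeros_like(ll)
    delta[:4, :4] = new_block - block
    return add_ll_delta(img, delta)


def extract_bit_svd(img: np.ndarray) -> int:
    ll = haar_ll(img.astype(np.float64))
    s = np.linalg.svd(dctn(ll[:4, :4], norm="ortho"), compute_uv=False)
    return int((s[0] % SCALE) > SCALE * 0.5)


rng = np.random.default_rng(1)
image = rng.integers(100, 150, size=(8, 8)).astype(np.float64)
for bit in (0, 1):
    marked = np.round(embed_bit_svd(image, bit))  # round as if saving 8-bit pixels
    print(f"embedded {bit}, recovered {extract_bit_svd(marked)}")
```

The demo relies on the block’s largest singular value staying the largest after the change, which holds for bright, smooth blocks like this one but is not guaranteed in general.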

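The key idea: embedding shifts an image’s frequency statistics slightly, so in principle you can look for a watermark even without knowing the exact payload. In practice the shift is small compared with the variation between ordinary images, so treat any such signal as a hint rather than a verdict.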

Frequency Domain Analysis with DCT

Before reaching for a library, it helps to understand the raw signal. You can analyze an image’s frequency spectrum to spot anomalies that suggest watermark embedding. Watermarked images often show unusual energy patterns in mid-frequency DCT coefficients.

import numpy as np
from PIL import Image
from scipy.fft import dctn

def analyze_frequency_spectrum(image_path: str) -> dict:
    """Analyze DCT frequency spectrum for watermark indicators."""
    img = Image.open(image_path).convert("L")  # Convert to grayscale
    pixels = np.array(img, dtype=np.float64)

    # Apply 2D DCT to the entire image
    dct_coeffs = dctn(pixels, type=2, norm="ortho")

    h, w = dct_coeffs.shape

    # Split into frequency bands
    low_freq = dct_coeffs[:h // 4, :w // 4]
    mid_freq_region = dct_coeffs[h // 4:h // 2, w // 4:w // 2]
    high_freq = dct_coeffs[h // 2:, w // 2:]

    # Calculate energy in each band
    low_energy = np.sum(np.abs(low_freq) ** 2)
    mid_energy = np.sum(np.abs(mid_freq_region) ** 2)
    high_energy = np.sum(np.abs(high_freq) ** 2)
    total_energy = np.sum(np.abs(dct_coeffs) ** 2)

    # Watermarked images tend to have elevated mid-frequency energy
    mid_ratio = mid_energy / total_energy if total_energy > 0 else 0

    # Spread of the mid-band coefficients relative to their mean magnitude.
    # This is a crude dispersion measure, not a true periodicity test.
    mid_std = np.std(mid_freq_region)
    mid_mean = np.mean(np.abs(mid_freq_region))
    regularity_score = mid_std / mid_mean if mid_mean > 0 else 0

    def pct(energy: float) -> float:
        """Share of total energy as a percentage (0 for an all-black image)."""
        return round(float(energy / total_energy * 100), 2) if total_energy > 0 else 0.0

    return {
        "low_energy_pct": pct(low_energy),
        "mid_energy_pct": pct(mid_energy),
        "high_energy_pct": pct(high_energy),
        "mid_regularity": round(float(regularity_score), 4),
        "likely_watermarked": bool(mid_ratio > 0.02 and regularity_score < 1.5),
    }

# Test with an image
result = analyze_frequency_spectrum("suspect_image.png")
for key, value in result.items():
    print(f"  {key}: {value}")

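The likely_watermarked heuristic here is rough, and its thresholds (2% and 1.5) are illustrative rather than calibrated. Natural photographs typically show a steep falloff from low to high frequencies, while a watermark adds a little energy to the mid band because that’s where the payload lives. The regularity score is the standard deviation of the mid-band coefficients divided by their mean magnitude, so a low value means the coefficients have similar magnitudes. Textured or noisy photographs can produce the same readings, so calibrate both thresholds on images whose origin you know.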

This won’t give you the same confidence as decoding a known watermark format, but it works as a first-pass filter when you don’t know which generator produced the image.
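
Because the thresholds are guesses, it helps to calibrate them on images whose origin you already know. The sketch below is one simple way to do that, assuming you’ve collected a score (for example mid_energy_pct) for a set of known-clean and known-watermarked images. The pick_threshold helper and the sample numbers are hypothetical, not measurements.

```python
import numpy as np


def pick_threshold(clean_scores, marked_scores):
    """Return (cutoff, balanced_accuracy) for the rule 'score >= cutoff means watermarked'."""
    clean = np.asarray(clean_scores, dtype=float)
    marked = np.asarray(marked_scores, dtype=float)
    best_cutoff, best_acc = None, -1.0
    # Every observed score is a candidate cutoff
    for cutoff in np.unique(np.concatenate([clean, marked])):
        true_positive_rate = np.mean(marked >= cutoff)
        true_negative_rate = np.mean(clean < cutoff)
        acc = (true_positive_rate + true_negative_rate) / 2
        if acc > best_acc:
            best_cutoff, best_acc = float(cutoff), float(acc)
    return best_cutoff, best_acc


# Made-up scores purely to show the call; substitute your own labeled samples
clean = [0.8, 1.1, 1.9, 2.4]
marked = [2.2, 2.9, 3.3, 4.0]
cutoff, accuracy = pick_threshold(clean, marked)
print(f"cutoff={cutoff}, balanced accuracy={accuracy:.2f}")
```

If the best balanced accuracy on your own samples is close to 0.5, the score carries little information and the heuristic isn’t worth running.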

Encoding and Decoding with imwatermark

The invisible-watermark library (PyPI package: invisible-watermark) is what Stable Diffusion actually uses. It supports multiple encoding methods. Here’s how to encode a watermark into an image and then decode it back:

import cv2
import numpy as np
from imwatermark import WatermarkEncoder, WatermarkDecoder

def encode_watermark(input_path: str, output_path: str, message: str) -> None:
    """Embed an invisible watermark into an image."""
    img = cv2.imread(input_path)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {input_path}")

    encoder = WatermarkEncoder()
    # set_watermark measures the payload itself; only the decoder needs its length in bits
    encoder.set_watermark("bytes", message.encode("utf-8"))
    encoded_img = encoder.encode(img, "dwtDctSvd")

    cv2.imwrite(output_path, encoded_img)
    print(f"Watermark embedded: '{message}' -> {output_path}")


def decode_watermark(image_path: str, expected_length: int) -> str:
    """Attempt to decode a watermark from an image."""
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")

    decoder = WatermarkDecoder("bytes", expected_length)
    watermark_bytes = decoder.decode(img, "dwtDctSvd")

    return watermark_bytes.decode("utf-8", errors="replace")


# Create a test image (solid gradient so we have something to work with)
test_img = np.zeros((512, 512, 3), dtype=np.uint8)
for i in range(512):
    test_img[i, :] = [i // 2, 100, 255 - i // 2]  # Blue-to-red gradient

cv2.imwrite("/tmp/test_original.png", test_img)

# Encode and decode
encode_watermark("/tmp/test_original.png", "/tmp/test_watermarked.png", "SDV2")
decoded = decode_watermark("/tmp/test_watermarked.png", 32)  # 4 chars * 8 bits = 32
print(f"Decoded: {repr(decoded)}")

The expected_length parameter in WatermarkDecoder is the number of bits to extract. Stable Diffusion 2 uses a 32-bit watermark (4 bytes encoding “SDV2”), while the original Stable Diffusion v1 scripts embed “StableDiffusionV1” (17 bytes, 136 bits). If you don’t know the payload length, you’ll need to try several: 32, 64, 128, and 136 are reasonable candidates.
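
Since the decoder wants bits and payloads are usually thought of as strings, a tiny conversion helper avoids off-by-eight mistakes. The payload_bits name is hypothetical, and the second payload below is the string used by the original Stable Diffusion v1 reference scripts, included here as an assumption about what you might want to scan for.

```python
def payload_bits(message: str) -> int:
    """Number of bits the decoder must extract for a UTF-8 text payload."""
    return len(message.encode("utf-8")) * 8


# Candidate payloads to look for (extend with any others you know of)
candidates = ["SDV2", "StableDiffusionV1"]
lengths = {message: payload_bits(message) for message in candidates}
print(lengths)
```

Note that the count is of encoded bytes, not characters, so a payload containing non-ASCII text needs more than 8 bits per character.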

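Of the two frequency-domain methods, dwtDctSvd is the more resilient but also the slower. The library also supports a learned rivaGan method, which works with a fixed 32-bit payload and needs its model loaded first. The decode method must match the one used for embedding. Stable Diffusion’s reference scripts use dwtDct, so when you scan images of unknown origin, try both dwtDct and dwtDctSvd.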

Batch Detection Script

Here’s a practical script that scans a folder of images and flags ones that likely contain AI watermarks. It tries multiple decoding methods and payload lengths, then falls back to frequency analysis.

import sys
import cv2
import numpy as np
from pathlib import Path
from imwatermark import WatermarkDecoder
from PIL import Image
from scipy.fft import dctn

KNOWN_WATERMARKS = ["SDV2", "StableDiffusion", "SDXL"]
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}

def try_decode_watermark(image_path: str) -> dict:
    """Try multiple decoding strategies on a single image."""
    img = cv2.imread(image_path)
    if img is None:
        return {"error": f"Could not read {image_path}"}

    results = {"path": image_path, "watermark_found": False, "method": None, "payload": None}

    # Try different methods and bit lengths (136 bits covers the 17-byte "StableDiffusionV1")
    methods = ["dwtDctSvd", "dwtDct"]
    bit_lengths = [32, 64, 128, 136]

    for method in methods:
        for bits in bit_lengths:
            try:
                decoder = WatermarkDecoder("bytes", bits)
                raw = decoder.decode(img, method)
                text = raw.decode("utf-8", errors="replace")

                # Check for known watermark strings
                for known in KNOWN_WATERMARKS:
                    if known in text:
                        results["watermark_found"] = True
                        results["method"] = method
                        results["payload"] = text.strip("\x00").strip()
                        return results
            except Exception:
                continue

    return results


def frequency_heuristic(image_path: str) -> float:
    """Return a 0-1 score indicating likelihood of watermark presence."""
    try:
        img = Image.open(image_path).convert("L")
        pixels = np.array(img, dtype=np.float64)

        # Resize to standard size for consistent analysis
        if pixels.shape[0] > 512 or pixels.shape[1] > 512:
            img_resized = img.resize((512, 512), Image.LANCZOS)
            pixels = np.array(img_resized, dtype=np.float64)

        dct_coeffs = dctn(pixels, type=2, norm="ortho")
        h, w = dct_coeffs.shape

        mid_freq = dct_coeffs[h // 4:h // 2, w // 4:w // 2]
        total_energy = np.sum(np.abs(dct_coeffs) ** 2)
        mid_energy = np.sum(np.abs(mid_freq) ** 2)

        mid_ratio = mid_energy / total_energy if total_energy > 0 else 0
        mid_std = np.std(mid_freq)
        mid_mean = np.mean(np.abs(mid_freq))
        regularity = mid_std / mid_mean if mid_mean > 0 else float("inf")

        # Score: higher means more likely watermarked
        score = 0.0
        if mid_ratio > 0.02:
            score += 0.4
        if mid_ratio > 0.05:
            score += 0.2
        if regularity < 1.5:
            score += 0.3
        if regularity < 1.0:
            score += 0.1

        return min(score, 1.0)
    except Exception:
        return 0.0


def scan_folder(folder_path: str) -> list:
    """Scan all images in a folder for AI watermarks."""
    folder = Path(folder_path)
    if not folder.is_dir():
        print(f"Error: {folder_path} is not a directory")
        sys.exit(1)

    image_files = [
        f for f in folder.iterdir()
        if f.suffix.lower() in SUPPORTED_EXTENSIONS
    ]

    print(f"Scanning {len(image_files)} images in {folder_path}...\n")
    flagged = []

    for img_path in sorted(image_files):
        str_path = str(img_path)

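        # Step 1: Try direct watermark decoding
        result = try_decode_watermark(str_path)
        if "error" in result:
            # Report unreadable files instead of silently marking them clean
            print(f"  [ SKIP ] {img_path.name}: {result['error']}")
            continue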

        if result.get("watermark_found"):
            result["confidence"] = "HIGH"
            result["freq_score"] = None
            flagged.append(result)
            print(f"  [HIGH]  {img_path.name} — watermark: {result['payload']} ({result['method']})")
            continue

        # Step 2: Frequency heuristic fallback
        freq_score = frequency_heuristic(str_path)
        if freq_score >= 0.5:
            # The heuristic alone is too rough to justify HIGH confidence
            result["confidence"] = "MEDIUM" if freq_score >= 0.7 else "LOW"
            result["freq_score"] = freq_score
            flagged.append(result)
            print(f"  [{result['confidence']:6s}] {img_path.name} — freq score: {freq_score:.2f}")
        else:
            print(f"  [  OK  ] {img_path.name} — no watermark detected")

    print(f"\n--- Summary ---")
    print(f"Total images: {len(image_files)}")
    print(f"Flagged: {len(flagged)}")
    return flagged


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    scan_folder(target)

Save this as detect_watermarks.py and run it against a folder:

python detect_watermarks.py /path/to/images/

The script uses a two-tier approach: it first tries to decode known watermark formats directly, then falls back to frequency analysis for images that might use proprietary or unknown watermarking schemes.
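
One limitation of exact string matching is that a single flipped bit, which lossy re-encoding can cause, turns a real payload into a miss. A possible refinement, sketched here with hypothetical helper names and an arbitrary tolerance of two bits, is to compare the decoded bytes with each known payload by counting differing bits.

```python
def bit_errors(decoded: bytes, expected: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    if len(decoded) != len(expected):
        raise ValueError("payloads must have the same length")
    return sum(bin(a ^ b).count("1") for a, b in zip(decoded, expected))


def matches_payload(decoded: bytes, expected: bytes, max_errors: int = 2) -> bool:
    """True if the decoded bytes are within max_errors flipped bits of the payload."""
    return bit_errors(decoded, expected) <= max_errors


print(bit_errors(b"SDV2", b"SDV2"))       # identical payloads
print(bit_errors(b"SDV3", b"SDV2"))       # last byte differs in one bit
print(matches_payload(b"SDV3", b"SDV2"))  # accepted under the two-bit tolerance
```

To use it, compare the raw bytes returned by decoder.decode with b"SDV2" rather than converting them to text first. A looser tolerance catches more damaged watermarks at the cost of more accidental matches.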

Common Errors and Fixes

ModuleNotFoundError: No module named 'imwatermark'

The PyPI package name differs from the import name. Install it with:

pip install invisible-watermark

Not pip install imwatermark – that’s wrong and will fail.

cv2.error: (-215:Assertion failed) !_src.empty()

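This assertion fires when an empty image reaches an OpenCV function. cv2.imread does not raise on failure: it returns None for wrong paths, corrupted files, and unsupported formats, and the error only appears at the next OpenCV call. Verify the file exists and try opening it with PIL first: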

from PIL import Image
img = Image.open("suspect.png")
print(img.size, img.mode)  # Should print dimensions and color mode

If PIL can open it but OpenCV can’t, convert it:

import cv2
import numpy as np
from PIL import Image

pil_img = Image.open("suspect.webp").convert("RGB")
cv2_img = cv2.cvtColor(np.array(pil_img), cv2.COLOR_RGB2BGR)

UnicodeDecodeError when decoding watermark bytes

The decoded bytes aren’t valid UTF-8, which usually means no watermark exists or you’re using the wrong bit length. Always use errors="replace" in your decode call:

watermark_bytes.decode("utf-8", errors="replace")
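
With errors="replace", undecodable bytes become the U+FFFD replacement character instead of raising, so it’s useful to also record whether the payload looked like clean text at all. The payload_as_text helper below is a hypothetical convenience, not part of the library.

```python
def payload_as_text(raw: bytes):
    """Decode without raising and report whether the bytes look like printable ASCII."""
    text = raw.decode("utf-8", errors="replace")
    looks_like_text = raw.isascii() and text.isprintable()
    return text, looks_like_text


for raw in (b"SDV2", b"\xff\x00\x12\x80"):
    text, ok = payload_as_text(raw)
    print(f"{raw!r} -> {text!r} (plausible text: {ok})")
```

A False flag is a strong hint that the image carries no text watermark at this bit length, or that the method or length is wrong.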

Watermark detected in a real photograph

False positives happen. JPEG compression artifacts, certain camera sensor patterns, and some editing software can create frequency-domain structures that look like watermarks. The frequency heuristic in particular will flag some natural images. Always use the direct decode method as your primary signal and treat frequency analysis as supplementary evidence.
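
A rough calculation shows why the direct decode deserves more trust than the heuristic. Assuming, as a simplification, that an unwatermarked image decodes to uniformly random bits, the chance of hitting a specific payload by accident falls off quickly with payload length. The false_positive_rate helper below is illustrative, and max_errors is a hypothetical tolerance for flipped bits.

```python
from math import comb


def false_positive_rate(n_bits: int, max_errors: int = 0) -> float:
    """Chance that n_bits random bits land within max_errors flips of one fixed payload."""
    close_enough = sum(comb(n_bits, k) for k in range(max_errors + 1))
    return close_enough / 2 ** n_bits


for n_bits, max_errors in [(32, 0), (32, 2), (136, 0)]:
    rate = false_positive_rate(n_bits, max_errors)
    print(f"{n_bits} bits, up to {max_errors} errors: {rate:.3e}")
```

Real decoder output on unwatermarked images is not guaranteed to be uniform, so treat these figures as an order-of-magnitude guide. Checking several payloads, methods, and lengths per image also multiplies the number of chances for an accidental match.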

Script runs slowly on large images

The DWT-DCT-SVD decode is computationally expensive on high-resolution images. Resize before processing if speed matters more than accuracy:

import cv2
from imwatermark import WatermarkDecoder

img = cv2.imread("huge_image.png")
img_resized = cv2.resize(img, (1024, 1024))
decoder = WatermarkDecoder("bytes", 32)
result = decoder.decode(img_resized, "dwtDctSvd")

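Treat this as trading reliability for speed. The decoders read small fixed-size blocks of wavelet coefficients, so resampling can misalign or blur the embedded bits, and forcing a square size also distorts the aspect ratio. Before relying on it, confirm on images you know are watermarked that your chosen size still decodes.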

Where This Fits in a Larger System

Watermark detection is one signal among many. For a production AI-content detection pipeline, combine it with:

Metadata analysis – check EXIF data for generator signatures (some tools leave traces in Software or Comment fields)
Statistical classifiers – train a model on spectral features from known AI-generated vs real images
C2PA content credentials – the newer standard where provenance data is cryptographically signed into the file
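
The metadata check is the cheapest of these to try. Here’s a minimal sketch using Pillow that looks at the EXIF Software tag and at text fields such as PNG text chunks. The generator_hints name and the set of keys it inspects are assumptions for illustration; real generators vary in what they write, and metadata is trivially stripped or forged.

```python
from PIL import Image

EXIF_SOFTWARE = 0x0131  # standard EXIF/TIFF tag ID for "Software"
# Text keys worth a look (an assumed, non-exhaustive set)
TEXT_KEYS = {"software", "comment", "description", "parameters"}


def generator_hints(path: str) -> dict:
    """Collect metadata fields that might name the tool that produced an image."""
    hints = {}
    with Image.open(path) as img:
        software = img.getexif().get(EXIF_SOFTWARE)
        if software:
            hints["exif_software"] = str(software)
        # img.info holds format-specific extras, e.g. PNG text chunks or a JPEG comment
        for key, value in img.info.items():
            if not isinstance(key, str) or key.lower() not in TEXT_KEYS:
                continue
            if isinstance(value, bytes):
                value = value.decode("utf-8", errors="replace")
            if isinstance(value, str):
                hints[f"info_{key}"] = value
    return hints
```

Calling generator_hints("suspect_image.png") returns an empty dict when none of these fields are present, which says nothing either way about the image’s origin.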

The watermark approach has a clear advantage: a decoded payload is close to unambiguous. A random 32-bit string matches “SDV2” with a probability of about one in four billion, so a match needs no threshold tuning. It shows that the image carries the payload Stable Diffusion’s scripts embed, although anyone can embed the same string with the same library. The downside is that watermarks can be stripped by re-encoding, adding noise, or running the image through another model. Treat positive detection as strong evidence, but don’t treat negative detection as proof of authenticity.
