Camera traps generate thousands of images per week, and most of them are empty frames triggered by wind or shadows. Manually sorting through that pile is brutal. YOLOv8 can detect and classify animals in those images automatically, and wrapping it in a FastAPI service lets you process uploads from the field over HTTP.
Here’s a quick taste of what the end result looks like:
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# COCO animal class IDs: 14=bird, 15=cat, 16=dog, 17=horse, 18=sheep,
# 19=cow, 20=elephant, 21=bear, 22=zebra, 23=giraffe
animal_classes = [14, 15, 16, 17, 18, 19, 20, 21, 22, 23]

results = model.predict(source="trap_image_0042.jpg", classes=animal_classes, conf=0.4)

for box in results[0].boxes:
    label = model.names[int(box.cls[0])]
    score = float(box.conf[0])
    print(f"{label}: {score:.2f}")
```
That filters predictions to animal-only classes from COCO and drops anything below 40% confidence. Most empty frames return zero detections, so you immediately know which images are worth looking at.
Filter COCO Animal Classes
The pretrained YOLOv8 model knows 80 COCO classes. Ten of those are animals. Rather than training a custom model, you can get surprisingly far by simply filtering predictions to those ten classes. This works well for wildlife COCO covers directly, like bears, birds, and elephants; species it lacks, such as deer, often still trigger the nearest class ("horse" or "cow").
```python
from ultralytics import YOLO

model = YOLO("yolov8m.pt")  # medium model: better accuracy for smaller animals

# All COCO animal classes
ANIMAL_CLASSES = {
    14: "bird",
    15: "cat",
    16: "dog",
    17: "horse",
    18: "sheep",
    19: "cow",
    20: "elephant",
    21: "bear",
    22: "zebra",
    23: "giraffe",
}

results = model.predict(
    source="trap_images/",  # process an entire directory
    classes=list(ANIMAL_CLASSES.keys()),
    conf=0.35,
    imgsz=1280,  # higher resolution catches smaller animals
    save=True,   # saves annotated images to runs/detect/
)

# Summarize what was found
for result in results:
    filename = result.path
    boxes = result.boxes
    if len(boxes) == 0:
        continue
    species = [model.names[int(b.cls[0])] for b in boxes]
    print(f"{filename}: {', '.join(species)}")
```
Setting imgsz=1280 is important for camera trap images. These are often high-resolution with small animals in the frame. The default 640px resize can shrink a distant deer to just a few pixels, killing detection accuracy.
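A rough back-of-envelope calculation makes the point (the 60 px deer and 4000 px frame here are hypothetical numbers, not from any real trap): YOLO resizes the longest frame edge to `imgsz`, so an object shrinks by the same ratio.

```python
def resized_extent(object_px: int, frame_px: int, imgsz: int) -> int:
    """Approximate pixel size of an object after the frame's longest
    edge is resized to imgsz (hypothetical helper for illustration)."""
    return round(object_px * imgsz / frame_px)

# A 60 px-wide deer in a 4000 px-wide frame:
print(resized_extent(60, 4000, 640))   # 10 px: near the detection floor
print(resized_extent(60, 4000, 1280))  # 19 px: far more detectable
```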
Processing a whole directory at once is faster than looping file by file because Ultralytics batches the inference internally.
Preprocess Camera Trap Images
Camera trap photos have quirks that hurt detection: infrared night shots, motion blur, and metadata stamps burned into the image. Cleaning these up before inference makes a real difference.
```python
import cv2
import numpy as np
from pathlib import Path


def preprocess_trap_image(image_path: str) -> np.ndarray:
    """Load and clean a camera trap image for better detection."""
    img = cv2.imread(image_path)
    if img is None:
        raise ValueError(f"Could not read image: {image_path}")

    height, width = img.shape[:2]

    # Crop out the bottom info bar (timestamps, temperature readings)
    # Most camera traps put this in the bottom 8-10% of the image
    crop_ratio = 0.92
    img = img[: int(height * crop_ratio), :]

    # If grayscale IR image, convert to 3-channel so YOLO doesn't choke
    if len(img.shape) == 2 or img.shape[2] == 1:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)

    # Apply CLAHE for contrast enhancement (helps with dark night shots)
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    lab[:, :, 0] = clahe.apply(lab[:, :, 0])
    img = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
    return img


def batch_preprocess(input_dir: str, output_dir: str) -> list[str]:
    """Preprocess all images in a directory."""
    in_path = Path(input_dir)
    out_path = Path(output_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    processed = []
    for img_file in in_path.glob("*.jpg"):
        try:
            img = preprocess_trap_image(str(img_file))
            dest = str(out_path / img_file.name)
            cv2.imwrite(dest, img)
            processed.append(dest)
        except ValueError as e:
            print(f"Skipping {img_file.name}: {e}")
    return processed
```
The CLAHE step (Contrast Limited Adaptive Histogram Equalization) is the biggest win here. Night-time infrared images often have animals that are nearly invisible against a dark background. CLAHE pulls out enough contrast for YOLO to find them.
Batch Inference with Confidence Filtering
For a real deployment, you want structured output: which files had animals, what species, where in the frame, and how confident the model is. You also want to filter aggressively – camera traps are noisy and false positives waste biologist time.
```python
import json
from dataclasses import dataclass, asdict

from ultralytics import YOLO


@dataclass
class Detection:
    filename: str
    species: str
    confidence: float
    bbox: list[int]


def classify_trap_images(
    image_dir: str,
    model_path: str = "yolov8m.pt",
    min_confidence: float = 0.45,
    min_area: int = 2000,
) -> list[Detection]:
    """Run wildlife detection on a directory of camera trap images.

    Args:
        image_dir: Path to directory of .jpg images.
        model_path: YOLOv8 model weights.
        min_confidence: Minimum confidence threshold.
        min_area: Minimum bounding box area in pixels (filters out tiny false positives).
    """
    model = YOLO(model_path)
    animal_classes = [14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
    results = model.predict(
        source=image_dir,
        classes=animal_classes,
        conf=min_confidence,
        imgsz=1280,
        verbose=False,
    )
    detections = []
    for result in results:
        filename = result.path
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            area = (x2 - x1) * (y2 - y1)
            # Skip tiny detections (usually noise)
            if area < min_area:
                continue
            detections.append(Detection(
                filename=filename,
                species=model.names[int(box.cls[0])],
                confidence=round(float(box.conf[0]), 3),
                bbox=[x1, y1, x2, y2],
            ))
    return detections


# Run it
detections = classify_trap_images("cleaned_images/", min_confidence=0.45)
print(f"Found {len(detections)} animals across all images")

# Save results to JSON
with open("detections.json", "w") as f:
    json.dump([asdict(d) for d in detections], f, indent=2)
```
The min_area filter is critical for camera trap work. YOLO sometimes flags tiny regions of bark or leaf shadow as animals. Requiring a minimum bounding box area of 2000 pixels eliminates most of these without losing real detections.
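If your traps shoot at different resolutions, a fixed pixel threshold behaves inconsistently: 2000 px² is about 0.1% of a 1080p frame but a far smaller fraction of a 4K one. One option (an assumption on my part, not part of the original setup, with the hypothetical helper `min_area_for`) is to scale the threshold with frame area:

```python
def min_area_for(width: int, height: int, fraction: float = 0.001) -> int:
    """Minimum bounding-box area scaled to the frame size.

    The default fraction of 0.001 is chosen so a 1920x1080 frame gets
    roughly the article's 2000 px^2 threshold.
    """
    return round(width * height * fraction)

print(min_area_for(1920, 1080))  # ~2074 px^2 at 1080p
print(min_area_for(3840, 2160))  # ~8294 px^2 at 4K
```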
Serve Predictions with FastAPI
Once the classifier works locally, wrap it in an API so field devices or web dashboards can submit images over HTTP. Use FastAPI’s lifespan context manager to load the model once at startup.
```python
from contextlib import asynccontextmanager

import cv2
import numpy as np
from fastapi import FastAPI, File, UploadFile
from fastapi.responses import JSONResponse
from ultralytics import YOLO

ANIMAL_CLASSES = [14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
MIN_CONFIDENCE = 0.45
MIN_AREA = 2000

ml_models = {}


@asynccontextmanager
async def lifespan(app: FastAPI):
    ml_models["detector"] = YOLO("yolov8m.pt")
    yield
    ml_models.clear()


app = FastAPI(lifespan=lifespan)


def detect_animals(img: np.ndarray) -> list[dict]:
    """Shared detection + filtering logic for both endpoints."""
    model = ml_models["detector"]
    results = model.predict(
        source=img,
        classes=ANIMAL_CLASSES,
        conf=MIN_CONFIDENCE,
        imgsz=1280,
        verbose=False,
    )
    detections = []
    for box in results[0].boxes:
        x1, y1, x2, y2 = map(int, box.xyxy[0])
        if (x2 - x1) * (y2 - y1) < MIN_AREA:
            continue
        detections.append({
            "species": model.names[int(box.cls[0])],
            "confidence": round(float(box.conf[0]), 3),
            "bbox": [x1, y1, x2, y2],
        })
    return detections


@app.post("/classify")
async def classify_image(file: UploadFile = File(...)):
    contents = await file.read()
    img = cv2.imdecode(np.frombuffer(contents, np.uint8), cv2.IMREAD_COLOR)
    if img is None:
        return JSONResponse(
            status_code=400,
            content={"error": "Could not decode image"},
        )
    detections = detect_animals(img)
    return JSONResponse(content={
        "filename": file.filename,
        "animals_detected": len(detections),
        "detections": detections,
    })


@app.post("/classify-batch")
async def classify_batch(files: list[UploadFile] = File(...)):
    all_results = []
    for file in files:
        contents = await file.read()
        img = cv2.imdecode(np.frombuffer(contents, np.uint8), cv2.IMREAD_COLOR)
        if img is None:
            all_results.append({"filename": file.filename, "error": "Could not decode"})
            continue
        detections = detect_animals(img)
        all_results.append({
            "filename": file.filename,
            "animals_detected": len(detections),
            "detections": detections,
        })
    return JSONResponse(content={"results": all_results})
```
Start the server and test it:
```bash
pip install fastapi uvicorn python-multipart ultralytics opencv-python-headless
uvicorn wildlife_api:app --host 0.0.0.0 --port 8000
```
```bash
# Single image
curl -X POST "http://localhost:8000/classify" \
  -F "file=@trap_images/IMG_0042.jpg"

# Batch upload
curl -X POST "http://localhost:8000/classify-batch" \
  -F "files=@trap_images/IMG_0042.jpg" \
  -F "files=@trap_images/IMG_0043.jpg" \
  -F "files=@trap_images/IMG_0044.jpg"
```
The batch endpoint processes multiple images in a single request. This is more efficient than sending one image at a time because it avoids repeated HTTP overhead and lets you upload an entire SD card’s worth of photos in bulk.
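A client that drains a whole directory might look like the sketch below. The `upload_directory` helper, its URL default, and the 16-image batch size are all hypothetical choices, and it assumes the third-party `requests` package is installed; the `chunked` helper just splits the file list so no single request grows unbounded.

```python
from pathlib import Path


def chunked(items: list, size: int) -> list[list]:
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i : i + size] for i in range(0, len(items), size)]


def upload_directory(image_dir: str,
                     url: str = "http://localhost:8000/classify-batch",
                     batch_size: int = 16) -> list[dict]:
    """Send every .jpg in a directory to the batch endpoint, a few per request."""
    import requests  # third-party: pip install requests

    all_results = []
    for batch in chunked(sorted(Path(image_dir).glob("*.jpg")), batch_size):
        files = [("files", (p.name, p.open("rb"), "image/jpeg")) for p in batch]
        resp = requests.post(url, files=files)
        resp.raise_for_status()
        all_results.extend(resp.json()["results"])
    return all_results
```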
Common Errors and Fixes
TypeError: expected str, bytes or os.PathLike object, not ndarray – You’re passing a NumPy array to a function that expects a file path. When using FastAPI uploads, decode bytes with cv2.imdecode() first, then pass the resulting array to model.predict(source=img). The source parameter in Ultralytics accepts both paths and arrays.
Empty detections on night images – Infrared camera trap photos are low-contrast grayscale. The CLAHE preprocessing step described above helps significantly. Also make sure IR images are converted to 3-channel BGR before prediction – YOLOv8 expects 3-channel input.
422 Unprocessable Entity from FastAPI – This usually means the request body format is wrong. For file uploads, use -F "file=@path" in curl, not -d or --data. Make sure python-multipart is installed – FastAPI silently fails without it.
False positives on tree stumps and rocks – COCO-trained models sometimes see “bear” in large dark objects. Increase min_confidence to 0.55 or add the min_area filter. For serious deployments, fine-tune on actual camera trap data from datasets like Caltech Camera Traps or Snapshot Serengeti.
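If you already have a detections.json from an earlier run, you can tighten the thresholds after the fact without re-running inference. A minimal sketch (the `refilter` helper is hypothetical; it assumes the detection dicts carry the `confidence` and `bbox` keys produced earlier):

```python
def refilter(detections: list[dict], min_conf: float = 0.55,
             min_area: int = 2000) -> list[dict]:
    """Re-apply stricter confidence and area thresholds to saved detections."""
    kept = []
    for d in detections:
        x1, y1, x2, y2 = d["bbox"]
        if d["confidence"] >= min_conf and (x2 - x1) * (y2 - y1) >= min_area:
            kept.append(d)
    return kept


dets = [
    {"species": "bear", "confidence": 0.61, "bbox": [0, 0, 100, 100]},
    {"species": "bear", "confidence": 0.48, "bbox": [0, 0, 100, 100]},  # too weak
    {"species": "bird", "confidence": 0.90, "bbox": [0, 0, 10, 10]},    # too small
]
print(len(refilter(dets)))  # 1
```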
CUDA out of memory with imgsz=1280 – Higher resolution needs more GPU memory. Drop to imgsz=640 or switch to yolov8n.pt (nano). If you’re on CPU only, yolov8n.pt at 640px is the sweet spot for speed vs. accuracy.
Model downloads hang or fail – Ultralytics downloads weights on first use. If you’re deploying to a server without internet, download the weights locally first: yolo detect predict model=yolov8m.pt source=test.jpg on a machine with internet, then copy the .pt file to your server.