Vehicle counting from video boils down to three problems: detect vehicles in each frame, track them across frames, and decide when one crosses a line. YOLOv8 handles detection. A simple centroid tracker handles persistence across frames. A line-crossing test handles the count.

Here’s the install:

pip install ultralytics opencv-python numpy

And a minimal detection loop to make sure everything works:

from ultralytics import YOLO
import cv2

model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture("traffic.mp4")

ret, frame = cap.read()
if ret:
    results = model(frame, classes=[2, 3, 5, 7])  # car, motorcycle, bus, truck
    print(f"Detected {len(results[0].boxes)} vehicles in first frame")

cap.release()

The classes parameter filters COCO class IDs so you only get vehicles. Class 2 is car, 3 is motorcycle, 5 is bus, 7 is truck. This keeps the pipeline focused and avoids false triggers from pedestrians or traffic signs.
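For reference, the mapping can live in a small dict. A sketch (VEHICLE_CLASSES and keep_vehicle are illustrative names, not part of the Ultralytics API):

```python
# COCO class-ID -> label mapping for the vehicle classes used above
VEHICLE_CLASSES = {2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}

def keep_vehicle(class_id):
    """Illustrative helper: True if a class ID belongs to a vehicle class."""
    return class_id in VEHICLE_CLASSES

print(keep_vehicle(2))  # True: car
print(keep_vehicle(0))  # False: person, filtered out
```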

Centroid Tracker

Object detection gives you bounding boxes per frame, but it doesn’t tell you which box in frame N corresponds to which box in frame N+1. You need a tracker for that.

A centroid tracker works by computing the center point of each bounding box, then matching centroids between frames using Euclidean distance. When a centroid disappears for too many frames, the object is deregistered.

This implementation is lightweight and handles moderate traffic well. For heavy occlusion or very crowded scenes, you’d want DeepSORT or ByteTrack instead.
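Before reading the full class, the matching step at its core can be sketched in isolation. The coordinates here are toy values, not taken from any real frame:

```python
import numpy as np

# Two tracked centroids from the previous frame,
# three detections in the current frame
old = np.array([[100, 200], [400, 220]])
new = np.array([[110, 205], [600, 500], [405, 218]])

# Pairwise Euclidean distances: rows = old objects, cols = new detections
dists = np.linalg.norm(old[:, np.newaxis] - new[np.newaxis, :], axis=2)

# Greedy matching: pair each old object with its closest detection,
# processing the smallest distances first
rows = dists.min(axis=1).argsort()
cols = dists.argmin(axis=1)[rows]
for r, c in zip(rows, cols):
    print(f"object {r} -> detection {c} (dist {dists[r, c]:.1f})")
```

Detection 1 at (600, 500) matches nothing nearby, so in the tracker it would be registered as a new object.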

import numpy as np
from collections import OrderedDict

class CentroidTracker:
    def __init__(self, max_disappeared=30):
        self.next_id = 0
        self.objects = OrderedDict()    # id -> centroid (cx, cy)
        self.disappeared = OrderedDict()  # id -> frames since last seen
        self.max_disappeared = max_disappeared

    def register(self, centroid):
        self.objects[self.next_id] = centroid
        self.disappeared[self.next_id] = 0
        self.next_id += 1

    def deregister(self, object_id):
        del self.objects[object_id]
        del self.disappeared[object_id]

    def update(self, bboxes):
        # bboxes: list of (x1, y1, x2, y2)
        if len(bboxes) == 0:
            for object_id in list(self.disappeared.keys()):
                self.disappeared[object_id] += 1
                if self.disappeared[object_id] > self.max_disappeared:
                    self.deregister(object_id)
            return self.objects

        centroids = []
        for (x1, y1, x2, y2) in bboxes:
            cx = int((x1 + x2) / 2)
            cy = int((y1 + y2) / 2)
            centroids.append((cx, cy))

        if len(self.objects) == 0:
            for c in centroids:
                self.register(c)
            return self.objects

        object_ids = list(self.objects.keys())
        object_centroids = list(self.objects.values())

        # Distance matrix between existing objects and new detections
        obj_arr = np.array(object_centroids)
        det_arr = np.array(centroids)
        dists = np.linalg.norm(obj_arr[:, np.newaxis] - det_arr[np.newaxis, :], axis=2)

        # Greedy matching: closest pairs first
        rows = dists.min(axis=1).argsort()
        cols = dists.argmin(axis=1)[rows]

        used_rows = set()
        used_cols = set()

        for (row, col) in zip(rows, cols):
            if row in used_rows or col in used_cols:
                continue
            if dists[row, col] > 80:  # max pixel distance threshold
                continue
            object_id = object_ids[row]
            self.objects[object_id] = centroids[col]
            self.disappeared[object_id] = 0
            used_rows.add(row)
            used_cols.add(col)

        unused_rows = set(range(len(object_centroids))) - used_rows
        unused_cols = set(range(len(centroids))) - used_cols

        for row in unused_rows:
            object_id = object_ids[row]
            self.disappeared[object_id] += 1
            if self.disappeared[object_id] > self.max_disappeared:
                self.deregister(object_id)

        for col in unused_cols:
            self.register(centroids[col])

        return self.objects

The max_disappeared parameter controls how many frames a vehicle can go undetected before the tracker drops it. Set it higher (50-60) for choppy video or low-FPS sources. The distance threshold of 80 pixels works for 1080p traffic footage – scale it down for lower resolutions.
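The scaling advice can be made concrete with a rough heuristic, assuming the threshold scales linearly with frame height (scaled_threshold is an illustrative helper, not a tuned formula):

```python
def scaled_threshold(frame_height, ref_threshold=80, ref_height=1080):
    """Scale the matching threshold linearly with frame height.

    Rough heuristic: assumes apparent frame-to-frame vehicle motion
    in pixels shrinks proportionally at lower resolutions.
    """
    return max(1, int(ref_threshold * frame_height / ref_height))

print(scaled_threshold(1080))  # 80
print(scaled_threshold(720))   # 53
print(scaled_threshold(480))   # 35
```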

Drawing the Counting Line

The counting line is a horizontal (or angled) line drawn across the road. Every time a tracked centroid crosses from one side to the other, you increment the count.

Pick the line position based on your camera angle. For a typical overhead or angled traffic camera, a horizontal line at about 60% of the frame height works well.

def crosses_line(prev_cy, curr_cy, line_y):
    """Check if a centroid crossed the line between frames.
    Returns 'down' if crossed downward, 'up' if upward, None otherwise.
    """
    if prev_cy < line_y and curr_cy >= line_y:
        return "down"
    if prev_cy >= line_y and curr_cy < line_y:
        return "up"
    return None

This function compares the previous and current y-coordinate of a centroid against the line’s y-position and tells you which direction the vehicle crossed. Remember that image y-coordinates grow downward. For a typical camera looking down the road, with the vanishing point near the top of the frame, “down” usually means a vehicle approaching the camera and “up” means one moving away.
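A quick sanity check on simulated positions; the function is repeated here so the snippet runs standalone:

```python
def crosses_line(prev_cy, curr_cy, line_y):
    if prev_cy < line_y and curr_cy >= line_y:
        return "down"
    if prev_cy >= line_y and curr_cy < line_y:
        return "up"
    return None

# Simulated centroid y-positions of one vehicle moving down the frame
path = [300, 350, 390, 430, 470]
line_y = 400
for prev_cy, curr_cy in zip(path, path[1:]):
    direction = crosses_line(prev_cy, curr_cy, line_y)
    if direction:
        print(f"crossed {direction} between y={prev_cy} and y={curr_cy}")
```

Only the 390 → 430 step triggers a crossing; the steps before and after stay on one side of the line.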

Full Pipeline

Here’s the complete pipeline that reads a video, detects vehicles, tracks them, counts line crossings, and writes an annotated output video.

from ultralytics import YOLO
import cv2
import json

# --- Setup ---
model = YOLO("yolov8n.pt")
tracker = CentroidTracker(max_disappeared=40)
cap = cv2.VideoCapture("traffic.mp4")

width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

fourcc = cv2.VideoWriter_fourcc(*"mp4v")
out = cv2.VideoWriter("output_counted.mp4", fourcc, fps, (width, height))

line_y = int(height * 0.6)  # counting line at 60% of frame height
count_up = 0
count_down = 0
prev_centroids = {}  # object_id -> previous cy
counted_ids = set()  # don't double-count

# COCO class IDs for vehicles
vehicle_classes = [2, 3, 5, 7]  # car, motorcycle, bus, truck

frame_num = 0

while True:
    ret, frame = cap.read()
    if not ret:
        break

    frame_num += 1

    # --- Detect ---
    results = model(frame, classes=vehicle_classes, verbose=False)
    boxes = []
    for box in results[0].boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        boxes.append((x1, y1, x2, y2))

    # --- Track ---
    objects = tracker.update(boxes)

    # --- Count crossings ---
    for object_id, (cx, cy) in objects.items():
        if object_id in prev_centroids and object_id not in counted_ids:
            prev_cy = prev_centroids[object_id]
            direction = crosses_line(prev_cy, cy, line_y)
            if direction == "down":
                count_down += 1
                counted_ids.add(object_id)
            elif direction == "up":
                count_up += 1
                counted_ids.add(object_id)

        prev_centroids[object_id] = cy

        # Draw centroid and ID
        cv2.circle(frame, (cx, cy), 5, (0, 255, 0), -1)
        cv2.putText(frame, f"ID:{object_id}", (cx - 20, cy - 15),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # --- Draw counting line and stats ---
    cv2.line(frame, (0, line_y), (width, line_y), (0, 0, 255), 2)
    cv2.putText(frame, f"Up: {count_up}  Down: {count_down}",
                (10, 40), cv2.FONT_HERSHEY_SIMPLEX, 1.2, (255, 255, 255), 3)
    cv2.putText(frame, f"Frame: {frame_num}", (10, 80),
                cv2.FONT_HERSHEY_SIMPLEX, 0.7, (200, 200, 200), 2)

    out.write(frame)

cap.release()
out.release()

# --- Save counts ---
counts = {
    "up": count_up,
    "down": count_down,
    "total": count_up + count_down,
    "frames_processed": frame_num
}
with open("vehicle_counts.json", "w") as f:
    json.dump(counts, f, indent=2)

print(f"Done. Total vehicles: {counts['total']} (Up: {count_up}, Down: {count_down})")
print("Annotated video saved to output_counted.mp4")
print("Counts saved to vehicle_counts.json")

A few things worth noting:

  • counted_ids prevents double-counting. Once a vehicle crosses the line, its ID goes into the set and won’t trigger again even if the centroid wobbles near the line.
  • verbose=False on the YOLO call suppresses per-frame logging, which would flood your terminal on a long video.
  • The output video shows only centroids, IDs, and counts – no bounding boxes. If you want boxes too, draw them from the boxes list with cv2.rectangle.

For a 1080p video at 30 FPS, the yolov8n model processes around 40-80 FPS on a decent GPU. On CPU, expect 5-15 FPS. If speed matters more than accuracy, stick with yolov8n.pt. If you need better detection on small or distant vehicles, use yolov8s.pt or yolov8m.pt at the cost of throughput.

Common Errors and Fixes

ModuleNotFoundError: No module named 'ultralytics'

You need to install the Ultralytics package. Run pip install ultralytics. If you’re in a conda environment, make sure you’re installing into the right one.

Video file won’t open / ret is always False

OpenCV needs the right codec to read your video. Install system-level ffmpeg:

# Ubuntu/Debian
sudo apt install ffmpeg

# macOS
brew install ffmpeg

Then reinstall opencv-python: pip install --force-reinstall opencv-python.

Counts are way too high (double/triple counting)

The distance threshold in the centroid tracker is likely too low for how far vehicles move between frames, so each track keeps dropping and re-registering as a new object – and every new ID can cross the line and be counted again. Raise the threshold above 80 (try 120-150 for fast traffic), or increase max_disappeared so tracks survive brief detection gaps instead of re-registering.

Counts are too low (missing vehicles)

The YOLO confidence threshold might be too high. By default it’s 0.25, which is reasonable, but for distant or partially occluded vehicles you can lower it:

results = model(frame, classes=vehicle_classes, conf=0.15, verbose=False)

Be careful going below 0.1 – you’ll start picking up false positives.
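The effect of the threshold can be shown with a toy filter. The confidence values below are made up, and YOLO does this filtering internally – the snippet just mirrors the logic:

```python
# Made-up (confidence, class_id) detections for illustration
detections = [(0.92, 2), (0.41, 7), (0.18, 2), (0.08, 2)]
VEHICLES = {2, 3, 5, 7}

def keep(dets, conf):
    """Mirror of the confidence + class filtering YOLO applies internally."""
    return [d for d in dets if d[0] >= conf and d[1] in VEHICLES]

print(len(keep(detections, 0.25)))  # 2 detections survive the default threshold
print(len(keep(detections, 0.15)))  # 3: lowering conf keeps the 0.18 detection
```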

Tracker IDs keep resetting / jumping

This happens when max_disappeared is too low for your video’s FPS. If a vehicle is briefly occluded (behind a pole, another vehicle), the tracker deregisters it and assigns a new ID when it reappears. Bump max_disappeared to 50 or higher.

Output video is 0 bytes or won’t play

Make sure you call out.release() after the loop. Also, mp4v codec sometimes produces files that certain players can’t handle. Try XVID as an alternative:

fourcc = cv2.VideoWriter_fourcc(*"XVID")
out = cv2.VideoWriter("output_counted.avi", fourcc, fps, (width, height))