Strip the Background in Three Lines

The fastest path from “photo with background” to “photo without background” is rembg. It wraps the U2-Net model and handles all the preprocessing you would otherwise do manually.

pip install rembg[gpu] pillow
from rembg import remove
from PIL import Image

img = Image.open("product.jpg")
result = remove(img)
result.save("product_no_bg.png")

That gives you a PNG with a transparent background. The remove() function runs U2-Net under the hood, which is specifically trained for salient object detection – it figures out what the “main thing” in the image is and cuts everything else away. For product photos, headshots, and clean compositions, it works surprisingly well out of the box.
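
Before shipping the result anywhere, it is worth a quick sanity check that the output actually carries transparency. The sketch below inspects the alpha channel; it uses a synthetic RGBA image standing in for a rembg result, so it runs without rembg installed:

```python
from PIL import Image

# Stand-in for a rembg result: RGBA, mostly transparent with one opaque pixel
result = Image.new("RGBA", (2, 2), (0, 0, 0, 0))
result.putpixel((0, 0), (255, 0, 0, 255))

lo, hi = result.getchannel("A").getextrema()
assert lo < 255  # any alpha value below 255 means real transparency is present
```

If `lo` comes back as 255, the image is fully opaque and the removal step silently did nothing (for example because the input was re-encoded to RGB somewhere along the way).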

If you want a white background instead of transparent:

from rembg import remove
from PIL import Image

img = Image.open("product.jpg")
no_bg = remove(img)

# Composite onto a white background
white_bg = Image.new("RGBA", no_bg.size, (255, 255, 255, 255))
white_bg.paste(no_bg, mask=no_bg.split()[3])
white_bg.convert("RGB").save("product_white_bg.jpg")

Pick Your Model

rembg ships with several models. The default (u2net) handles most cases, but you can swap models depending on the use case:

| Model             | Best For                  | Speed  |
|-------------------|---------------------------|--------|
| u2net             | General purpose (default) | Medium |
| u2net_human_seg   | People / portraits        | Medium |
| isnet-general-use | Higher quality edges      | Slower |
| silueta           | Lightweight, fast         | Fast   |

To use a non-default model, create a session and pass it to remove():

from rembg import remove, new_session

session = new_session("u2net_human_seg")
result = remove(img, session=session)

The u2net_human_seg model is noticeably better at hair detail when you are working with portrait photos. For e-commerce product shots where speed matters more than edge perfection, silueta is the move.
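
If you switch models per job, centralizing the choice in one place keeps call sites clean. This tiny helper is just a convenience wrapper around the table above; the mapping is a suggestion of mine, not part of the rembg API:

```python
def pick_model(use_case: str) -> str:
    """Map a use case to a rembg model name (names from the table above)."""
    table = {
        "general": "u2net",             # balanced default
        "portrait": "u2net_human_seg",  # better hair/edge detail on people
        "fine_edges": "isnet-general-use",
        "fast": "silueta",              # lightweight, coarser edges
    }
    return table.get(use_case, "u2net")

# session = new_session(pick_model("portrait"))
```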

Precise Segmentation with SAM

When rembg struggles – complex scenes, multiple objects, or cases where you need to keep a specific subject – use Meta’s Segment Anything Model to generate a more precise mask.

First, install SAM 2 and download the checkpoint:

git clone https://github.com/facebookresearch/sam2.git
pip install -e ./sam2
mkdir -p checkpoints
wget -P checkpoints https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt
import numpy as np
import torch
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

checkpoint = "./checkpoints/sam2.1_hiera_large.pt"
model_cfg = "configs/sam2.1/sam2.1_hiera_l.yaml"
predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint))

image = np.array(Image.open("scene.jpg"))

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(image)

    # Click on the subject you want to keep (replace with your subject's pixel coordinates)
    point_coords = np.array([[400, 300]])
    point_labels = np.array([1])

    masks, scores, _ = predictor.predict(
        point_coords=point_coords,
        point_labels=point_labels,
        multimask_output=True,
    )

best_mask = masks[np.argmax(scores)]

# Apply the mask to extract the subject
pil_image = Image.open("scene.jpg").convert("RGBA")
mask_image = Image.fromarray((best_mask * 255).astype(np.uint8))
pil_image.putalpha(mask_image)
pil_image.save("subject_extracted.png")

SAM gives you pixel-level control. You click on the object, it returns a mask, and you apply that mask as an alpha channel. The result is a cutout you can composite onto anything.
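
The mask-to-alpha step is independent of SAM itself. Here is the same mechanic on a tiny synthetic mask, so you can verify it without downloading a checkpoint; the 4x4 boolean array stands in for `masks[np.argmax(scores)]`:

```python
import numpy as np
from PIL import Image

# Synthetic 4x4 boolean mask standing in for a SAM output
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

img = Image.new("RGBA", (4, 4), (200, 100, 50, 255))
img.putalpha(Image.fromarray((mask * 255).astype(np.uint8)))

# Inside the mask: opaque; outside: fully transparent
assert img.getpixel((1, 1))[3] == 255
assert img.getpixel((0, 0))[3] == 0
```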

Replace the Background with Stable Diffusion Inpainting

Removing the background is half the job. Replacing it with a generated scene is where things get interesting. Instead of compositing onto a stock photo, you can use inpainting to paint a new background that matches the lighting and style of the subject.

The workflow: take the foreground mask, invert it (so the background becomes the editable region), and let the inpainting model fill it in.
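
The inversion itself is a single call. A minimal sketch with a synthetic alpha channel (white square = subject) shows exactly what the inpainting pipeline will receive:

```python
from PIL import Image, ImageOps

# Synthetic foreground alpha: subject is a white square on black background
alpha = Image.new("L", (8, 8), 0)
alpha.paste(255, (2, 2, 6, 6))

# After inversion, white marks the background -- the region to regenerate
bg_mask = ImageOps.invert(alpha)

assert bg_mask.getpixel((0, 0)) == 255  # background: editable
assert bg_mask.getpixel((3, 3)) == 0    # subject: preserved
```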

import torch
from diffusers import AutoPipelineForInpainting
from PIL import Image, ImageOps
from rembg import remove

# Step 1: Get the foreground mask from rembg
original = Image.open("portrait.jpg").convert("RGB").resize((512, 512))
no_bg = remove(original)
alpha = no_bg.split()[3]

# Step 2: Invert the mask — white = background (area to regenerate)
bg_mask = ImageOps.invert(alpha).convert("RGB")

# Step 3: Load inpainting pipeline
pipe = AutoPipelineForInpainting.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.enable_model_cpu_offload()

# Step 4: Generate a new background
result = pipe(
    prompt="a tropical beach at sunset, warm golden light, soft waves",
    negative_prompt="blurry, low quality, distorted",
    image=original,
    mask_image=bg_mask,
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]

result.save("portrait_beach_bg.png")

The inpainting model only touches the masked (white) area, so the subject stays intact. The key is getting the mask right – rembg produces a clean foreground alpha, and inverting it gives the inpainting model the exact region to fill.

Batch Processing Product Photos

If you have a folder of product images that all need clean backgrounds, automate the whole thing:

from pathlib import Path
from rembg import remove, new_session
from PIL import Image

input_dir = Path("raw_products")
output_dir = Path("clean_products")
output_dir.mkdir(exist_ok=True)

# Reuse the session to avoid reloading the model per image
session = new_session("u2net")

for img_path in input_dir.glob("*.jpg"):
    img = Image.open(img_path)
    result = remove(img, session=session)

    # White background for e-commerce
    white = Image.new("RGBA", result.size, (255, 255, 255, 255))
    white.paste(result, mask=result.split()[3])
    white.convert("RGB").save(output_dir / f"{img_path.stem}.jpg", quality=95)
    print(f"Processed {img_path.name}")

The critical detail here is reusing the new_session() across all images. Without it, rembg reloads the U2-Net weights for every single image, which kills throughput. With the session cached, you are looking at roughly 1-3 seconds per image on a GPU, depending on resolution.

Building a Background Removal API

Wrapping this in a FastAPI endpoint turns it into a service any frontend or pipeline can call:

from fastapi import FastAPI, UploadFile
from fastapi.responses import StreamingResponse
from rembg import remove, new_session
from PIL import Image
import io

app = FastAPI()
session = new_session("u2net")

@app.post("/remove-background")
async def remove_background(file: UploadFile):
    img = Image.open(file.file)
    result = remove(img, session=session)

    buf = io.BytesIO()
    result.save(buf, format="PNG")
    buf.seek(0)

    return StreamingResponse(buf, media_type="image/png")

Run the server with uvicorn:

uvicorn app:app --host 0.0.0.0 --port 8000

Hit it with curl:

curl -X POST http://localhost:8000/remove-background \
  -F "[email protected]" \
  --output result.png

The session is initialized once at module load, so every request reuses the same model. For production, add input validation (file size limits, format checks) and consider running behind Gunicorn with multiple workers if you need to handle concurrent requests.
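
Input validation can live in a plain function so it is testable outside the endpoint. The limits below are illustrative assumptions for a sketch, not FastAPI defaults:

```python
from typing import Optional

ALLOWED_TYPES = {"image/jpeg", "image/png", "image/webp"}
MAX_BYTES = 10 * 1024 * 1024  # 10 MB cap -- tune for your workload

def validate_upload(content_type: str, size: int) -> Optional[str]:
    """Return an error message, or None if the upload is acceptable."""
    if content_type not in ALLOWED_TYPES:
        return f"unsupported content type: {content_type}"
    if size > MAX_BYTES:
        return f"file too large: {size} bytes (max {MAX_BYTES})"
    return None

# In the endpoint: if (err := validate_upload(...)): raise HTTPException(400, err)
```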

Common Errors and Fixes

RuntimeError: CUDA out of memory – This happens most often with the inpainting pipeline. Use pipe.enable_model_cpu_offload() instead of pipe.to("cuda"). It moves layers to GPU only when needed and frees them afterward. If you are still hitting limits, resize your input images to 512x512 before processing.
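
For the resize, a helper that never upscales and keeps the aspect ratio is a gentler default than a blind 512x512 stretch. This sketch also snaps both sides to multiples of 8, which Stable Diffusion requires; `shrink_for_sd` is a hypothetical name of mine:

```python
from PIL import Image

def shrink_for_sd(img: Image.Image, max_side: int = 512) -> Image.Image:
    """Downscale so the longer side is at most max_side, snapping
    both sides to multiples of 8 (a Stable Diffusion requirement)."""
    scale = min(1.0, max_side / max(img.size))
    w, h = (max(8, int(d * scale) // 8 * 8) for d in img.size)
    return img.resize((w, h), Image.LANCZOS)

small = shrink_for_sd(Image.new("RGB", (2048, 1536)))
assert max(small.size) <= 512 and small.size[0] % 8 == 0
```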

rembg produces jagged edges around hair – Switch to the isnet-general-use model. It handles fine details better than the default U2-Net. You can also pass post_process_mask=True to remove() to smooth the mask edges:

result = remove(img, session=session, post_process_mask=True)

Inpainted background has visible seams at the mask boundary – Increase the mask slightly by dilating it before passing to the inpainting pipeline. A few pixels of overlap gives the model room to blend:

from PIL import ImageFilter

bg_mask = ImageOps.invert(alpha)
bg_mask = bg_mask.filter(ImageFilter.MaxFilter(size=7))  # Dilate by ~3px

ModuleNotFoundError: No module named 'rembg' after install – This usually means pip installed into a different Python environment than the one running your script. Install with python -m pip install "rembg[gpu]" so the package lands next to the interpreter you actually use. Note that the [gpu] extra only controls CUDA acceleration, not whether the module imports: on machines without a GPU, plain pip install rembg works fine but runs slower.

SAM returns a mask that includes unwanted objects – Add negative point prompts to exclude regions. Set point_labels to 0 for points on objects you want excluded from the mask. Combining a positive point on your subject with negative points on nearby objects dramatically improves accuracy.
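
In code, negative prompts are just extra rows in the two arrays you already pass to predict(); the coordinates below are placeholders for your own clicks:

```python
import numpy as np

# One positive click on the subject, one negative click on a distractor
point_coords = np.array([[400, 300],    # keep this object
                         [120, 520]])   # exclude this one
point_labels = np.array([1, 0])         # 1 = include, 0 = exclude

# masks, scores, _ = predictor.predict(
#     point_coords=point_coords, point_labels=point_labels, multimask_output=True)
assert point_coords.shape == (2, 2) and point_labels.shape == (2,)
```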

Batch processing hangs or gets progressively slower – You are likely accumulating GPU memory. Call torch.cuda.empty_cache() between batches, and make sure you are not storing references to old tensors. Processing in chunks of 20-50 images with a cache clear between chunks keeps memory stable.
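
The chunking pattern itself is plain Python; the chunk size and the cache-clear call between chunks are the knobs to tune. A sketch, with the torch call commented out so it runs anywhere:

```python
def chunked(items, size):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# for batch in chunked(image_paths, 32):
#     for path in batch:
#         process(path)             # your per-image work
#     torch.cuda.empty_cache()      # release cached GPU memory between chunks

assert list(chunked([1, 2, 3, 4, 5], 2)) == [[1, 2], [3, 4], [5]]
```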