Stability AI gives you hosted access to Stable Diffusion models through a straightforward REST API. You send a prompt, get an image back. No GPU provisioning, no model weight downloads, no CUDA driver headaches. The v2beta API supports text-to-image, image-to-image, and upscaling – all through multipart/form-data POST requests.
Here’s the fastest path to generating your first image:
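A minimal quickstart, assuming `requests` is installed (the output filename is my choice):

```python
import os
import requests

API_URL = "https://api.stability.ai/v2beta/stable-image/generate/core"

def generate(prompt: str, out_path: str = "output.png") -> str:
    """Send a text-to-image request and write the raw image bytes to disk."""
    response = requests.post(
        API_URL,
        headers={
            "authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
            "accept": "image/*",  # return raw bytes instead of base64 JSON
        },
        files={"none": ""},  # forces multipart/form-data even with no file upload
        data={"prompt": prompt},
    )
    response.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(response.content)
    return out_path

if __name__ == "__main__":
    generate("a red sports car on a mountain road at sunset")
```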
Set your API key as an environment variable (export STABILITY_API_KEY=sk-your-key-here) and run that script. You’ll have a generated image on disk in a few seconds.
Authentication and Setup
Sign up at platform.stability.ai and grab your API key from the dashboard. New accounts get 25 free credits. Every request uses credits based on the endpoint – Stable Image Core costs roughly $0.03 per image, Stable Image Ultra costs about $0.08.
Every request to the API needs two headers:
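In Python, that header pair looks like this (falling back to a placeholder key if the environment variable is unset):

```python
import os

api_key = os.environ.get("STABILITY_API_KEY", "sk-placeholder")

headers = {
    "authorization": f"Bearer {api_key}",  # the Bearer prefix is required
    "accept": "image/*",  # or "application/json" for a base64 response
}
```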
The accept header controls the response format. Set it to image/* and you get raw image bytes back. Set it to application/json and you get a JSON response with a base64-encoded image string. I recommend image/* for simplicity – you write the response content directly to a file and you’re done.
The API uses multipart/form-data for all requests, even text-to-image where you’re not uploading a file. That’s why you’ll see files={"none": ""} in the requests – it forces the requests library to encode the payload as multipart form-data.
Text-to-Image Generation
The v2beta API offers three text-to-image endpoints, each backed by a different model tier:
| Endpoint | Model | Best For | Cost |
|---|---|---|---|
| `/v2beta/stable-image/generate/core` | Stable Image Core | Fast drafts, prototyping | ~$0.03 |
| `/v2beta/stable-image/generate/sd3` | SD3.5 | High quality, prompt adherence | ~$0.035 |
| `/v2beta/stable-image/generate/ultra` | Stable Image Ultra | Maximum quality | ~$0.08 |
My recommendation: start with core for iteration, switch to sd3 or ultra for final outputs.
Here’s a more complete text-to-image example with all the useful parameters:
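A sketch against the Core endpoint with the full parameter set (the prompt text and seed are arbitrary choices of mine):

```python
import os
import requests

# Every field except `prompt` is optional.
payload = {
    "prompt": "a red sports car on a mountain road at sunset, golden hour",
    "negative_prompt": "blurry, low quality, watermark, text",
    "aspect_ratio": "16:9",
    "seed": 42,                      # fixed seed -> reproducible output
    "output_format": "png",
    "style_preset": "photographic",  # Core/Ultra only
}

def main() -> None:
    response = requests.post(
        "https://api.stability.ai/v2beta/stable-image/generate/core",
        headers={
            "authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
            "accept": "image/*",
        },
        files={"none": ""},  # force multipart/form-data
        data=payload,
    )
    response.raise_for_status()
    with open("sports_car.png", "wb") as f:
        f.write(response.content)

if __name__ == "__main__":
    main()
```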
Key Parameters
- prompt (required): Your text description. Be specific – “a red sports car on a mountain road at sunset” works better than “car.”
- negative_prompt (optional): Things you want excluded. Helps avoid common artifacts.
- aspect_ratio: Options include `1:1`, `16:9`, `9:16`, `2:3`, `3:2`, `4:5`, `5:4`, `21:9`, `9:21`. No arbitrary pixel dimensions – you pick a ratio.
- output_format: `png`, `jpeg`, or `webp`.
- seed: Fix this for reproducible results. Same seed + same prompt = same image.
- style_preset (Core/Ultra only): Values like `photographic`, `anime`, `cinematic`, `digital-art`, `comic-book`, and more. Useful when you want a consistent style without crafting a complex prompt.
Image-to-Image Generation
Image-to-image lets you send an existing image along with a prompt to generate variations or restyle it. The SD3 endpoint supports this by setting the mode parameter to image-to-image and providing an image file and a strength value.
The strength parameter controls how much the output diverges from the input. At 0.0, the result is nearly identical to your input. At 1.0, the model ignores the input almost entirely. Values between 0.3 and 0.6 tend to produce the best results for style transfer.
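A sketch of an image-to-image call against the SD3 endpoint (the function name and default strength are my choices):

```python
import os
import requests

def restyle(input_path: str, prompt: str, strength: float = 0.5) -> bytes:
    """Generate a variation of an existing image via the SD3 endpoint."""
    with open(input_path, "rb") as image_file:
        response = requests.post(
            "https://api.stability.ai/v2beta/stable-image/generate/sd3",
            headers={
                "authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
                "accept": "image/*",
            },
            files={"image": image_file},  # the real file handles multipart encoding
            data={
                "mode": "image-to-image",
                "prompt": prompt,
                "strength": strength,  # 0.0 = keep input, 1.0 = ignore it
            },
        )
    response.raise_for_status()
    return response.content
```

For a style transfer you might call `restyle("photo.png", "watercolor painting", strength=0.4)` and write the returned bytes to disk.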
A few things to note about image-to-image:
- The input image must be at least 64x64 pixels. JPEG, PNG, and WebP are all accepted.
- You pass the file via the `files` parameter in the `requests` call, which means you don't need the `files={"none": ""}` trick anymore – the real file handles the multipart encoding.
- Lower `strength` values preserve more of the original composition. Higher values give the model more creative freedom.
Image Upscaling
Stability AI offers a conservative upscale endpoint that brings images up to roughly 4K resolution, increasing pixel count by about 20-40x while preserving detail. The endpoint lives at /v2beta/stable-image/upscale/conservative.
This is genuinely useful for taking AI-generated images (which are often 1024x1024) and scaling them up for print or high-resolution displays without the usual blurriness.
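A sketch of a conservative upscale call (the function name and `creativity` default are my choices):

```python
import os
import requests

def upscale(input_path: str, prompt: str, creativity: float = 0.3) -> bytes:
    """Conservatively upscale an image; input must be 64x64 to 1 megapixel."""
    with open(input_path, "rb") as image_file:
        response = requests.post(
            "https://api.stability.ai/v2beta/stable-image/upscale/conservative",
            headers={
                "authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
                "accept": "image/*",
            },
            files={"image": image_file},
            data={
                "prompt": prompt,  # short description guides the added detail
                "creativity": creativity,
            },
        )
    response.raise_for_status()
    return response.content
```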
Upscale Parameters
- image (required): The input image file, between 64x64 and 1 megapixel.
- prompt (required): A short description of what’s in the image. This helps the model add appropriate detail during upscaling.
- creativity (optional): A float that controls how much new detail the model adds. Lower values (around 0.2-0.35) keep the output faithful to the original. Higher values introduce more generated detail.
- negative_prompt (optional): Elements to avoid during upscaling.
- seed (optional): For reproducible results.
Handling Response Formats
You have two ways to receive images from the API, controlled by the accept header.
Raw Binary (Recommended)
Set accept: image/* and write response.content directly to a file. This is the simplest approach:
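Writing the bytes and checking the finish-reason header can be wrapped in a small helper (a sketch; the helper name is mine):

```python
def save_image(response, path: str) -> bool:
    """Write raw image bytes to disk; return False if the safety filter fired."""
    if response.headers.get("finish-reason") == "CONTENT_FILTERED":
        return False
    with open(path, "wb") as f:
        f.write(response.content)
    return True
```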
The response headers include useful metadata. Check response.headers.get("finish-reason") to confirm the generation completed successfully (value will be SUCCESS). A value of CONTENT_FILTERED means the safety filter caught something.
Base64 JSON
Set accept: application/json to get a JSON response with the image encoded as a base64 string:
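Decoding that response might look like the following (I'm assuming the field names `image`, `seed`, and `finish_reason`; verify them against the current docs):

```python
import base64

def decode_json_image(body: dict, path: str) -> dict:
    """Decode a base64-encoded image from a JSON response body and save it."""
    with open(path, "wb") as f:
        f.write(base64.b64decode(body["image"]))
    # Return the metadata that came alongside the image.
    return {"seed": body.get("seed"), "finish_reason": body.get("finish_reason")}
```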
The JSON format is handy when you need the metadata (seed, finish reason) alongside the image data in a single structured response. It’s also easier to pass around in web applications where you might embed the base64 string directly in an <img> tag.
Common Errors and Fixes
401 Unauthorized
Your API key is wrong or missing. Double-check that the environment variable is set and that the key starts with sk-. The authorization header must use the Bearer prefix: "authorization": "Bearer sk-...".
400 Bad Request – Invalid aspect ratio
You sent an aspect ratio the endpoint doesn’t support. Stick to: 1:1, 16:9, 9:16, 2:3, 3:2, 4:5, 5:4, 21:9, 9:21.
402 Payment Required
Your account ran out of credits. Buy more at platform.stability.ai. Check your balance before running batch jobs.
413 Payload Too Large
Your input image is too large. For image-to-image, stay under 10MB per file. Resize before sending.
Content Filtered (finish_reason: CONTENT_FILTERED)

The safety filter flagged your prompt or the generated output. Rephrase your prompt to avoid policy-violating content. This isn't an HTTP error – you still get a 200 response, but the image may be blank or missing.
422 Validation Error – Missing required fields
Make sure you’re sending the request as multipart/form-data. If you forget the files parameter in the requests call, the library sends it as application/x-www-form-urlencoded and the API rejects it. Use files={"none": ""} when you don’t have an actual file to upload.
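For batch jobs, the fixes above can be collapsed into a small status-code lookup (the hint text is my own wording, not API output):

```python
ERROR_HINTS = {
    400: "Invalid parameter - check aspect_ratio against the supported list.",
    401: "Check STABILITY_API_KEY and the 'Bearer ' prefix on the header.",
    402: "Out of credits - top up at platform.stability.ai.",
    413: "Input image too large - keep files under 10MB.",
    422: "Probably not multipart/form-data - pass a files= argument.",
}

def explain(status_code: int) -> str:
    """Map an HTTP status to a short troubleshooting hint."""
    return ERROR_HINTS.get(status_code, f"Unexpected status {status_code}")
```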
Related Guides
- How to Run Open-Source Models with the Replicate API
- How to Use the Cerebras API for Fast LLM Inference
- How to Use the Anthropic Prompt Caching API with Context Blocks
- How to Use the Anthropic Tool Use API for Agentic Workflows
- How to Use the Cohere Rerank API for Search Quality
- How to Use the AWS Bedrock Converse API for Multi-Model Chat
- How to Use the OpenAI Realtime API for Voice Applications
- How to Use the Weights and Biases Prompts API for LLM Tracing
- How to Run Fast LLM Inference with the Groq API
- How to Use the Together AI API for Open-Source LLMs