Stability AI gives you hosted access to Stable Diffusion models through a straightforward REST API. You send a prompt, get an image back. No GPU provisioning, no model weight downloads, no CUDA driver headaches. The v2beta API supports text-to-image, image-to-image, and upscaling – all through multipart/form-data POST requests.
Here’s the fastest path to generating your first image:
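A minimal quickstart, assuming `requests` is installed (the output filename is my choice):

```python
import os
import requests

API_URL = "https://api.stability.ai/v2beta/stable-image/generate/core"

def generate(prompt: str, out_path: str = "output.png") -> str:
    """Send a text-to-image request and write the raw image bytes to disk."""
    response = requests.post(
        API_URL,
        headers={
            "authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
            "accept": "image/*",  # return raw bytes instead of base64 JSON
        },
        files={"none": ""},  # forces multipart/form-data even with no file upload
        data={"prompt": prompt},
    )
    response.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(response.content)
    return out_path

if __name__ == "__main__":
    generate("a red sports car on a mountain road at sunset")
```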
Set your API key as an environment variable (export STABILITY_API_KEY=sk-your-key-here) and run that script. You’ll have a generated image on disk in a few seconds.
Authentication and Setup
Sign up at platform.stability.ai and grab your API key from the dashboard. New accounts get 25 free credits. Every request uses credits based on the endpoint – Stable Image Core costs roughly $0.03 per image, Stable Image Ultra costs about $0.08.
Every request to the API needs two headers:
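In Python, that header pair looks like this (falling back to a placeholder key if the environment variable is unset):

```python
import os

api_key = os.environ.get("STABILITY_API_KEY", "sk-placeholder")

headers = {
    "authorization": f"Bearer {api_key}",  # the Bearer prefix is required
    "accept": "image/*",  # or "application/json" for a base64 response
}
```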
The accept header controls the response format. Set it to image/* and you get raw image bytes back. Set it to application/json and you get a JSON response with a base64-encoded image string. I recommend image/* for simplicity – you write the response content directly to a file and you’re done.
The API uses multipart/form-data for all requests, even text-to-image where you’re not uploading a file. That’s why you’ll see files={"none": ""} in the requests – it forces the requests library to encode the payload as multipart form-data.
Text-to-Image Generation
The v2beta API offers three text-to-image endpoints, each backed by a different model tier:
| Endpoint | Model | Best For | Cost |
|---|---|---|---|
| `/v2beta/stable-image/generate/core` | Stable Image Core | Fast drafts, prototyping | ~$0.03 |
| `/v2beta/stable-image/generate/sd3` | SD3.5 | High quality, prompt adherence | ~$0.035 |
| `/v2beta/stable-image/generate/ultra` | Stable Image Ultra | Maximum quality | ~$0.08 |
My recommendation: start with core for iteration, switch to sd3 or ultra for final outputs.
Here’s a more complete text-to-image example with all the useful parameters:
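A sketch against the Core endpoint with the full parameter set (the prompt text and seed are arbitrary choices of mine):

```python
import os
import requests

# Every field except `prompt` is optional.
payload = {
    "prompt": "a red sports car on a mountain road at sunset, golden hour",
    "negative_prompt": "blurry, low quality, watermark, text",
    "aspect_ratio": "16:9",
    "seed": 42,                      # fixed seed -> reproducible output
    "output_format": "png",
    "style_preset": "photographic",  # Core/Ultra only
}

def main() -> None:
    response = requests.post(
        "https://api.stability.ai/v2beta/stable-image/generate/core",
        headers={
            "authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
            "accept": "image/*",
        },
        files={"none": ""},  # force multipart/form-data
        data=payload,
    )
    response.raise_for_status()
    with open("sports_car.png", "wb") as f:
        f.write(response.content)

if __name__ == "__main__":
    main()
```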
Key Parameters
- prompt (required): Your text description. Be specific – “a red sports car on a mountain road at sunset” works better than “car.”
- negative_prompt (optional): Things you want excluded. Helps avoid common artifacts.
- aspect_ratio: Options include `1:1`, `16:9`, `9:16`, `2:3`, `3:2`, `4:5`, `5:4`, `21:9`, `9:21`. No arbitrary pixel dimensions – you pick a ratio.
- output_format: `png`, `jpeg`, or `webp`.
- seed: Fix this for reproducible results. Same seed + same prompt = same image.
- style_preset (Core/Ultra only): Values like `photographic`, `anime`, `cinematic`, `digital-art`, `comic-book`, and more. Useful when you want a consistent style without crafting a complex prompt.
Image-to-Image Generation
Image-to-image lets you send an existing image along with a prompt to generate variations or restyle it. The SD3 endpoint supports this by setting the mode parameter to image-to-image and providing an image file and a strength value.
The strength parameter controls how much the output diverges from the input. At 0.0, the result is nearly identical to your input. At 1.0, the model ignores the input almost entirely. Values between 0.3 and 0.6 tend to produce the best results for style transfer.
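A sketch of an image-to-image call against the SD3 endpoint (the function name and default strength are my choices):

```python
import os
import requests

def restyle(input_path: str, prompt: str, strength: float = 0.5) -> bytes:
    """Generate a variation of an existing image via the SD3 endpoint."""
    with open(input_path, "rb") as image_file:
        response = requests.post(
            "https://api.stability.ai/v2beta/stable-image/generate/sd3",
            headers={
                "authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
                "accept": "image/*",
            },
            files={"image": image_file},  # the real file handles multipart encoding
            data={
                "mode": "image-to-image",
                "prompt": prompt,
                "strength": strength,  # 0.0 = keep input, 1.0 = ignore it
            },
        )
    response.raise_for_status()
    return response.content
```

For a style transfer you might call `restyle("photo.png", "watercolor painting", strength=0.4)` and write the returned bytes to disk.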
A few things to note about image-to-image:
- The input image must be at least 64x64 pixels. JPEG, PNG, and WebP are all accepted.
- You pass the file via the `files` parameter in the `requests` call, which means you don't need the `files={"none": ""}` trick anymore – the real file handles the multipart encoding.
- Lower `strength` values preserve more of the original composition. Higher values give the model more creative freedom.
Image Upscaling
Stability AI offers a conservative upscale endpoint that brings images up to roughly 4K resolution, increasing pixel count by about 20-40x while preserving detail. The endpoint lives at /v2beta/stable-image/upscale/conservative.
This is genuinely useful for taking AI-generated images (which are often 1024x1024) and scaling them up for print or high-resolution displays without the usual blurriness.
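A sketch of a conservative upscale call (the function name and `creativity` default are my choices):

```python
import os
import requests

def upscale(input_path: str, prompt: str, creativity: float = 0.3) -> bytes:
    """Conservatively upscale an image; input must be 64x64 to 1 megapixel."""
    with open(input_path, "rb") as image_file:
        response = requests.post(
            "https://api.stability.ai/v2beta/stable-image/upscale/conservative",
            headers={
                "authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
                "accept": "image/*",
            },
            files={"image": image_file},
            data={
                "prompt": prompt,  # short description guides the added detail
                "creativity": creativity,
            },
        )
    response.raise_for_status()
    return response.content
```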
Upscale Parameters
- image (required): The input image file, between 64x64 and 1 megapixel.
- prompt (required): A short description of what’s in the image. This helps the model add appropriate detail during upscaling.
- creativity (optional): A float that controls how much new detail the model adds. Lower values (around 0.2-0.35) keep the output faithful to the original. Higher values introduce more generated detail.
- negative_prompt (optional): Elements to avoid during upscaling.
- seed (optional): For reproducible results.
Handling Response Formats
You have two ways to receive images from the API, controlled by the accept header.
Raw Binary (Recommended)
Set accept: image/* and write response.content directly to a file. This is the simplest approach:
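Writing the bytes and checking the finish-reason header can be wrapped in a small helper (a sketch; the helper name is mine):

```python
def save_image(response, path: str) -> bool:
    """Write raw image bytes to disk; return False if the safety filter fired."""
    if response.headers.get("finish-reason") == "CONTENT_FILTERED":
        return False
    with open(path, "wb") as f:
        f.write(response.content)
    return True
```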
The response headers include useful metadata. Check response.headers.get("finish-reason") to confirm the generation completed successfully (value will be SUCCESS). A value of CONTENT_FILTERED means the safety filter caught something.
Base64 JSON
Set accept: application/json to get a JSON response with the image encoded as a base64 string:
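Decoding that response might look like the following (I'm assuming the field names `image`, `seed`, and `finish_reason`; verify them against the current docs):

```python
import base64

def decode_json_image(body: dict, path: str) -> dict:
    """Decode a base64-encoded image from a JSON response body and save it."""
    with open(path, "wb") as f:
        f.write(base64.b64decode(body["image"]))
    # Return the metadata that came alongside the image.
    return {"seed": body.get("seed"), "finish_reason": body.get("finish_reason")}
```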
The JSON format is handy when you need the metadata (seed, finish reason) alongside the image data in a single structured response. It’s also easier to pass around in web applications where you might embed the base64 string directly in an <img> tag.
Common Errors and Fixes
401 Unauthorized
Your API key is wrong or missing. Double-check that the environment variable is set and that the key starts with sk-. The authorization header must use the Bearer prefix: "authorization": "Bearer sk-...".
400 Bad Request – Invalid aspect ratio
You sent an aspect ratio the endpoint doesn’t support. Stick to: 1:1, 16:9, 9:16, 2:3, 3:2, 4:5, 5:4, 21:9, 9:21.
402 Payment Required
Your account ran out of credits. Buy more at platform.stability.ai. Check your balance before running batch jobs.
413 Payload Too Large
Your input image is too large. For image-to-image, stay under 10MB per file. Resize before sending.
Content Filtered (finish_reason: CONTENT_FILTERED)

The safety filter flagged your prompt or the generated output. Rephrase your prompt to avoid policy-violating content. This isn't an HTTP error – you still get a 200 response, but the image may be blank or missing.
422 Validation Error – Missing required fields
Make sure you’re sending the request as multipart/form-data. If you forget the files parameter in the requests call, the library sends it as application/x-www-form-urlencoded and the API rejects it. Use files={"none": ""} when you don’t have an actual file to upload.
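For batch jobs, the fixes above can be collapsed into a small status-code lookup (the hint text is my own wording, not API output):

```python
ERROR_HINTS = {
    400: "Invalid parameter - check aspect_ratio against the supported list.",
    401: "Check STABILITY_API_KEY and the 'Bearer ' prefix on the header.",
    402: "Out of credits - top up at platform.stability.ai.",
    413: "Input image too large - keep files under 10MB.",
    422: "Probably not multipart/form-data - pass a files= argument.",
}

def explain(status_code: int) -> str:
    """Map an HTTP status to a short troubleshooting hint."""
    return ERROR_HINTS.get(status_code, f"Unexpected status {status_code}")
```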
Related Guides
- How to Run Open-Source Models with the Replicate API
- How to Use the Cerebras API for Fast LLM Inference
- How to Use the Anthropic Prompt Caching API with Context Blocks
- How to Use the Anthropic Tool Use API for Agentic Workflows
- How to Use the Cohere Rerank API for Search Quality
- How to Use the AWS Bedrock Converse API for Multi-Model Chat
- How to Use the OpenAI Realtime API for Voice Applications
- How to Use the Weights and Biases Prompts API for LLM Tracing
- How to Run Fast LLM Inference with the Groq API
- How to Use the Together AI API for Open-Source LLMs