Architectural floor plan generation used to require expensive CAD software and hours of drafting. Now you can go from a rough sketch or a text description to a clean floor plan in seconds using Stable Diffusion XL and ControlNet. The results aren’t production-ready blueprints, but they’re excellent for rapid prototyping, client presentations, and exploring layout ideas before committing to detailed design work.
Text-to-Floor-Plan with Stable Diffusion XL
The fastest way to generate a floor plan is a well-crafted prompt fed into SDXL. Architecture-specific language matters here – generic prompts produce blurry results. You want to describe the plan as a technical drawing, not a photo.
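A minimal sketch of the text-to-plan step with diffusers is below. The model ID, step count, and exact prompt wording are illustrative starting points rather than fixed requirements; the generation itself is guarded behind a GPU check because SDXL downloads roughly 7 GB of weights and is impractical on CPU.

```python
import torch

# Describe the plan as a technical drawing, not a photo.
prompt = (
    "architectural floor plan, top-down view, technical drawing, blueprint, "
    "2-bedroom apartment with open kitchen, black lines on white background, "
    "labeled rooms, clean line art"
)
# Without this, SDXL drifts toward 3D renders and perspective views.
negative_prompt = (
    "3D render, perspective view, photo, realistic, color, shading, "
    "blurry, watermark"
)

# Heavy step: only run when a GPU is available.
if torch.cuda.is_available():
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")
    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        guidance_scale=7.5,        # close to the prompt without over-saturating
        num_inference_steps=30,
        height=1024,
        width=1024,
    ).images[0]
    image.save("floor_plan.png")
```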
The key to good results is the negative prompt. Without it, SDXL tends to generate 3D renders or perspective views instead of clean top-down plans. The guidance_scale of 7.5 keeps the model close to your prompt without over-saturating the output.
Conditioning on Sketches with ControlNet
Text prompts alone give you limited control over room placement. If you have a rough sketch – even a hand-drawn one on paper – ControlNet with Canny edge detection turns it into a structured floor plan while preserving your layout.
The controlnet_conditioning_scale parameter controls how strictly the model follows your sketch. At 0.8, the model respects the overall layout but smooths out rough edges. Drop it to 0.5 if you want more creative freedom, or push it to 1.0 for near-exact reproduction of your lines.
Post-Processing with OpenCV
Raw generated floor plans usually need cleanup. Wall lines might be uneven, rooms might lack labels, and there’s often noise in the background. OpenCV handles all of this.
The contour area filter (5000 < area < 200000) is critical. Without it, you’ll label every tiny artifact as a room. Adjust these thresholds based on your image resolution – larger images need larger minimum areas.
Generating Variations and Batch Configurations
Once you have a base prompt that works, you’ll want to explore variations: different room counts, apartment sizes, and architectural styles. Batch generation with a seed strategy gives you reproducible results you can compare side by side.
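A batch sweep might look like the following. The specific configurations, style strings, and filenames are illustrative assumptions; the structure -- one fixed seed reused per configuration, three images per prompt -- is the part that matters.

```python
import torch

# Configurations to sweep -- names and counts here are example choices.
configs = [
    {"bedrooms": 1, "style": "compact studio apartment"},
    {"bedrooms": 2, "style": "modern apartment"},
    {"bedrooms": 3, "style": "suburban family house"},
]

prompts = [
    f"architectural floor plan, top-down view, {c['bedrooms']}-bedroom {c['style']}, "
    "technical drawing, black lines on white background"
    for c in configs
]

# Generation requires a GPU; the prompt list above is cheap to build anywhere.
if torch.cuda.is_available():
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")
    for cfg, prompt in zip(configs, prompts):
        # Same seed per run keeps overall structure comparable across configs.
        generator = torch.Generator("cuda").manual_seed(42)
        images = pipe(
            prompt=prompt,
            negative_prompt="3D render, perspective view, photo, color",
            guidance_scale=7.5,
            generator=generator,
            num_images_per_prompt=3,  # three variations per configuration
        ).images
        for i, img in enumerate(images):
            img.save(f"plan_{cfg['bedrooms']}br_v{i}.png")
```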
Using the same seed (42) across configurations keeps the overall structure consistent while the prompt drives the layout differences. If you want completely different layouts for the same room count, change the seed between runs. Setting num_images_per_prompt=3 gives you three variations per configuration to pick from.
For memory-constrained setups, add pipe.enable_model_cpu_offload() right after loading the pipeline. It moves model components between GPU and CPU as needed, cutting VRAM usage roughly in half at the cost of slower inference.
Common Errors and Fixes
RuntimeError: Expected all tensors to be on the same device – This happens when the control image and model are on different devices. Make sure your control image is a PIL Image (not a tensor) and let the pipeline handle the conversion. Don’t manually move it to CUDA.
OutOfMemoryError on 8GB GPUs – SDXL is hungry. Add pipe.enable_model_cpu_offload() after loading the pipeline. If that’s still not enough, use pipe.enable_sequential_cpu_offload() which is slower but uses minimal VRAM. You can also drop resolution to 768x768, though floor plan quality suffers.
Blurry or incoherent floor plans – Your prompt likely isn’t specific enough. Always include “top-down view”, “black lines on white background”, and “architectural floor plan” together. Adding “technical drawing” and “blueprint” pushes the model toward clean line art instead of artistic interpretations.
ControlNet ignoring your sketch – If the generated plan doesn’t match your input sketch at all, check two things. First, make sure your Canny edge thresholds actually produce visible edges – try cv2.imwrite("edges_debug.png", edges) to inspect. Second, increase controlnet_conditioning_scale toward 1.0. Below 0.5, the model mostly ignores the conditioning.
Rooms not detected in post-processing – The contour detection depends on closed boundaries. If AI-generated walls have gaps, the contour finder won’t see separate rooms. Fix this by increasing the MORPH_CLOSE iterations to 3 or 4, which closes small gaps in wall lines before contour detection runs.
ValueError: Input image must be 1024x1024 – SDXL models expect specific resolutions. Resize your control image to match the pipeline’s output size. Use control_image = control_image.resize((1024, 1024)) before passing it to the pipeline.
Related Guides
- How to Generate Images with Stable Diffusion in Python
- How to Build AI Clothing Try-On with Virtual Diffusion Models
- How to Control Image Generation with ControlNet and IP-Adapter
- How to Build AI Sticker and Emoji Generation with Stable Diffusion
- How to Build AI Scene Generation with Layered Diffusion
- How to Build AI Wallpaper Generation with Stable Diffusion and Tiling
- How to Build AI Architectural Rendering with ControlNet and Stable Diffusion
- How to Build AI Sketch-to-Image Generation with ControlNet Scribble
- How to Generate Textures and Materials with AI for 3D Assets
- How to Generate Images in Real Time with Latent Consistency Models