How to Fine-Tune LLMs with DPO and RLHF
Align your LLM with preference data using DPOTrainer – simpler and more stable than PPO
Train your own LLM adapter on a single GPU with Unsloth, LoRA, and a custom dataset
Train custom subjects and styles into Stable Diffusion on a single GPU with LoRA adapters
Create 3D meshes in OBJ and GLB formats from text or a single photo using open-source and API-based tools
Create music, sound effects, and edit audio clips using diffusion models on your own GPU
Run Black Forest Labs’ FLUX.2 models locally to create images from text prompts on your own hardware
Run Stable Diffusion on your own GPU to create images from text prompts with full control
Use MusicGen to create music from text prompts and melodies on your own GPU with full parameter control
Turn any image into a short video clip with SVD on your own GPU using complete Python code
Train better models with fewer labels using uncertainty sampling, query strategies, and pool-based loops
Add safety guardrails to your AI application with input validation and output filtering
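A minimal sketch of the input-validation / output-filtering pattern the entry above describes, using only the standard library. The denylist patterns and function names here are illustrative assumptions, not a production ruleset:

```python
import re

# Hypothetical denylist patterns -- tune these for your own application.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"reveal (the )?system prompt", re.IGNORECASE),
]
PII_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # US-SSN-shaped numbers

def validate_input(user_text: str) -> str:
    """Reject prompts that match known injection phrasings before they reach the model."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(user_text):
            raise ValueError("input rejected by guardrail")
    return user_text

def filter_output(model_text: str) -> str:
    """Redact PII-looking spans from model output before returning it to the user."""
    return PII_PATTERN.sub("[REDACTED]", model_text)
```

Real deployments layer this with model-based classifiers, but a regex pass is a cheap first line of defense on both sides of the LLM call.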
Get faster time-to-first-token by streaming from OpenAI, Anthropic, and your own FastAPI proxy with working code
Build production-ready topic models that actually surface meaningful themes from your text data using BERTopic
See exactly which tokens your model focuses on and build explainability reports for stakeholders
Build an LLM-powered annotation pipeline that cuts labeling time and cost dramatically
Find your LLM API’s breaking point before your users do by running realistic load tests with Locust
Track every LLM call, measure quality with evaluations, and catch regressions before users notice them
Convert your models to ONNX format and run them faster on CPU or GPU with fewer dependencies
Find and fix GPU memory bottlenecks so you can train larger models on the hardware you already have
Run 70B models on a single GPU by quantizing with AutoGPTQ and AutoAWQ in Python
Find prompt injection, jailbreak, and data extraction vulnerabilities in your AI app with red-teaming tools
Set up local LLM inference in minutes with Ollama or build llama.cpp from source for full control over quantization and GPU layers
Run open-source models through Hugging Face Inference Providers: chat completion, image generation, streaming, and error handling
Go from single-node training to distributed multi-GPU pipelines with Ray’s unified ML framework
Use SAM 2 to cut out objects from images with clicks or bounding boxes in a few lines of Python
Get an SGLang server running, send requests via the OpenAI SDK, and fix the errors you’ll actually hit
Set up vLLM to serve open-source LLMs with an OpenAI-compatible API endpoint
Automate model training, testing, and deployment using GitHub Actions workflows with CML and DVC
Train 7B+ parameter models across multiple GPUs using DeepSpeed ZeRO stages, mixed precision, and CPU offloading
Go from single-GPU to multi-GPU training with PyTorch DDP in under 50 lines of code
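To make the DDP mechanics concrete, here is a minimal sketch, assuming PyTorch is installed. It runs a single-process "world" on CPU with the gloo backend so it executes anywhere; a real multi-GPU run launches the same training step via torchrun with the nccl backend, and the toy linear model stands in for yours:

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def train_one_step() -> float:
    # Single-process world for illustration; torchrun sets RANK/WORLD_SIZE for real jobs.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29501")
    dist.init_process_group("gloo", rank=0, world_size=1)

    model = torch.nn.Linear(8, 1)              # toy model; replace with your own
    ddp_model = DDP(model)                     # wraps the model for gradient syncing
    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.1)

    x, y = torch.randn(16, 8), torch.randn(16, 1)
    loss = torch.nn.functional.mse_loss(ddp_model(x), y)
    opt.zero_grad()
    loss.backward()                            # gradients are all-reduced across ranks here
    opt.step()

    dist.destroy_process_group()
    return loss.item()
```

The key point DDP makes easy: the only distributed-specific lines are the process-group setup and the `DDP(...)` wrapper; the forward/backward/step loop is unchanged from single-GPU code.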
Cut your LLM latency in half by letting a small draft model propose tokens that the big model verifies in parallel
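A toy, model-free sketch of the speculative-decoding loop from the entry above. `target_next` and `draft_next` are stand-ins for real model calls, decoding is greedy, and verification is done sequentially here for clarity, whereas a real implementation scores all draft positions in one batched forward pass of the big model:

```python
from typing import Callable, List

def speculative_decode(
    target_next: Callable[[List[int]], int],   # expensive model: next token given context
    draft_next: Callable[[List[int]], int],    # cheap draft model: same interface
    prompt: List[int],
    k: int,                                    # draft tokens proposed per round
    n_tokens: int,                             # tokens to generate
) -> List[int]:
    """Greedy speculative decoding: keep the longest draft prefix the target
    agrees with, plus the target's own correction where they diverge."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # 1. Draft proposes k tokens autoregressively.
        proposal, ctx = [], list(out)
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2. Target verifies position by position (batched in practice).
        accepted: List[int] = []
        for t in proposal:
            expected = target_next(out + accepted)
            if expected == t:
                accepted.append(t)
            else:
                accepted.append(expected)      # target's correction ends the run
                break
        out.extend(accepted)
    return out[len(prompt):len(prompt) + n_tokens]
```

The property worth noticing: with greedy decoding the output is identical to running the target model alone, no matter how bad the draft is; the draft only changes how many target calls you need.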
Break long documents into chunks, summarize each one in parallel, and combine the results into a final summary
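The chunk / map / reduce pattern in the entry above can be sketched with the standard library alone. `summarize` is any callable wrapping your LLM; here the chunker packs paragraphs up to a character budget, a stand-in for a token budget:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

def chunk_text(text: str, max_chars: int = 2000) -> List[str]:
    """Split on paragraph boundaries, packing paragraphs up to max_chars per chunk."""
    chunks: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

def map_reduce_summarize(text: str, summarize: Callable[[str], str],
                         max_chars: int = 2000) -> str:
    """Map: summarize each chunk concurrently. Reduce: summarize the combined summaries."""
    chunks = chunk_text(text, max_chars)
    with ThreadPoolExecutor(max_workers=8) as pool:   # threads suit I/O-bound API calls
        partials = list(pool.map(summarize, chunks))
    return summarize("\n\n".join(partials))
```

For very long inputs the reduce step may itself exceed the context window, in which case you apply the same map step recursively until the combined summaries fit.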
Find and fix agent loops, broken tool calls, and prompt regressions using LangSmith’s trace UI and evaluation SDK
Set up wandb experiment tracking with real code for logging, hyperparameter sweeps, artifacts, and framework integrations
Track and re-identify objects across video frames with ByteTrack, YOLO detection, and the supervision library
Step-by-step guide to training your own YOLO object detection model on custom data using the Ultralytics Python API and CLI
Add formal privacy guarantees to your deep learning pipeline with three lines of Opacus code and smart epsilon budgeting
Step-by-step guide to building your first MCP server in Python and wiring it into Claude Desktop
Wire up LLM-powered tool use in Python across both OpenAI and Claude, with real code for parallel and forced calls
Stop wrestling with malformed JSON: use GPT-5.2’s structured outputs to enforce schemas at the token level
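The entry above comes down to one request field. Below is the `response_format` shape OpenAI documents for structured outputs, built as a plain dict so it runs without an API key; the order-extraction schema and the prompt are illustrative assumptions:

```python
import json

# Hypothetical schema for an order-extraction task.
order_schema = {
    "type": "object",
    "properties": {
        "item": {"type": "string"},
        "quantity": {"type": "integer"},
    },
    "required": ["item", "quantity"],
    "additionalProperties": False,
}

# Chat-completions request body: response_format with strict json_schema
# constrains decoding so replies always parse against the schema.
body = {
    "model": "gpt-5.2",   # model name taken from the entry above
    "messages": [{"role": "user", "content": "I want three lattes"}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "order", "strict": True, "schema": order_schema},
    },
}
payload = json.dumps(body)
```

Note the `strict: True` and `additionalProperties: False` pairing: strict mode requires a closed schema, and in exchange the API guarantees the reply conforms to it.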
Route requests across OpenAI, Anthropic, Azure, and dozens more providers using a single Python SDK and proxy server
Set up prompt caching for Claude and GPT APIs to slash input token costs and speed up response times
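On the Claude side, prompt caching is opt-in per content block via `cache_control`. The sketch below builds the request body as a plain dict so it runs offline; the model id and the reference document are illustrative placeholders:

```python
# Anthropic prompt caching: mark a large, stable system block as cacheable so
# repeat requests reuse it instead of re-billing it at the full input-token rate.
big_context = "REFERENCE DOCUMENT. " * 200   # placeholder for a long, reused document

request = {
    "model": "claude-sonnet-4-5",   # illustrative model id; substitute a current one
    "max_tokens": 512,
    "system": [
        {
            "type": "text",
            "text": big_context,
            "cache_control": {"type": "ephemeral"},  # cache breakpoint ends here
        }
    ],
    "messages": [{"role": "user", "content": "Summarize the document."}],
}
```

The design rule that follows from this shape: put stable content (instructions, reference docs, tool definitions) before the breakpoint and per-request content after it, since the cache is keyed on the exact prefix.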
Replace hand-rolled attention kernels with FlexAttention and get up to 2x faster LLM decoding on long contexts
Learn the Anthropic Python SDK basics: messages, streaming, system prompts, and error handling
Set up the Mistral Python SDK for chat completions, Codestral, function calling, and streaming
Create AI agents that hand off tasks, call tools, and run with built-in tracing using OpenAI’s production agent framework
Set up automated data quality checks for ML datasets with GX expectation suites and checkpoints
Go from trained model to versioned, aliased, and served artifact using MLflow’s Python SDK
Set up DVC to version datasets, switch between data snapshots, and build reproducible ML pipelines
Write better system prompts that get consistent, high-quality results from any large language model