How to Set Up a Local MCP Server to Give Your LLM Access to Custom Tools
Stand up a working MCP server from scratch with Python’s FastMCP SDK, define custom tools with type hints, and connect it to Claude Desktop in under 30 minutes
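The Claude Desktop wiring step mentioned above comes down to one entry in Claude Desktop's `claude_desktop_config.json`. A minimal sketch, assuming a server script at a placeholder path and a server name of your choosing:

```json
{
  "mcpServers": {
    "my-tools": {
      "command": "python",
      "args": ["/path/to/server.py"]
    }
  }
}
```

After editing the config, restart Claude Desktop so it launches the server and picks up the registered tools.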
Create production-ready AI chat apps with streaming responses, tool calling, and provider switching using the Vercel AI SDK
Step-by-step guide to building production-ready AI search apps with Haystack’s component-based pipeline architecture
Wrap any Python ML model in a web UI with Gradio and deploy it to Hugging Face Spaces in minutes
Skip the DevOps headache. Deploy production-ready AI models with automatic GPU scaling, no Kubernetes required.
Get sub-second LLM responses from Groq’s API with its OpenAI-compatible interface and custom AI accelerator chips
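Several of the providers above (Groq, Cerebras, DeepSeek, Fireworks, xAI) expose OpenAI-compatible endpoints, so only the base URL and model name change between them. A stdlib-only sketch of what such a request looks like on the wire; `build_chat_request` is a hypothetical helper, and the model id is an example that should be checked against the provider's current catalog:

```python
import json


def build_chat_request(base_url: str, model: str, prompt: str) -> tuple[str, bytes]:
    """Return the endpoint URL and JSON body for an OpenAI-style chat completion."""
    # All OpenAI-compatible providers serve chat at <base_url>/chat/completions.
    url = base_url.rstrip("/") + "/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, json.dumps(body).encode("utf-8")


url, body = build_chat_request(
    "https://api.groq.com/openai/v1",   # Groq's OpenAI-compatible API root
    "llama-3.1-8b-instant",             # example model id; verify before use
    "Why does inference latency matter?",
)
```

The same helper targets any compatible provider by swapping the base URL; in practice you would POST `body` with an `Authorization: Bearer <key>` header, or simply point the official OpenAI SDK at the custom `base_url`.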
Use the Replicate API to run open-source AI models in the cloud with simple Python calls and pay per second
Call foundation models on AWS Bedrock using Python, with examples for inference, streaming, and RAG
Cut your Claude API costs in half by batching requests with the Anthropic Message Batches API
Get Claude to cite its sources with exact quotes using the Anthropic citations API in Python
Stop re-uploading documents with every request: use the Files API to upload once and reference by file ID
Send images to Claude for analysis, OCR, and visual Q&A using the Anthropic Python SDK with base64 and URL inputs
Get Claude to show its work on hard problems using the extended thinking API with budget tokens
Run high-volume Claude workloads at half price with the Message Batches API: batch creation, polling, and result parsing
Wire up function calling across multiple turns with Claude, handling tool_use and tool_result messages
Extract data, summarize, and answer questions about PDFs using Claude’s built-in document processing
Cache system prompts, documents, and tool definitions with Anthropic’s prompt caching to slash latency and API costs
Pre-calculate token counts and costs for Claude API calls to manage your budget effectively
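Pre-calculating a budget is simple arithmetic once you have a token count. A back-of-envelope sketch: exact counts should come from the provider's token-counting endpoint, and both the ~4 characters-per-token heuristic and the per-million-token rates below are illustrative placeholders, not current prices:

```python
def estimate_cost(prompt: str, expected_output_tokens: int,
                  usd_per_mtok_in: float, usd_per_mtok_out: float) -> float:
    """Rough USD cost estimate for one API call."""
    # Crude heuristic: ~4 characters per token for English text.
    input_tokens = max(1, len(prompt) // 4)
    return (input_tokens * usd_per_mtok_in
            + expected_output_tokens * usd_per_mtok_out) / 1_000_000


# A 400-character prompt (~100 tokens) expecting ~1,000 output tokens,
# at example rates of $3/M input and $15/M output:
cost = estimate_cost("x" * 400, 1_000, 3.0, 15.0)  # ≈ $0.0153
```

Running this estimate before a batch job makes it easy to compare the standard rate against the 50% batch discount.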
Reduce Claude API costs for tool-heavy agents using token-efficient tool mode and schema caching
Stream Claude responses token-by-token into your UI with the Anthropic Python SDK streaming methods
Wire up Claude’s tool use to build agents that call functions, chain tools together, and handle multi-step tasks
Build multi-model chat apps with a single API using AWS Bedrock Converse for Claude, Llama, and Mistral
Get blazing-fast LLM inference from Cerebras hardware using their OpenAI-compatible Python API
Boost your search pipeline accuracy by reranking results with Cohere’s cross-encoder rerank models
Call DeepSeek’s R1 and V3 models for code and reasoning through their OpenAI-compatible API in Python
Use the Fireworks AI API to get fast inference from Llama 3.1 70B, Mixtral, and other open models via the OpenAI Python SDK
Send text, images, videos, and PDFs to Gemini via the Vertex AI SDK with structured JSON output and GCP authentication
Connect to OpenAI’s Realtime API over WebSockets for real-time voice input, text output, and function calling
Call 400+ LLMs from every major provider through a single API key and endpoint with OpenRouter
Build AI-powered search into your apps using Perplexity’s Sonar models and the OpenAI SDK
Call Stability AI’s hosted API to generate, edit, and upscale images without managing GPU infrastructure
Deploy open-source LLMs like Llama 3 and Mixtral at scale using Together AI’s fast inference API with Python examples
Embed text and code with Voyage AI’s Python SDK to build fast similarity search and code retrieval tools
Set up W&B Weave to log every LLM call, track prompt changes, and visualize token costs across experiments
Connect to xAI’s Grok models using the OpenAI SDK with custom base URL for chat and tool use
Use Cohere’s chat, embed, rerank, and classify endpoints to build assistants with built-in citation grounding and tool use
Chain prompts, models, and parsers into clean AI workflows using LCEL’s pipe syntax and built-in streaming
Build production apps with Gemini using the Python SDK for text generation, multimodal input, tool use, and streaming
Run open-source models through Hugging Face Inference Providers: chat completion, image generation, streaming, and error handling
Set up wandb experiment tracking with real code for logging, hyperparameter sweeps, artifacts, and framework integrations
Step-by-step guide to building your first MCP server in Python and wiring it into Claude Desktop
Route requests across OpenAI, Anthropic, Azure, and dozens more providers using a single Python SDK and proxy server
Learn the Anthropic Python SDK basics: messages, streaming, system prompts, and error handling
Set up the Mistral Python SDK for chat completions, Codestral, function calling, and streaming
Create AI agents that hand off tasks, call tools, and run with built-in tracing using OpenAI’s production agent framework