Most email automation stops at basic filters. Keyword matching works until it doesn’t: a filter keyed on the sender gives a message from your CEO about “lunch plans” the same treatment as one about “Q3 revenue crisis.” An LLM can take the content and context of each message into account, which usually makes it better at triage than keyword rules.
Here’s what we’re building: a Python agent that connects to your inbox over IMAP, reads unread messages, classifies them into priority buckets (urgent, action-required, normal, spam), drafts response suggestions, and moves each email into the right folder. The whole thing runs with imaplib, the email module, and the OpenAI API.
The openai package is the only external dependency (install it with pip install openai). imaplib and email ship with Python.
Connecting to IMAP and Fetching Unread Emails
IMAP gives you direct access to mailbox folders. Gmail, Outlook, and most providers support it — you just need to enable IMAP access and, for Gmail, generate an app password.
```python
import email
import imaplib
from email.header import decode_header

IMAP_SERVER = "imap.gmail.com"
IMAP_PORT = 993
EMAIL_ACCOUNT = "you@example.com"  # Placeholder; replace with your own address
EMAIL_PASSWORD = "your-app-password"  # Use an app password, not your real password


def connect_to_inbox():
    """Open an SSL connection, log in, and select the INBOX."""
    mail = imaplib.IMAP4_SSL(IMAP_SERVER, IMAP_PORT)
    mail.login(EMAIL_ACCOUNT, EMAIL_PASSWORD)
    mail.select("INBOX")
    return mail


def fetch_unread_emails(mail, max_emails=20):
    """Return up to max_emails unread messages as dicts."""
    status, message_ids = mail.search(None, "UNSEEN")
    if status != "OK":
        return []

    ids = message_ids[0].split()
    ids = ids[:max_emails]  # Cap it so you don't blow through API credits

    emails = []
    for msg_id in ids:
        status, msg_data = mail.fetch(msg_id, "(RFC822)")
        if status != "OK":
            continue

        raw_email = msg_data[0][1]
        msg = email.message_from_bytes(raw_email)

        # A subject can arrive as several separately encoded parts
        subject = ""
        decoded_parts = decode_header(msg["Subject"] or "")
        for part, charset in decoded_parts:
            if isinstance(part, bytes):
                subject += part.decode(charset or "utf-8", errors="replace")
            else:
                subject += part

        sender = msg.get("From", "")
        date = msg.get("Date", "")
        body = extract_body(msg)

        emails.append({
            "id": msg_id,  # IMAP sequence number, as bytes
            "subject": subject,
            "sender": sender,
            "date": date,
            "body": body[:3000],  # Truncate long emails to save tokens
        })
    return emails


def extract_body(msg):
    """Return the first text/plain part as a string, or "" if there is none."""
    if msg.is_multipart():
        for part in msg.walk():
            content_type = part.get_content_type()
            if content_type == "text/plain":
                payload = part.get_payload(decode=True)
                if payload:
                    charset = part.get_content_charset() or "utf-8"
                    return payload.decode(charset, errors="replace")
    else:
        payload = msg.get_payload(decode=True)
        if payload:
            charset = msg.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""
```
A few things to note: UNSEEN gives you only unread messages. We cap at 20 to avoid hammering the API on the first run. The body extraction walks multipart MIME and grabs the plain text part — HTML-only emails will need an HTML-to-text step if you want to handle those too.
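For the HTML-only case, here is a minimal sketch of an HTML-to-text step built on the standard library's html.parser. The names html_to_text and _TextExtractor are hypothetical helpers, not part of the code above, and the handling of block tags is deliberately rough. A dedicated library would cope better with messy real-world markup.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self._chunks.append("\n")  # Start block-level content on a new line

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace and drop empty lines
        lines = (" ".join(line.split()) for line in "".join(self._chunks).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    """Convert an HTML string to plain text (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

One way to use it would be to have extract_body remember the first text/html part it sees and, when no text/plain part turns up, return html_to_text of that part's decoded payload.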
Classifying Emails with OpenAI
This is where the LLM earns its keep. We send each email’s subject, sender, and body to GPT-4o-mini and ask it to classify the message. Structured output keeps the response parseable.
```python
import json

from openai import OpenAI

client = OpenAI()  # Uses OPENAI_API_KEY env var

# The literal braces in the JSON example are doubled ({{ }}) so that
# str.format() leaves them alone and only fills the named placeholders.
CLASSIFICATION_PROMPT = """You are an email triage assistant. Classify the following email into exactly one category and one priority level.

Categories: urgent, action-required, normal, spam
Priority: high, medium, low

Respond with valid JSON only, no markdown:
{{"category": "...", "priority": "...", "reason": "one sentence explanation"}}

Email:
From: {sender}
Subject: {subject}
Date: {date}
Body:
{body}
"""


def classify_email(email_data):
    prompt = CLASSIFICATION_PROMPT.format(
        sender=email_data["sender"],
        subject=email_data["subject"],
        date=email_data["date"],
        body=email_data["body"],
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.1,  # Low temp for consistent classification
        max_tokens=150,
    )
    # content can be None, so fall back to an empty string before stripping
    raw = (response.choices[0].message.content or "").strip()
    try:
        result = json.loads(raw)
    except json.JSONDecodeError:
        result = {"category": "normal", "priority": "medium", "reason": "Failed to parse LLM response"}
    return result
```
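We use gpt-4o-mini here because classification doesn’t need the full gpt-4o; the smaller model is cheaper and fast enough. The low temperature keeps results consistent from run to run, though not strictly deterministic. If the LLM returns malformed JSON (rare but possible), we default to “normal” and record the parse failure in the reason field, so the fallback is visible in the triage output instead of passing silently.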
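Valid JSON is not necessarily a valid classification: the model can return a category outside the four buckets or leave a field out. The following is a minimal sketch of a normalization step, assuming a hypothetical helper named normalize_classification (not part of the code above) that is called on the parsed result before anything else uses it.

```python
ALLOWED_CATEGORIES = {"urgent", "action-required", "normal", "spam"}
ALLOWED_PRIORITIES = {"high", "medium", "low"}


def normalize_classification(result):
    """Coerce a parsed LLM reply into a dict with known category/priority values.

    Hypothetical helper: anything unexpected falls back to normal/medium.
    """
    if not isinstance(result, dict):
        result = {}
    category = str(result.get("category", "")).strip().lower()
    priority = str(result.get("priority", "")).strip().lower()
    reason = str(result.get("reason", "")).strip()

    if category not in ALLOWED_CATEGORIES:
        # Keep a trace of what the model actually said
        reason = f"Unrecognized category {category!r}; defaulted to normal"
        category = "normal"
    if priority not in ALLOWED_PRIORITIES:
        priority = "medium"
    return {"category": category, "priority": priority, "reason": reason}
```

With this in place, later code can index the result by "category", "priority", and "reason" without guarding against missing keys.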
Drafting Response Suggestions
For emails classified as urgent or action-required, the agent drafts a reply. You don’t want to auto-send these — the agent suggests, a human approves.
```python
DRAFT_PROMPT = """Draft a brief, professional reply to this email. Keep it under 100 words. Be direct and helpful.

From: {sender}
Subject: {subject}
Body:
{body}

Classification: {category} ({priority} priority)
Reason: {reason}

Reply:"""


def draft_response(email_data, classification):
    # Only urgent and action-required mail gets a draft; spam and normal mail do not
    if classification["category"] not in ("urgent", "action-required"):
        return None

    prompt = DRAFT_PROMPT.format(
        sender=email_data["sender"],
        subject=email_data["subject"],
        body=email_data["body"],
        category=classification["category"],
        priority=classification.get("priority", "medium"),
        reason=classification.get("reason", ""),
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,  # Slightly more creative for natural-sounding replies
        max_tokens=200,
    )
    return (response.choices[0].message.content or "").strip()
```
Higher temperature here since you want the draft to sound human, not robotic. The 100-word cap in the prompt keeps replies tight — nobody wants a three-paragraph reply to a scheduling request.
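To put a suggestion where a human can review it, one option is to wrap the draft in a proper reply message and append it to the account's drafts mailbox. The sketch below makes three assumptions. It needs the parsed message object from the fetch step, which the dict built earlier does not keep. The helper names build_reply_draft and save_draft are hypothetical. The mailbox name "Drafts" is a guess, and Gmail, for example, typically exposes it as "[Gmail]/Drafts".

```python
import imaplib
import time
from email.message import EmailMessage


def _one_line(value):
    """Collapse folded header values onto a single line."""
    return " ".join(str(value or "").split())


def build_reply_draft(original, draft_text, from_addr):
    """Wrap a suggested reply in an EmailMessage addressed to the original sender.

    `original` is a parsed email.message.Message; `from_addr` is your address.
    """
    reply = EmailMessage()
    subject = _one_line(original.get("Subject"))
    if not subject.lower().startswith("re:"):
        subject = f"Re: {subject}"
    reply["Subject"] = subject
    reply["From"] = from_addr
    reply["To"] = _one_line(original.get("Reply-To") or original.get("From"))
    # Thread the reply under the original when it carries a Message-ID
    message_id = _one_line(original.get("Message-ID"))
    if message_id:
        reply["In-Reply-To"] = message_id
        reply["References"] = message_id
    reply.set_content(draft_text)
    return reply


def save_draft(mail, reply, drafts_folder="Drafts"):
    """Append the reply to a drafts mailbox; the folder name is an assumption."""
    status, _ = mail.append(
        drafts_folder,
        "\\Draft",
        imaplib.Time2Internaldate(time.time()),
        reply.as_bytes(),
    )
    return status == "OK"
```

Nothing is sent at any point: the draft sits in the mailbox until someone opens it, edits it, and presses send.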
Moving Emails to IMAP Folders
After classification, the agent moves each email into a matching folder. IMAP calls these “mailboxes.” You need to create them first if they don’t exist.
```python
FOLDER_MAP = {
    "urgent": "Triage/Urgent",
    "action-required": "Triage/ActionRequired",
    "normal": "Triage/Normal",
    "spam": "Triage/Spam",
}


def ensure_folders_exist(mail):
    for folder in FOLDER_MAP.values():
        mail.create(folder)  # Silently fails if folder already exists


def move_email(mail, msg_id, category):
    """Copy the message to its folder and flag the original for deletion."""
    folder = FOLDER_MAP.get(category, "Triage/Normal")
    status, _ = mail.copy(msg_id, folder)
    if status == "OK":
        mail.store(msg_id, "+FLAGS", "\\Deleted")
    return status == "OK"


def run_triage():
    mail = connect_to_inbox()
    ensure_folders_exist(mail)
    emails = fetch_unread_emails(mail)
    print(f"Found {len(emails)} unread emails")

    for email_data in emails:
        classification = classify_email(email_data)
        draft = draft_response(email_data, classification)

        print(f"\n--- {email_data['subject']} ---")
        print(f"  From: {email_data['sender']}")
        print(f"  Category: {classification['category']} | Priority: {classification.get('priority')}")
        print(f"  Reason: {classification.get('reason')}")
        if draft:
            print(f"  Draft reply: {draft[:120]}...")

        if move_email(mail, email_data["id"], classification["category"]):
            print(f"  Moved to: {FOLDER_MAP.get(classification['category'], 'Triage/Normal')}")
        else:
            print("  Move failed; message left in INBOX")

    # Expunge once, after the loop. Expunging inside the loop would renumber
    # the remaining messages and invalidate the sequence numbers fetched earlier.
    mail.expunge()
    mail.logout()
    print(f"\nTriage complete. Processed {len(emails)} emails.")


if __name__ == "__main__":
    run_triage()
```
The move_email function copies the message to the target folder and flags the original as deleted; run_triage then expunges all flagged messages once, after the loop. Expunging after every message would renumber the messages still in the INBOX, so the sequence numbers returned by the earlier search would start pointing at the wrong emails. Copy, flag, expunge is how a “move” works in the base IMAP protocol, which has no native move command (though some servers support the MOVE extension).
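If you would rather identify messages by UID, which stays stable when other messages are expunged, the move can be sketched as follows. This assumes the ids were obtained with mail.uid("SEARCH", None, "UNSEEN") and fetched with mail.uid("FETCH", ...) instead of search and fetch. The name move_by_uid is hypothetical. The sketch also assumes that your Python version's imaplib accepts "MOVE" as a command for uid(); check that before relying on it.

```python
def move_by_uid(mail, uid, folder):
    """Move one message, identified by UID, into `folder`.

    Uses the MOVE extension when the server advertises it and falls back
    to copy + flag + expunge otherwise. Returns True on success.
    """
    if "MOVE" in mail.capabilities:
        status, _ = mail.uid("MOVE", uid, folder)
        return status == "OK"

    status, _ = mail.uid("COPY", uid, folder)
    if status != "OK":
        return False
    mail.uid("STORE", uid, "+FLAGS", "\\Deleted")
    mail.expunge()  # UIDs do not change when messages are expunged
    return True
```

Note that expunge removes every message in the selected mailbox that carries the \Deleted flag, not only the one just copied.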
Run the full agent:
```shell
OPENAI_API_KEY=sk-your-key python email_triage.py
```
Common Errors and Fixes
imaplib.IMAP4.error: b'[AUTHENTICATIONFAILED] ...': Your credentials are wrong or IMAP access is disabled. For Gmail, create an App Password under your Google Account security settings; app passwords are only available once 2-Step Verification is turned on. Google has retired the older “Less secure app access” setting, so it is no longer an alternative.
imaplib.IMAP4.error: b'[ALERT] Please log in via your web browser' — Google is blocking the login. Visit https://accounts.google.com/DisplayUnlockCaptcha, unlock it, then retry within a few minutes.
json.JSONDecodeError on classification — The LLM occasionally wraps JSON in markdown code fences. Strip them before parsing:
```python
raw = raw.strip()
if raw.startswith("```"):
    raw = raw.split("\n", 1)[1]  # Remove opening fence
    raw = raw.rsplit("```", 1)[0]  # Remove closing fence
result = json.loads(raw.strip())
```
Emails stuck in INBOX after move — Some IMAP servers require selecting the INBOX again after expunge. Add mail.select("INBOX") after mail.expunge() if messages aren’t disappearing.
UnicodeDecodeError on email body — Not all emails declare their charset correctly. The errors="replace" parameter in our decode() calls handles this, but if you’re seeing garbled text, try detecting the encoding with chardet:
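```python
import chardet  # Third-party package: pip install chardet


def safe_decode(payload):
    """Decode bytes using the encoding chardet guesses, defaulting to UTF-8."""
    detected = chardet.detect(payload)
    encoding = detected.get("encoding") or "utf-8"
    return payload.decode(encoding, errors="replace")
```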
Rate limits on OpenAI API: If you’re processing hundreds of emails, add a delay between calls or batch them. Rate limits depend on your account tier. A time.sleep(0.5) between classifications lowers the chance of 429 errors but does not rule them out, so retry with a backoff when one does occur.
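A minimal sketch of retrying with exponential backoff is shown below. The helper call_with_backoff and its parameters are hypothetical, and it is written generically so that it does not depend on the OpenAI client.

```python
import random
import time


def call_with_backoff(fn, *args, retry_on=(Exception,), max_attempts=5,
                      base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying with exponential backoff on failure.

    Hypothetical helper: `retry_on` is the tuple of exception types that
    should trigger a retry; the last failure is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args, **kwargs)
        except retry_on:
            if attempt == max_attempts:
                raise
            # Wait 1s, 2s, 4s, ... plus up to 25% random jitter
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.25))
```

With version 1 or later of the openai package, a call such as call_with_backoff(classify_email, email_data, retry_on=(openai.RateLimitError,)) would retry only on rate-limit responses and let other errors surface immediately.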