Sentiment analysis gives you positive, negative, or neutral. That is rarely enough. When users write “I can’t believe they actually shipped this,” you need to know whether that is admiration, anger, or surprise. GoEmotions, Google’s dataset of 58k Reddit comments labeled across 27 emotions plus neutral, solves this. And there are solid fine-tuned models on Hugging Face ready to use right now.
Here is the fastest way to get emotion predictions running:
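A minimal sketch using the transformers pipeline (the variable names here are illustrative):

```python
from transformers import pipeline

# Load a RoBERTa model fine-tuned on GoEmotions.
classifier = pipeline(
    "text-classification",
    model="SamLowe/roberta-base-go_emotions",
    top_k=None,  # return scores for all 28 labels
)

scores = classifier(["I can't believe they actually shipped this"])[0]
# Keep only emotions above a 0.3 confidence threshold.
active = [s for s in scores if s["score"] > 0.3]
print(active)
```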
That loads a RoBERTa model fine-tuned on GoEmotions, runs inference, and filters to emotions above a 0.3 confidence threshold. You get multi-label output because real text carries multiple emotions at once.
The GoEmotions Label Set
GoEmotions defines 27 emotion categories plus a neutral class. These are not arbitrary – they were selected through taxonomic research to cover the range of emotions expressed in online text. The full set:
admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise, neutral
This granularity matters. A customer support system that can distinguish “disappointment” from “anger” can route tickets differently. A social media monitor that separates “confusion” from “disapproval” gives product teams actionable signal instead of a vague negative score.
The model outputs a score for every label on every input. That is what makes this multi-label classification rather than multi-class – a single sentence can be both “gratitude” and “joy” simultaneously.
Multi-Label Inference with Thresholding
The raw model output gives you scores for all 28 labels. You need a threshold strategy to decide which emotions are “active.” A fixed threshold of 0.3 works well as a starting point, but you can tune per-label thresholds if you have evaluation data.
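A sketch of the manual approach with AutoModel, assuming the default 0.3 threshold (the classify helper is illustrative):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "SamLowe/roberta-base-go_emotions"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

def classify(texts, threshold=0.3):
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Sigmoid gives each label an independent probability (multi-label).
    probs = torch.sigmoid(logits)
    results = []
    for row in probs:
        labels = [
            (model.config.id2label[i], float(p))
            for i, p in enumerate(row)
            if p >= threshold
        ]
        results.append(sorted(labels, key=lambda x: -x[1]))
    return results

print(classify(["Thank you so much, this made my day!"]))
```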
Notice we use torch.sigmoid instead of softmax. This is a multi-label problem – each label gets an independent probability via sigmoid, not a probability distribution that sums to one. Using softmax here would suppress co-occurring emotions and give you worse results.
Tuning the Threshold
Lower thresholds (0.1-0.2) catch more emotions but increase false positives. Higher thresholds (0.5+) give you only strong signals but miss subtler emotions. For production systems, run a sweep against labeled data:
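One way to run such a sweep, shown here on synthetic scores (with real data, probs would be the model's sigmoid outputs and y_true your labeled indicator matrix):

```python
import numpy as np
from sklearn.metrics import f1_score

def sweep_thresholds(probs, y_true, thresholds=np.arange(0.1, 0.65, 0.05)):
    """Return the (threshold, micro-F1) pair that scores best."""
    best = (None, -1.0)
    for t in thresholds:
        preds = (probs >= t).astype(int)
        score = f1_score(y_true, preds, average="micro", zero_division=0)
        if score > best[1]:
            best = (round(float(t), 2), score)
    return best

# Toy demo: synthetic sigmoid scores and labels derived from them.
rng = np.random.default_rng(0)
probs = rng.random((100, 28))
y_true = (probs > 0.5).astype(int)
best_t, best_f1 = sweep_thresholds(probs, y_true)
print(best_t, round(best_f1, 3))
```

Micro-averaged F1 is one reasonable sweep metric; per-label F1 works the same way if you tune thresholds label by label.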
At 0.1 you might get six or seven labels. At 0.5 you might get one or two. Pick the threshold that matches your use case’s tolerance for noise.
Batch Processing with the Pipeline API
For processing many texts at once, the Transformers pipeline handles batching efficiently. It automatically pads and batches inputs for GPU throughput.
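A sketch of batched pipeline inference (the example texts are illustrative):

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="SamLowe/roberta-base-go_emotions",
    top_k=None,     # all 28 labels per input, sorted by score
    batch_size=32,  # how many inputs go through each forward pass
    # device=0,     # uncomment to run on GPU
)

texts = [
    "Thanks so much for the quick fix!",
    "Why would they remove that feature?",
    "I'm not sure what this setting does.",
]
for text, scores in zip(texts, classifier(texts)):
    print(text, "->", scores[0]["label"], round(scores[0]["score"], 2))
```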
Setting top_k=None returns all 28 labels sorted by score. If you only want the top 5, set top_k=5 to trim the output – the model still scores every label either way, so this shrinks the response rather than the compute. The batch_size parameter controls how many inputs get processed together – larger batches use more memory but run faster on GPU.
For very large datasets, use a generator to avoid loading everything into memory:
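For example – the demo below writes a small file to stand in for a large dataset, then streams it line by line (the file path and stream_texts helper are illustrative):

```python
import os
import tempfile

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="SamLowe/roberta-base-go_emotions",
    top_k=None,
    batch_size=32,
)

def stream_texts(path):
    """Yield one comment per line without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield line

# Demo file standing in for a large dataset.
path = os.path.join(tempfile.gettempdir(), "comments.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("This is amazing, thank you!\nWhy is this so confusing?\n")

# The pipeline consumes the generator lazily, batching internally.
results = list(classifier(stream_texts(path)))
print(len(results))
```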
Serving the Model via FastAPI
Wrapping this in a FastAPI service takes about 40 lines. Use the lifespan context manager to load the model once at startup and share it across requests.
Save that as app.py and run it:
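For example (uvicorn serves on port 8000 by default):

```shell
pip install fastapi "uvicorn[standard]" transformers torch
uvicorn app:app --host 0.0.0.0 --port 8000
```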
Test it with curl:
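For example, assuming the service exposes POST /classify and /classify/batch routes on port 8000:

```shell
curl -X POST http://localhost:8000/classify \
  -H "Content-Type: application/json" \
  -d '{"text": "I cannot believe they actually shipped this!"}'

curl -X POST http://localhost:8000/classify/batch \
  -H "Content-Type: application/json" \
  -d '{"texts": ["Thanks a lot!", "This is so confusing."]}'
```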
The batch endpoint handles multiple texts in a single request, which is significantly faster than making individual calls when you have many inputs to process.
Common Errors and Fixes
RuntimeError: CUDA out of memory – The model is about 500MB. If your GPU runs out of memory during batching, reduce batch_size or switch to CPU by removing the device=0 argument. For CPU inference, each request takes roughly 50-100ms which is fine for most API use cases.
The model was not found on Hugging Face Hub – Double check the model ID is exactly SamLowe/roberta-base-go_emotions. Model IDs are case-sensitive. If you are behind a corporate proxy, make sure HF_HUB_OFFLINE is not set to 1 and point HTTPS_PROXY at your proxy.
All scores are very low (below 0.1) – This usually means the input text is too short or too different from Reddit-style comments the model was trained on. GoEmotions was built from Reddit data, so very formal or domain-specific text may not classify well. Consider fine-tuning on your own data if this is a persistent issue.
torch.sigmoid vs torch.softmax confusion – If you are getting exactly one emotion per input and the scores sum to 1.0, you accidentally used softmax. This model is multi-label – use sigmoid so each label gets an independent probability.
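A quick way to see the difference on toy logits (the values here are made up for illustration):

```python
import torch

logits = torch.tensor([3.0, 1.5, -3.0])  # toy logits for three labels

sig = torch.sigmoid(logits)          # independent per-label probabilities
soft = torch.softmax(logits, dim=0)  # forced to sum to 1.0

print(sig)   # two labels clear a 0.3 threshold
print(soft)  # only the winner survives; the runner-up is suppressed
```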
Slow first request – The model downloads on first use (about 500MB). Subsequent loads use the Hugging Face cache at ~/.cache/huggingface/. In Docker deployments, mount this directory as a volume or bake the model into the image to avoid download delays.
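For example (the image name my-emotion-api is a placeholder):

```shell
# Option 1: mount the local cache so containers reuse downloaded weights.
docker run -v ~/.cache/huggingface:/root/.cache/huggingface my-emotion-api

# Option 2: bake the model into the image in your Dockerfile.
# RUN python -c "from transformers import pipeline; \
#     pipeline('text-classification', model='SamLowe/roberta-base-go_emotions')"
```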
FastAPI TypeError: 'async_generator' object is not callable – Make sure you are using @asynccontextmanager from contextlib and passing the lifespan function (not calling it) to the FastAPI() constructor. The pattern is FastAPI(lifespan=lifespan), not FastAPI(lifespan=lifespan()).
Related Guides
- How to Build an Aspect-Based Sentiment Analysis Pipeline
- How to Build a Text Entailment and Contradiction Detection Pipeline
- How to Build an Extractive Question Answering System with Transformers
- How to Build a Sentiment-Aware Search Pipeline with Embeddings
- How to Build a Multilingual Sentiment Pipeline with XLM-RoBERTa
- How to Build an Abstractive Summarization Pipeline with PEGASUS
- How to Build a RAG Pipeline with Hugging Face Transformers v5
- How to Build a Text-to-Knowledge-Graph Pipeline with SpaCy and NetworkX
- How to Build a Text Summarization Pipeline with Sumy and Transformers
- How to Build a Keyphrase Generation Pipeline with KeyphraseVectorizers