Your model has great accuracy, but it performs worse for one demographic group than another. That’s a fairness problem, and accuracy alone won’t catch it. Fairlearn gives you the tools to measure exactly where bias shows up and algorithms to reduce it.

Install fairlearn alongside scikit-learn:

pip install fairlearn scikit-learn pandas

Load Data and Train a Baseline

We’ll use the Adult Census dataset, a classic benchmark in fairness research. The task is to predict whether someone earns over $50K/year, and the sex column is the sensitive feature we want to audit.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from fairlearn.metrics import MetricFrame, demographic_parity_difference, equalized_odds_difference
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Load Adult Census data
from sklearn.datasets import fetch_openml
data = fetch_openml(data_id=1590, as_frame=True)
X = data.data
y = (data.target == ">50K").astype(int)

# The sensitive feature
sensitive = X["sex"]

# Drop non-numeric columns for simplicity
X_numeric = X.select_dtypes(include=["number"]).copy()

X_train, X_test, y_train, y_test, sens_train, sens_test = train_test_split(
    X_numeric, y, sensitive, test_size=0.3, random_state=42
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train a baseline logistic regression
baseline = LogisticRegression(max_iter=1000, random_state=42)
baseline.fit(X_train_scaled, y_train)
y_pred_baseline = baseline.predict(X_test_scaled)

print(f"Baseline accuracy: {accuracy_score(y_test, y_pred_baseline):.3f}")

This gives you a working model. Now the question: is it fair?

Measure Fairness with MetricFrame

MetricFrame breaks down any sklearn metric by sensitive group. You’ll immediately see if your model treats groups differently.

metrics = {
    "accuracy": accuracy_score,
    "balanced_accuracy": balanced_accuracy_score,
}

mf = MetricFrame(
    metrics=metrics,
    y_true=y_test,
    y_pred=y_pred_baseline,
    sensitive_features=sens_test,
)

print("=== Per-group metrics ===")
print(mf.by_group)
print()
print("=== Differences (max - min across groups) ===")
print(mf.difference())
print()

dp_diff = demographic_parity_difference(
    y_test, y_pred_baseline, sensitive_features=sens_test
)
eo_diff = equalized_odds_difference(
    y_test, y_pred_baseline, sensitive_features=sens_test
)
print(f"Demographic parity difference: {dp_diff:.3f}")
print(f"Equalized odds difference:     {eo_diff:.3f}")

What the numbers mean:

  • Demographic parity difference measures the gap in positive prediction rates between groups. A value of 0 means both groups get positive predictions at the same rate. On this dataset, you’ll typically see values around 0.15-0.20, meaning one group is predicted to earn >$50K much more often.
  • Equalized odds difference measures the gap in true positive and false positive rates. It tells you whether the model’s errors are distributed unevenly.

As a rough rule of thumb, any value above 0.05-0.10 deserves attention, though the acceptable threshold depends on your application.
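The demographic parity computation is simple enough to do by hand, which helps demystify the metric. A minimal sketch with made-up toy predictions (numpy only; the values are hypothetical, not from the Adult dataset):

```python
import numpy as np

# Hypothetical toy predictions (not the Adult dataset) for two groups
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

# Selection rate: fraction of positive predictions within each group
rates = {str(g): float(y_pred[groups == g].mean()) for g in np.unique(groups)}

# Demographic parity difference: largest gap in selection rates
dp_diff = max(rates.values()) - min(rates.values())

print(rates)    # {'A': 0.75, 'B': 0.25}
print(dp_diff)  # 0.5
```

This is exactly what `demographic_parity_difference` computes: per-group selection rates, then the max-minus-min gap.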

Mitigate Bias with ExponentiatedGradient

Fairlearn’s ExponentiatedGradient retrains your model while enforcing a fairness constraint. It’s a reduction-based approach: it converts the fairness problem into a sequence of cost-sensitive classification problems.

from fairlearn.reductions import ExponentiatedGradient, DemographicParity

constraint = DemographicParity()
mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(max_iter=1000, random_state=42),
    constraints=constraint,
)

mitigator.fit(X_train_scaled, y_train, sensitive_features=sens_train)
y_pred_mitigated = mitigator.predict(X_test_scaled)

# Compare before and after
dp_after = demographic_parity_difference(
    y_test, y_pred_mitigated, sensitive_features=sens_test
)
eo_after = equalized_odds_difference(
    y_test, y_pred_mitigated, sensitive_features=sens_test
)
acc_after = accuracy_score(y_test, y_pred_mitigated)

print("=== Before Mitigation ===")
print(f"Accuracy:                   {accuracy_score(y_test, y_pred_baseline):.3f}")
print(f"Demographic parity diff:    {dp_diff:.3f}")
print(f"Equalized odds diff:        {eo_diff:.3f}")
print()
print("=== After Mitigation (ExponentiatedGradient) ===")
print(f"Accuracy:                   {acc_after:.3f}")
print(f"Demographic parity diff:    {dp_after:.3f}")
print(f"Equalized odds diff:        {eo_after:.3f}")

You’ll typically see the demographic parity difference drop from ~0.18 to under 0.02, at the cost of a few percentage points of accuracy. That’s the fairness-accuracy tradeoff, and it’s usually worth it.
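To see the tradeoff mechanically, here is a toy numpy sketch (synthetic scores and a hypothetical setup, not the Adult pipeline above): forcing equal selection rates via per-group score thresholds changes some predictions that a single global threshold would have made, which is where the accuracy cost comes from.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
group = rng.integers(0, 2, n)          # synthetic binary sensitive feature
scores = rng.normal(loc=group * 0.8)   # group 1 scores are shifted upward
y_true = (scores + rng.normal(scale=0.7, size=n) > 0.4).astype(int)

# One global threshold: selection rates differ sharply by group
pred_global = (scores > 0.4).astype(int)
rates = [pred_global[group == g].mean() for g in (0, 1)]
acc_global = (pred_global == y_true).mean()

# Per-group thresholds chosen so each group is selected at the same overall rate
target = pred_global.mean()
pred_fair = np.zeros(n, dtype=int)
for g in (0, 1):
    thr = np.quantile(scores[group == g], 1 - target)
    pred_fair[group == g] = (scores[group == g] > thr).astype(int)
rates_fair = [pred_fair[group == g].mean() for g in (0, 1)]
acc_fair = (pred_fair == y_true).mean()

print(f"global threshold: rates {rates[0]:.2f} vs {rates[1]:.2f}, accuracy {acc_global:.3f}")
print(f"per-group:        rates {rates_fair[0]:.2f} vs {rates_fair[1]:.2f}, accuracy {acc_fair:.3f}")
```

ExponentiatedGradient navigates this same tension, but by reweighting training examples rather than moving thresholds after the fact.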

Post-Processing with ThresholdOptimizer

If you can’t retrain your model (maybe it’s already deployed, or training is expensive), ThresholdOptimizer adjusts prediction thresholds per group to equalize outcomes after the fact.

from fairlearn.postprocessing import ThresholdOptimizer

postprocessor = ThresholdOptimizer(
    estimator=baseline,
    constraints="demographic_parity",
    objective="balanced_accuracy_score",
    prefit=True,
)

# Fit thresholds on training data (ideally a separate validation split),
# then evaluate on the held-out test set
postprocessor.fit(X_train_scaled, y_train, sensitive_features=sens_train)
y_pred_post = postprocessor.predict(X_test_scaled, sensitive_features=sens_test)

dp_post = demographic_parity_difference(
    y_test, y_pred_post, sensitive_features=sens_test
)
print(f"ThresholdOptimizer demographic parity diff: {dp_post:.3f}")
print(f"ThresholdOptimizer accuracy:                {accuracy_score(y_test, y_pred_post):.3f}")

ThresholdOptimizer works on any classifier that outputs probabilities. Set prefit=True when passing an already-trained model.

Build a Full Pipeline

Here’s a compact end-to-end script that loads data, trains, audits, mitigates, and compares:

import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, demographic_parity_difference
from fairlearn.reductions import ExponentiatedGradient, EqualizedOdds

# Data
data = fetch_openml(data_id=1590, as_frame=True)
X = data.data.select_dtypes(include=["number"]).copy()
y = (data.target == ">50K").astype(int)
sensitive = data.data["sex"]

X_train, X_test, y_train, y_test, s_train, s_test = train_test_split(
    X, y, sensitive, test_size=0.3, random_state=42
)

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

# Baseline
gbc = GradientBoostingClassifier(n_estimators=100, random_state=42)
gbc.fit(X_train_s, y_train)
y_base = gbc.predict(X_test_s)

# Mitigated (equalized odds this time)
mitigator = ExponentiatedGradient(
    estimator=GradientBoostingClassifier(n_estimators=100, random_state=42),
    constraints=EqualizedOdds(),
)
mitigator.fit(X_train_s, y_train, sensitive_features=s_train)
y_fair = mitigator.predict(X_test_s)

# Report
for label, preds in [("Baseline", y_base), ("Mitigated", y_fair)]:
    mf = MetricFrame(
        metrics={"accuracy": accuracy_score},
        y_true=y_test,
        y_pred=preds,
        sensitive_features=s_test,
    )
    dp = demographic_parity_difference(y_test, preds, sensitive_features=s_test)
    print(f"{label}:")
    print(f"  Overall accuracy:          {mf.overall['accuracy']:.3f}")
    print(f"  Per-group accuracy:        {dict(mf.by_group['accuracy'])}")
    print(f"  Demographic parity diff:   {dp:.3f}")
    print()

This gives you a clear before/after comparison. EqualizedOdds as the constraint forces the model to equalize both true positive and false positive rates across groups — a stricter requirement than demographic parity.
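You can check what equalized odds constrains by hand: it compares the true positive rate and false positive rate per group. A self-contained sketch with made-up toy labels (numpy only):

```python
import numpy as np

def tpr_fpr(y_true, y_pred):
    # True positive rate and false positive rate from confusion counts
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    return tp / (tp + fn), fp / (fp + tn)

# Hypothetical toy labels, four samples per group
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

results = {}
for g in ("A", "B"):
    m = groups == g
    results[g] = tpr_fpr(y_true[m], y_pred[m])
    print(f"group {g}: TPR={results[g][0]:.2f} FPR={results[g][1]:.2f}")

# Equalized odds difference = the larger of the TPR gap and the FPR gap
eo = max(abs(results["A"][0] - results["B"][0]),
         abs(results["A"][1] - results["B"][1]))
print(f"equalized odds difference: {eo:.2f}")  # 0.50
```

A perfectly calibrated constraint would drive both gaps toward zero; demographic parity, by contrast, ignores y_true entirely.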

Common Errors and Fixes

ValueError: sensitive_features has X samples, but y has Y samples

The sensitive feature array and your labels must have the same length and align row-by-row. This usually happens when you forget to split the sensitive features along with X and y:

# Wrong: using the full sensitive array with test labels
y_pred = model.predict(X_test)
dp = demographic_parity_difference(y_test, y_pred, sensitive_features=sensitive)  # mismatched lengths

# Right: split sensitive features alongside X and y
X_train, X_test, y_train, y_test, s_train, s_test = train_test_split(
    X, y, sensitive, test_size=0.3, random_state=42
)
dp = demographic_parity_difference(y_test, y_pred, sensitive_features=s_test)

UserWarning: No data for group ... from MetricFrame

Your sensitive feature column contains NaN values. Drop or impute them before splitting:

mask = sensitive.notna()
X = X[mask]
y = y[mask]
sensitive = sensitive[mask]

ThresholdOptimizer raises NotFittedError with prefit=True

You passed an unfitted estimator but told Fairlearn it’s already fitted. Either train first or remove the flag:

# Option 1: train first, then pass prefit=True
model.fit(X_train, y_train)
opt = ThresholdOptimizer(estimator=model, constraints="demographic_parity", prefit=True)

# Option 2: let ThresholdOptimizer train the model itself
opt = ThresholdOptimizer(estimator=model, constraints="demographic_parity", prefit=False)
opt.fit(X_train, y_train, sensitive_features=s_train)

ExponentiatedGradient converges slowly or times out

The default runs for 50 iterations. For large datasets or complex models, increase max_iter or use a simpler base estimator:

mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(max_iter=1000),
    constraints=DemographicParity(),
    max_iter=100,  # more iterations for convergence (default is 50)
    eps=0.02,      # relax the constraint slightly (default is 0.01)
)

A smaller eps tightens the fairness constraint but makes convergence harder; a larger one relaxes it and converges faster. Start with the default of 0.01 and decrease it only if you need tighter guarantees.