AI Ethics and Bias
Build fair, ethical, and responsible AI systems.
What is AI Bias?
AI bias occurs when a system makes systematically unfair decisions based on protected attributes such as gender, race, or age.
The core problem: AI learns patterns from historical data, and historical data often reflects human and societal bias.
Types of Bias
Data bias: Training data is not representative of the population the model will serve (a quick check is sketched below)
Algorithm bias: Model design or objective favors certain groups
Evaluation bias: Test data doesn't reflect real-world conditions
Deployment bias: Model is used in contexts it was not designed or validated for
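A quick way to spot data bias is to compare group proportions in the training data against the population the model will serve. A minimal sketch (the column name and reference shares here are hypothetical):
import pandas as pd

# Hypothetical training data with a 'gender' column
train_df = pd.DataFrame({'gender': ['M', 'M', 'M', 'F', 'M', 'F']})

# Share of each group in the training data
train_share = train_df['gender'].value_counts(normalize=True)

# Hypothetical reference shares for the target population
reference_share = pd.Series({'M': 0.5, 'F': 0.5})

# Large gaps suggest the training data is not representative
print((train_share - reference_share).abs())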
Detecting Bias
# Install
# pip install fairlearn
from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score
import pandas as pd
# Assume you have predictions and sensitive features
y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]
gender = ['M', 'F', 'F', 'M', 'F', 'M']
# Calculate metrics by group
metric_frame = MetricFrame(
    metrics=accuracy_score,
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=gender
)
print("Overall accuracy:", metric_frame.overall)
print("By group:")
print(metric_frame.by_group)
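MetricFrame also accepts a dictionary of metrics, which makes it easy to compare accuracy and selection rate per group in one pass; a small extension of the example above:
from fairlearn.metrics import MetricFrame, selection_rate
from sklearn.metrics import accuracy_score

multi_frame = MetricFrame(
    metrics={'accuracy': accuracy_score, 'selection_rate': selection_rate},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=gender
)
print(multi_frame.by_group)      # one row per group
print(multi_frame.difference())  # largest between-group gap for each metric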
Fairness Metrics
from fairlearn.metrics import (
    demographic_parity_difference,
    equalized_odds_difference
)
# Demographic parity: gap in positive prediction rates between groups (0 = parity)
dp = demographic_parity_difference(
    y_true, y_pred,
    sensitive_features=gender
)
print(f"Demographic parity difference: {dp:.3f}")
# Equalized odds: largest gap in true/false positive rates between groups (0 = parity)
eo = equalized_odds_difference(
    y_true, y_pred,
    sensitive_features=gender
)
print(f"Equalized odds difference: {eo:.3f}")
Mitigating Bias
1. Better Data Collection
# Check data distribution
df = pd.DataFrame({'gender': gender, 'outcome': y_true})
print(df.groupby('gender')['outcome'].value_counts())
# Balance dataset
from sklearn.utils import resample
# Oversample minority group
minority = df[df['gender'] == 'F']
majority = df[df['gender'] == 'M']
minority_upsampled = resample(
    minority,
    replace=True,
    n_samples=len(majority),
    random_state=42
)
balanced_df = pd.concat([majority, minority_upsampled])
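A quick sanity check after resampling: both groups should now appear equally often in the balanced data.
# Each gender group should now have the same number of rows
print(balanced_df['gender'].value_counts())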
2. Fair Model Training
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from sklearn.linear_model import LogisticRegression
# Fair classifier (assumes X_train, y_train, sensitive_train already exist; see the sketch below)
model = LogisticRegression()
constraint = DemographicParity()
mitigator = ExponentiatedGradient(model, constraint)
mitigator.fit(X_train, y_train, sensitive_features=sensitive_train)
# Predict
y_pred = mitigator.predict(X_test)
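The snippet above assumes the training arrays already exist. A minimal end-to-end sketch on synthetic data (all data and variable names here are made up for illustration):
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from fairlearn.metrics import demographic_parity_difference

# Synthetic data: two numeric features, a binary label, and a binary sensitive attribute
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
sensitive = rng.choice(['M', 'F'], size=500)
y = (X[:, 0] + 0.5 * (sensitive == 'M') + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_train, X_test, y_train, y_test, sensitive_train, sensitive_test = train_test_split(
    X, y, sensitive, test_size=0.3, random_state=42
)

mitigator = ExponentiatedGradient(LogisticRegression(), DemographicParity())
mitigator.fit(X_train, y_train, sensitive_features=sensitive_train)
y_pred_mitigated = mitigator.predict(X_test)

# The gap in positive prediction rates between groups should now be small
print(demographic_parity_difference(
    y_test, y_pred_mitigated, sensitive_features=sensitive_test
))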
3. Post-processing
from fairlearn.postprocessing import ThresholdOptimizer
# Optimize thresholds per group
optimizer = ThresholdOptimizer(
    estimator=model,
    constraints="demographic_parity"
)
optimizer.fit(X_train, y_train, sensitive_features=sensitive_train)
y_pred_fair = optimizer.predict(X_test, sensitive_features=sensitive_test)
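To confirm the post-processing helped, compare the demographic parity difference of an unconstrained baseline against the threshold-optimized predictions; a sketch that reuses X_train, y_test, and the other arrays from the synthetic split above:
from sklearn.linear_model import LogisticRegression
from fairlearn.metrics import demographic_parity_difference

# Unconstrained baseline trained on the same split
baseline = LogisticRegression().fit(X_train, y_train)

dp_before = demographic_parity_difference(
    y_test, baseline.predict(X_test), sensitive_features=sensitive_test
)
dp_after = demographic_parity_difference(
    y_test, y_pred_fair, sensitive_features=sensitive_test
)
print(f"DP difference before: {dp_before:.3f}, after: {dp_after:.3f}")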
Explainability
Make AI decisions transparent:
# Install
# pip install shap
import shap
# Explain predictions (the model must already be trained)
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)
# Visualize
shap.plots.waterfall(shap_values[0])
shap.plots.beeswarm(shap_values)
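SHAP can also feed into a bias audit: a global importance view shows whether features that may act as proxies for sensitive attributes (for example, zip code standing in for race) dominate the model's decisions. A small follow-up using the same shap_values:
# Global view: mean absolute SHAP value per feature across the test set
shap.plots.bar(shap_values)

# If a suspected proxy feature ranks near the top, investigate it further,
# e.g. by checking how strongly it correlates with the sensitive attribute.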
Ethical Guidelines
Transparency: Explain how AI makes decisions
Fairness: Treat all groups equally
Privacy: Protect user data
Accountability: Take responsibility for AI actions
Safety: Ensure AI doesn't cause harm
Fairness Trade-offs
Accuracy vs fairness: A fairer model may be somewhat less accurate (see the sketch below)
Group fairness vs individual fairness: The two can't always be satisfied at the same time
Different fairness definitions: Metrics such as demographic parity and equalized odds can conflict with each other
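One way to see the accuracy/fairness trade-off concretely is to sweep the fairness tolerance of ExponentiatedGradient (its eps parameter, smaller = stricter) and record both metrics. A rough sketch reusing the synthetic split from earlier; the eps values are arbitrary:
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from fairlearn.metrics import demographic_parity_difference
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

for eps in [0.2, 0.1, 0.05, 0.01]:
    mitigator = ExponentiatedGradient(
        LogisticRegression(), DemographicParity(), eps=eps
    )
    mitigator.fit(X_train, y_train, sensitive_features=sensitive_train)
    preds = mitigator.predict(X_test)
    acc = accuracy_score(y_test, preds)
    dp = demographic_parity_difference(y_test, preds, sensitive_features=sensitive_test)
    print(f"eps={eps:.2f}  accuracy={acc:.3f}  DP difference={dp:.3f}")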
Best Practices
- Diverse teams: Include different perspectives
- Regular audits: Check for bias continuously (a minimal audit helper is sketched after this list)
- User feedback: Listen to affected communities
- Documentation: Record decisions and trade-offs
- Testing: Test on diverse populations
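A lightweight way to make audits and testing repeatable is to wrap per-group reporting into a helper that runs on every model version. A sketch; the function name and metric selection are arbitrary choices:
from fairlearn.metrics import MetricFrame, selection_rate, false_positive_rate
from sklearn.metrics import accuracy_score

def fairness_audit(y_true, y_pred, sensitive_features):
    """Return per-group accuracy, selection rate, and false positive rate."""
    frame = MetricFrame(
        metrics={
            'accuracy': accuracy_score,
            'selection_rate': selection_rate,
            'false_positive_rate': false_positive_rate,
        },
        y_true=y_true,
        y_pred=y_pred,
        sensitive_features=sensitive_features,
    )
    return frame.by_group

# Example: audit the post-processed predictions from earlier
# print(fairness_audit(y_test, y_pred_fair, sensitive_test))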
Real-world Examples
Hiring: Resume screening tools must not discriminate by gender or race
Lending: Loan approvals should be fair across demographic groups
Criminal justice: Risk assessments must not be biased against particular groups
Healthcare: Treatment recommendations should work equally well for all patient groups
Remember
- All AI systems have bias - minimize it!
- Test fairness on multiple groups
- Balance fairness with accuracy
- Be transparent about limitations
- Ethics is an ongoing process, not a one-time check