{% if favicon_base64 %} {% endif %}
{% if logo %}
{% endif %}

{{ report_title }}

{{ report_subtitle }}

Fairness Score {{ "%.4f"|format(overall_fairness_score|default(0)) }}
Protected Attributes {{ total_attributes }}
Warnings {{ total_warnings }}
Critical Issues {{ total_critical }}
Assessment {{ assessment }}
{% if has_threshold_analysis %} {% endif %} {% if warnings|length > 0 or critical_issues|length > 0 %} {% endif %}

📊 Fairness Assessment Overview

Overall fairness assessment across all protected attributes.

📈 Fairness Metrics Comparison

Comparison of all fairness metrics across protected attributes.

🎯 Fairness Radar

Multi-dimensional fairness profile for each protected attribute.

🔢 Confusion Matrices by Group

🔬 Pre-Training Fairness Analysis

Analysis of fairness in the training data BEFORE model training. These metrics are exclusive to DeepBridge and help identify bias in the data itself.

📊 All Pre-Training Metrics

BCL (Class Balance), BCO (Concept Balance), KL Divergence, JS Divergence

💡 What are Pre-Training Metrics?
  • BCL (Class Balance): Measures if groups have similar sample sizes
  • BCO (Concept Balance): Measures if groups have similar positive class rates
  • KL Divergence: Asymmetric measure of distribution difference
  • JS Divergence: Symmetric, bounded measure of distribution difference (0-1 when computed with base-2 logarithms)

👥 Group Size Distribution

Sample balance across demographic groups - identifies underrepresented groups.

⚖️ Concept Balance

Positive class rate comparison - detects outcome imbalance in training data.

⚖️ Post-Training Detailed Analysis

Advanced fairness metrics after model training, including EEOC compliance monitoring.

🎯 Disparate Impact - EEOC 80% Rule

CRITICAL LEGAL METRIC: Shows compliance with EEOC 80% rule (4/5ths rule).

⚖️ EEOC 80% Rule:

The selection rate for any protected group should be at least 80% of the rate for the highest group. Ratios below 0.8 may indicate adverse impact and potential legal issues.

  • 🟢 ≥0.8: COMPLIANT - Passes EEOC test
  • 🟡 0.7-0.8: WARNING - Borderline compliance
  • 🔴 <0.7: CRITICAL - High legal risk
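
As a rough illustration of the rule and bands above, the following sketch computes each group's selection rate, divides it by the highest group's rate, and maps the ratio to a status. The function names (`selection_rates`, `disparate_impact`, `eeoc_status`) and the sample data are hypothetical and are not taken from DeepBridge; the report's own implementation may differ.

```python
from collections import defaultdict

def selection_rates(groups, predictions):
    """Return the share of positive predictions (1s) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, pred in zip(groups, predictions):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}

def disparate_impact(groups, predictions):
    """Ratio of each group's selection rate to the highest group's rate."""
    rates = selection_rates(groups, predictions)
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has any positive prediction")
    return {g: rate / highest for g, rate in rates.items()}

def eeoc_status(ratio):
    """Map a ratio to the three bands listed above."""
    if ratio >= 0.8:
        return "COMPLIANT"
    if ratio >= 0.7:
        return "WARNING"
    return "CRITICAL"

# Made-up example: group A is selected 6 times out of 10, group B 4 out of 10.
groups = ["A"] * 10 + ["B"] * 10
predictions = [1] * 6 + [0] * 4 + [1] * 4 + [0] * 6
ratios = disparate_impact(groups, predictions)
statuses = {g: eeoc_status(r) for g, r in ratios.items()}
print(ratios)
print(statuses)
```

In this made-up data, group B's ratio is 0.4 / 0.6, roughly 0.67, which lands in the critical band.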

📊 Statistical Parity - Disparity Analysis

Shows how far each attribute deviates from perfect fairness (0.0 = perfect).

🚦 Compliance Status Matrix

Executive dashboard showing compliance status across all main metrics.

Legend: ✓ Pass ⚠ Warning ✗ Critical

📐 Complementary Fairness Metrics

Additional fairness metrics including exclusive DeepBridge metrics: Treatment Equality and Entropy Index.

🎯 Precision & Accuracy by Group

Performance metrics comparison across demographic groups.

⚖️ Treatment Equality Analysis

EXCLUSIVE DeepBridge metric: Shows if errors (FN vs FP) are balanced across groups.

💡 What is Treatment Equality?

Treatment Equality measures the ratio of False Negatives to False Positives. Groups should have similar error ratios. Points on the diagonal line indicate perfect balance.
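
The ratio described above can be sketched as follows. This is a minimal illustration that assumes binary labels and predictions; the function name `treatment_equality` and the sample data are hypothetical, and returning `None` for a group with no false positives is a choice made here, not necessarily how DeepBridge handles that case.

```python
def treatment_equality(groups, y_true, y_pred):
    """Return the FN/FP ratio per group (None when a group has no false positives)."""
    counts = {}
    for g, t, p in zip(groups, y_true, y_pred):
        fn_fp = counts.setdefault(g, [0, 0])
        if t == 1 and p == 0:
            fn_fp[0] += 1  # false negative
        elif t == 0 and p == 1:
            fn_fp[1] += 1  # false positive
    return {g: (fn / fp if fp else None) for g, (fn, fp) in counts.items()}

# Made-up example: both groups have one false positive,
# but group B has three false negatives against group A's one.
groups = ["A"] * 6 + ["B"] * 6
y_true = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
y_pred = [0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0]
te = treatment_equality(groups, y_true, y_pred)
print(te)
```

Here group A's ratio is 1.0 and group B's is 3.0, so B's errors lean much more heavily toward missed positives.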

🎯 Complementary Metrics Radar

Multi-dimensional view of 6 complementary fairness metrics.

Metrics Included:
  • Conditional Acceptance: PPV (Positive Predictive Value) parity
  • Conditional Rejection: NPV (Negative Predictive Value) parity
  • Precision Difference: Precision gap between groups
  • Accuracy Difference: Overall accuracy gap
  • Treatment Equality: FN/FP ratio balance (exclusive)
  • Entropy Index: Individual fairness via generalized entropy (exclusive)

📊 Data Distribution Analysis

Visualization of data distributions for protected attributes and target variable.

👥 Protected Attributes Distribution

Sample distribution across demographic groups - identifies representation issues.

⚠️ Minimum Representation Threshold:

Groups with less than 2% representation are typically excluded from fairness analysis due to statistical instability (EEOC "Flip-Flop Rule").
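
A representation check along these lines could look like the sketch below. The helper name `underrepresented_groups` and its `min_share` parameter are hypothetical; only the 2% default comes from the text above.

```python
from collections import Counter

def underrepresented_groups(groups, min_share=0.02):
    """Return groups whose share of all samples is below min_share."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items() if n / total < min_share}

# Made-up example: group C is 1 sample out of 100.
groups = ["A"] * 60 + ["B"] * 39 + ["C"] * 1
small = underrepresented_groups(groups)
print(small)
```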

🎯 Target Variable Distribution

Distribution of outcomes (classes) in the dataset.

{% if has_threshold_analysis %}

Threshold Analysis

Impact of decision thresholds on fairness metrics.

{% endif %} {% if warnings|length > 0 or critical_issues|length > 0 %}
{% if critical_issues|length > 0 %}

Critical Issues

Issue Severity Description
{% endif %} {% if warnings|length > 0 %}

Warnings

Warning Severity Description
{% endif %}
{% endif %}

📚 Understanding Fairness Testing

This section explains the fairness metrics, legal frameworks, and interpretation guidelines you need to understand and act on the results in this report.

⚖️ Legal Framework

EEOC 80% Rule (Four-Fifths Rule)

The Equal Employment Opportunity Commission (EEOC) established the 80% rule as a practical measure to identify adverse impact in employment decisions. Under this rule:

  • Definition: The selection rate for any protected group should be at least 80% (4/5ths) of the selection rate for the group with the highest selection rate.
  • Legal Basis: Uniform Guidelines on Employee Selection Procedures (1978)
  • Application: If the ratio falls below 0.8, it may indicate adverse impact and potential violation of civil rights laws (Title VII of the Civil Rights Act of 1964).
  • Enforcement: Used by EEOC, Department of Labor (DOL), and courts to assess discrimination in hiring, promotions, and other employment decisions.

Other Legal References

  • Title VII (Civil Rights Act, 1964): Prohibits employment discrimination based on race, color, religion, sex, or national origin.
  • Age Discrimination in Employment Act (ADEA, 1967): Protects workers 40 years and older from age-based discrimination.
  • Equal Credit Opportunity Act (ECOA, 1974): Prohibits discrimination in credit decisions.
  • Fair Housing Act (1968): Prohibits discrimination in housing-related decisions.

⚠️ Legal Disclaimer

This report is intended for informational and technical purposes only. It does not constitute legal advice. For legal compliance questions, consult with qualified legal counsel specializing in employment law and civil rights.

📊 Fairness Metric Categories

1. Pre-Training Metrics (Data Fairness)

These metrics assess fairness in the training data before model training:

  • Class Balance (BCL): Measures whether protected groups have similar sample sizes. Large imbalances can lead to biased models that perform poorly on underrepresented groups.
  • Concept Balance (BCO): Checks if protected groups have similar positive outcome rates in the training data. Imbalance here can cause the model to learn biased patterns.
  • KL Divergence: Asymmetric measure (0 to ∞) of how much one probability distribution differs from another. Higher values indicate greater distributional differences.
  • JS Divergence: Symmetric, bounded version of KL divergence (0 to 1 when computed with base-2 logarithms). Values closer to 1 indicate more dissimilar distributions between groups.
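
To make the two divergences above concrete, here is a minimal sketch using their standard textbook definitions on discrete distributions, with base-2 logarithms so that JS divergence stays within 0 to 1. The function names and sample distributions are hypothetical, and DeepBridge may estimate the distributions or choose the log base differently.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; infinite if q is zero where p is not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # a zero-probability term contributes nothing
        if qi == 0:
            return math.inf
        total += pi * math.log2(pi / qi)
    return total

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Made-up outcome distributions (negative, positive) for two groups.
p = [0.7, 0.3]
q = [0.4, 0.6]
print(kl_divergence(p, q), kl_divergence(q, p))  # the two directions differ
print(js_divergence(p, q), js_divergence(q, p))  # the two directions match
```

Swapping the arguments changes the KL value but not the JS value, which is why JS is easier to compare across pairs of groups.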

2. Post-Training Metrics (Model Fairness)

These metrics evaluate fairness in model predictions after training:

Group Fairness Metrics:

  • Statistical Parity (Demographic Parity): Requires equal positive prediction rates across groups. Measured as the difference between group rates (0 = perfect parity).
  • Disparate Impact: Ratio of selection rates between groups (0 to ∞). A value of 1.0 means perfect equality; values below 0.8 fall short of the EEOC 80% rule and may indicate adverse impact.
  • Equal Opportunity: Requires equal True Positive Rates (recall) across groups. Ensures qualified individuals have equal chances regardless of protected attributes.
  • Equalized Odds: Requires equal TPR and FPR across groups. Stricter than equal opportunity, ensuring fairness for both positive and negative outcomes.
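
The group fairness metrics above all reduce to comparing per-group rates. The sketch below assumes binary labels and predictions and reports each gap as the largest rate minus the smallest. The helper names (`group_rates`, `gap`) and the data are hypothetical, and taking the larger of the TPR and FPR gaps as a single equalized-odds number is one common convention, not necessarily the one this report uses.

```python
def group_rates(groups, y_true, y_pred):
    """Per-group selection rate, TPR and FPR from binary labels.

    Assumes every group has at least one actual positive and one actual negative.
    """
    stats = {}
    for g, t, p in zip(groups, y_true, y_pred):
        s = stats.setdefault(g, {"n": 0, "pred_pos": 0, "pos": 0, "neg": 0, "tp": 0, "fp": 0})
        s["n"] += 1
        s["pred_pos"] += p
        if t == 1:
            s["pos"] += 1
            s["tp"] += p
        else:
            s["neg"] += 1
            s["fp"] += p
    return {
        g: {
            "selection_rate": s["pred_pos"] / s["n"],
            "tpr": s["tp"] / s["pos"],
            "fpr": s["fp"] / s["neg"],
        }
        for g, s in stats.items()
    }

def gap(rates, key):
    """Largest minus smallest value of one rate across groups."""
    values = [r[key] for r in rates.values()]
    return max(values) - min(values)

# Made-up example with two groups of eight samples each.
groups = ["A"] * 8 + ["B"] * 8
y_true = [1, 1, 1, 1, 0, 0, 0, 0] * 2
y_pred = [1, 1, 1, 0, 1, 0, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0]

rates = group_rates(groups, y_true, y_pred)
statistical_parity_diff = gap(rates, "selection_rate")
equal_opportunity_diff = gap(rates, "tpr")
equalized_odds_diff = max(gap(rates, "tpr"), gap(rates, "fpr"))
print(statistical_parity_diff, equal_opportunity_diff, equalized_odds_diff)
```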

Predictive Parity Metrics:

  • Positive Predictive Value (PPV) Parity: Precision should be equal across groups. Ensures that positive predictions are equally reliable for all groups.
  • Negative Predictive Value (NPV) Parity: Negative predictions should be equally reliable across groups.
  • False Positive Rate Difference: Measures disparity in false alarm rates. Important for avoiding unfair false accusations.
  • False Negative Rate Difference: Measures disparity in missed opportunities. Important for ensuring qualified candidates aren't overlooked.
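
PPV and NPV parity can be checked in the same spirit by counting confusion-matrix cells per group. This sketch assumes binary labels and predictions; the function name `predictive_values` and the data are hypothetical.

```python
def predictive_values(groups, y_true, y_pred):
    """Per-group PPV and NPV from binary labels and predictions."""
    counts = {}
    for g, t, p in zip(groups, y_true, y_pred):
        c = counts.setdefault(g, {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
        if p == 1:
            c["tp" if t == 1 else "fp"] += 1
        else:
            c["fn" if t == 1 else "tn"] += 1
    result = {}
    for g, c in counts.items():
        pred_pos = c["tp"] + c["fp"]
        pred_neg = c["tn"] + c["fn"]
        result[g] = {
            # None marks a group with no predictions of that kind.
            "ppv": c["tp"] / pred_pos if pred_pos else None,
            "npv": c["tn"] / pred_neg if pred_neg else None,
        }
    return result

# Made-up example with two groups of eight samples each.
groups = ["A"] * 8 + ["B"] * 8
y_true = [1, 1, 1, 1, 0, 0, 0, 0] * 2
y_pred = [1, 1, 1, 0, 1, 0, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0]
values = predictive_values(groups, y_true, y_pred)
ppv_gap = abs(values["A"]["ppv"] - values["B"]["ppv"])
print(values, ppv_gap)
```

In this example group B's positive predictions are always right (PPV 1.0) while its negative predictions are less reliable than group A's, which shows why the two parities are reported separately.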

Performance Disparity Metrics:

  • Accuracy Difference: Measures overall prediction accuracy disparity across groups.
  • Precision Difference: Measures how precision varies across protected groups.

3. Complementary Metrics (DeepBridge Exclusive)

Advanced metrics for nuanced fairness assessment:

  • Treatment Equality: Compares the ratio of False Negatives to False Positives across groups. Checks that the mix of error types is balanced across groups, which overall accuracy alone does not reveal.
  • Entropy Index: Measures individual-level fairness using generalized entropy. Captures within-group disparities that group-level metrics might miss.
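
One published formulation of a generalized entropy index for classifiers assigns each individual a benefit b_i = prediction - label + 1 and measures how unevenly those benefits are spread. The sketch below follows that formulation with alpha = 2; treating it as the definition behind this report's Entropy Index is an assumption, and the function name is hypothetical.

```python
def generalized_entropy_index(y_true, y_pred, alpha=2):
    """Generalized entropy of per-individual benefits b_i = y_pred - y_true + 1.

    alpha must not be 0 or 1 in this simple form, and at least one benefit
    must be non-zero so that the mean is positive.
    """
    benefits = [p - t + 1 for t, p in zip(y_true, y_pred)]
    n = len(benefits)
    mean = sum(benefits) / n
    total = sum((b / mean) ** alpha - 1 for b in benefits)
    return total / (n * alpha * (alpha - 1))

# Made-up example: one false positive (benefit 2) and one false negative (benefit 0).
y_true = [1, 0, 1, 0]
y_pred = [1, 1, 0, 0]
index = generalized_entropy_index(y_true, y_pred)
perfect = generalized_entropy_index(y_true, y_true)
print(index, perfect)
```

A value of 0 means every individual receives the same benefit; larger values mean benefits are distributed more unequally.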

🔍 How to Interpret Results

Thresholds and Severity Levels

Severity Classification:
  • ✓ Pass: Metric values indicate fair treatment across groups.
  • ⚠ Warning: Moderate disparities detected. Monitor and investigate further.
  • ✗ Critical: Significant disparities that likely violate fairness standards and may have legal implications.

Common Thresholds

Metric                         Pass (✓)        Warning (⚠)           Critical (✗)
Disparate Impact Ratio         ≥ 0.8           0.7 - 0.8             < 0.7
Statistical Parity Difference  |diff| ≤ 0.1    0.1 < |diff| ≤ 0.2    |diff| > 0.2
Equal Opportunity Difference   |diff| ≤ 0.1    0.1 < |diff| ≤ 0.15   |diff| > 0.15
Accuracy Difference            |diff| ≤ 0.05   0.05 < |diff| ≤ 0.1   |diff| > 0.1
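
The difference-based rows of the table above can be encoded as a small lookup. The dictionary keys and the function name `classify_difference` are hypothetical labels chosen for this sketch; only the numeric limits come from the table.

```python
# (pass limit, warning limit) for |diff|, copied from the table above.
THRESHOLDS = {
    "statistical_parity_difference": (0.1, 0.2),
    "equal_opportunity_difference": (0.1, 0.15),
    "accuracy_difference": (0.05, 0.1),
}

def classify_difference(metric, diff):
    """Return 'pass', 'warning' or 'critical' for a signed difference."""
    pass_limit, warning_limit = THRESHOLDS[metric]
    magnitude = abs(diff)
    if magnitude <= pass_limit:
        return "pass"
    if magnitude <= warning_limit:
        return "warning"
    return "critical"

print(classify_difference("statistical_parity_difference", -0.12))
print(classify_difference("equal_opportunity_difference", 0.16))
print(classify_difference("accuracy_difference", 0.05))
```

The sign of the difference is dropped before comparison, so a gap of -0.12 is treated the same as 0.12.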

Interpretation Steps

  1. Review Overall Assessment: Check the Fairness Score and overall assessment at the top of the report.
  2. Identify Critical Issues: Navigate to the Issues tab to see violations requiring immediate attention.
  3. Analyze Disparate Impact: This is your primary legal compliance indicator. Values below 0.8 require investigation and potential corrective action.
  4. Examine Group-Specific Metrics: Look at each protected attribute (e.g., race, gender, age) separately to identify which groups are affected.
  5. Consider Context: Some metrics may conflict (e.g., demographic parity vs. equal opportunity). Choose metrics aligned with your use case and ethical priorities.
  6. Review Threshold Analysis: If available, examine how different decision thresholds affect fairness vs. performance trade-offs.
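
For step 6, a threshold sweep can be sketched as below: for each candidate decision threshold, apply it to the model scores and recompute the disparate impact ratio (lowest group selection rate divided by the highest). The function name, scores, and thresholds are hypothetical, and the report's own threshold analysis may track other metrics as well.

```python
def disparate_impact_by_threshold(groups, scores, thresholds):
    """Map each threshold to min/max selection-rate ratio (None if nobody is selected)."""
    results = {}
    for threshold in thresholds:
        selected = {}
        totals = {}
        for g, s in zip(groups, scores):
            totals[g] = totals.get(g, 0) + 1
            selected[g] = selected.get(g, 0) + int(s >= threshold)
        rates = [selected[g] / totals[g] for g in totals]
        results[threshold] = min(rates) / max(rates) if max(rates) > 0 else None
    return results

# Made-up scores for two groups of five samples each.
groups = ["A"] * 5 + ["B"] * 5
scores = [0.9, 0.8, 0.6, 0.4, 0.2, 0.7, 0.5, 0.45, 0.3, 0.1]
sweep = disparate_impact_by_threshold(groups, scores, [0.3, 0.5, 0.7])
print(sweep)
```

In this made-up data the ratio falls from 1.0 at a threshold of 0.3 to 0.5 at 0.7, so a stricter cutoff widens the gap between groups.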

💡 Recommendations for Addressing Bias

Pre-Training Interventions (Data-Level)

  • Re-sampling: Oversample minority groups or undersample majority groups to balance representation.
  • Re-weighting: Assign higher weights to underrepresented groups during training.
  • Data Augmentation: Generate synthetic samples for minority groups using techniques like SMOTE.
  • Feature Engineering: Remove or transform features that encode protected attributes or their proxies.
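
As one concrete form of the re-weighting item above, the sketch below uses the "reweighing" scheme from the fairness literature, in which each (group, label) combination gets the weight P(group) × P(label) / P(group, label). This is a swapped-in technique chosen for illustration, not something the report prescribes, and the function name and data are hypothetical.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Weight per (group, label) pair: expected frequency over observed frequency."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for (g, y) in joint_counts
    }

# Made-up data: group A has 4 positives out of 6, group B has 1 positive out of 4.
groups = ["A"] * 6 + ["B"] * 4
labels = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0]
weights = reweighing_weights(groups, labels)
print(weights)
```

With these weights the weighted positive rate is 0.5 in both groups, so a learner that honors sample weights sees group membership and outcome as statistically independent.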

In-Training Interventions (Algorithm-Level)

  • Adversarial Debiasing: Train models to make predictions that an adversary cannot use to predict protected attributes.
  • Fairness Constraints: Add fairness metrics as constraints or regularization terms in the loss function.
  • Prejudice Remover: Add a regularization term that penalizes models for learning associations with protected attributes.

Post-Training Interventions (Outcome-Level)

  • Threshold Optimization: Use different decision thresholds for different groups to equalize outcomes (use with caution - may not be legal in all contexts).
  • Calibration: Adjust prediction scores to ensure equal calibration across groups.
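  • Reject Option Classification: For uncertain predictions near the decision boundary, assign the favorable outcome to the disadvantaged group and the unfavorable outcome to the advantaged group, or defer those cases to human review.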

⚠️ Important Considerations:
  • Some interventions may reduce overall model performance. Document trade-offs carefully.
  • Group-specific thresholds may be illegal in certain jurisdictions and use cases.
  • Always consult legal counsel before implementing fairness interventions in production systems.
  • Document all decisions and trade-offs for audit and compliance purposes.

📖 Glossary

Key Terms

  • Protected Attribute: A characteristic protected by law from discrimination (e.g., race, gender, age).
  • Adverse Impact: A substantially different rate of selection that works to the disadvantage of a protected group.
  • Selection Rate: The proportion of applicants or candidates selected for a positive outcome.
  • True Positive Rate (TPR): Proportion of actual positives correctly identified (also called Recall or Sensitivity).
  • False Positive Rate (FPR): Proportion of actual negatives incorrectly identified as positive.
  • False Negative Rate (FNR): Proportion of actual positives incorrectly identified as negative.
  • Positive Predictive Value (PPV): Proportion of positive predictions that are correct (also called Precision).
  • Negative Predictive Value (NPV): Proportion of negative predictions that are correct.
  • Group Fairness: Fairness defined at the group level - requires similar outcomes for different demographic groups.
  • Individual Fairness: Fairness defined at the individual level - requires similar individuals to receive similar outcomes.

Acronyms

  • EEOC: Equal Employment Opportunity Commission
  • ADEA: Age Discrimination in Employment Act
  • ECOA: Equal Credit Opportunity Act
  • BCL: Class Balance (Balance Class Label)
  • BCO: Concept Balance (Balance Concept)
  • KL: Kullback-Leibler (divergence)
  • JS: Jensen-Shannon (divergence)
  • TPR: True Positive Rate
  • FPR: False Positive Rate
  • FNR: False Negative Rate
  • PPV: Positive Predictive Value
  • NPV: Negative Predictive Value

📚 References & Further Reading

Academic Papers

  • Barocas, S., Hardt, M., & Narayanan, A. (2019). Fairness and Machine Learning. fairmlbook.org
  • Hardt, M., Price, E., & Srebro, N. (2016). "Equality of Opportunity in Supervised Learning." NIPS 2016.
  • Chouldechova, A. (2017). "Fair prediction with disparate impact: A study of bias in recidivism prediction instruments." Big Data, 5(2).
  • Feldman, M., et al. (2015). "Certifying and removing disparate impact." KDD 2015.

Legal & Regulatory Resources

  • EEOC Uniform Guidelines on Employee Selection Procedures (1978)
  • Title VII of the Civil Rights Act of 1964
  • Age Discrimination in Employment Act (ADEA) of 1967
  • Equal Credit Opportunity Act (ECOA) of 1974
  • EU General Data Protection Regulation (GDPR) - Articles on Automated Decision-Making

Technical Resources

  • AI Fairness 360 (IBM) - Open source toolkit: aif360.mybluemix.net
  • Fairlearn (Microsoft) - Open source toolkit: fairlearn.org
  • Google's What-If Tool - Model fairness visualization
  • DeepBridge Documentation - Advanced fairness testing features