Fairness Score
{{ "%.4f"|format(overall_fairness_score|default(0)) }}
Protected Attributes
{{ total_attributes }}
Warnings
{{ total_warnings }}
Critical Issues
{{ total_critical }}
Assessment
{{ assessment }}
Fairness Assessment Overview
Overall fairness assessment across all protected attributes.
Fairness Metrics Comparison
Comparison of all fairness metrics across protected attributes.
Fairness Radar
Multi-dimensional fairness profile for each protected attribute.
Confusion Matrices by Group
Pre-Training Fairness Analysis
Analysis of fairness in the training data BEFORE model training.
These metrics are exclusive to DeepBridge and help identify bias in the data itself.
All Pre-Training Metrics
BCL (Class Balance), BCO (Concept Balance), KL Divergence, JS Divergence
What are Pre-Training Metrics?
- BCL (Class Balance): Measures whether groups have similar sample sizes
- BCO (Concept Balance): Measures whether groups have similar positive class rates
- KL Divergence: Asymmetric measure of distribution difference
- JS Divergence: Symmetric, bounded measure of distribution difference (0-1 with base-2 logarithms)
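As a rough illustration of the two divergences listed above, the sketch below compares the outcome distributions of two groups. The function names and example distributions are hypothetical and are not taken from DeepBridge. Base-2 logarithms are assumed, which is what keeps JS divergence between 0 and 1.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits. Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded by 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical [negative, positive] outcome distributions for two groups.
group_a = [0.7, 0.3]
group_b = [0.4, 0.6]

kl_ab = kl_divergence(group_a, group_b)  # differs from kl_ba (asymmetric)
kl_ba = kl_divergence(group_b, group_a)
js_ab = js_divergence(group_a, group_b)  # equals js_ba (symmetric)
js_ba = js_divergence(group_b, group_a)
```

Swapping the arguments changes the KL value but not the JS value, which is why JS divergence is easier to compare across attributes.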
Group Size Distribution
Sample balance across demographic groups - identifies underrepresented groups.
Concept Balance
Positive class rate comparison - detects outcome imbalance in training data.
Post-Training Detailed Analysis
Advanced fairness metrics after model training, including EEOC compliance monitoring.
Disparate Impact - EEOC 80% Rule
CRITICAL LEGAL METRIC: Shows compliance with the EEOC 80% rule (4/5ths rule).
EEOC 80% Rule:
The selection rate for any protected group should be at least 80% of the rate for the group with the highest selection rate.
Ratios below 0.8 may indicate adverse impact and potential legal issues.
- 🟢 ≥0.8: COMPLIANT - Passes EEOC test
- 🟡 0.7-0.8: WARNING - Borderline compliance
- 🔴 <0.7: CRITICAL - High legal risk
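The 80% rule above amounts to a short calculation. The following is a minimal sketch under stated assumptions: the function names and the applicant counts are hypothetical, and the bands are the ones from the legend above. It assumes at least one group has a non-zero selection rate.

```python
def selection_rates(counts):
    """Map each group to selected / total; input is {group: (selected, total)}."""
    return {group: selected / total for group, (selected, total) in counts.items()}

def disparate_impact_ratios(rates):
    """Ratio of each group's selection rate to the highest group's rate."""
    highest = max(rates.values())
    return {group: rate / highest for group, rate in rates.items()}

def eeoc_status(ratio):
    """Bands from the legend above: >=0.8, 0.7-0.8, <0.7."""
    if ratio >= 0.8:
        return "COMPLIANT"
    if ratio >= 0.7:
        return "WARNING"
    return "CRITICAL"

# Hypothetical counts: (selected, total applicants) per group.
counts = {"group_a": (50, 100), "group_b": (36, 100), "group_c": (30, 100)}
rates = selection_rates(counts)
ratios = disparate_impact_ratios(rates)
statuses = {group: eeoc_status(ratio) for group, ratio in ratios.items()}
```

In this example group_b is selected at 72% of group_a's rate and lands in the warning band, while group_c at 60% is critical.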
Statistical Parity - Disparity Analysis
Shows how far each attribute deviates from perfect fairness (0.0 = perfect).
Compliance Status Matrix
Executive dashboard showing compliance status across all main metrics.
Legend:
✓ Pass
⚠ Warning
✗ Critical
Complementary Fairness Metrics
Additional fairness metrics including exclusive DeepBridge metrics:
Treatment Equality and Entropy Index.
Precision & Accuracy by Group
Performance metrics comparison across demographic groups.
Treatment Equality Analysis
EXCLUSIVE DeepBridge metric: Shows whether errors (FN vs. FP) are balanced across groups.
What is Treatment Equality?
Treatment Equality measures the ratio of False Negatives to False Positives.
Groups should have similar error ratios. Points on the diagonal line indicate perfect balance.
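To make the FN/FP ratio concrete, here is a small sketch that computes it for two groups. The helper name and the label arrays are hypothetical. Returning None when a group has no false positives is an assumption made here, not necessarily how DeepBridge handles that case.

```python
def treatment_equality_ratio(y_true, y_pred):
    """Return FN / FP for one group, or None when there are no false positives."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return fn / fp if fp else None

# Hypothetical true labels and predictions for two groups.
group_a_true = [1, 1, 1, 0, 0, 0, 1, 0]
group_a_pred = [1, 0, 0, 1, 0, 0, 1, 1]
group_b_true = [1, 1, 0, 0, 0, 1, 0, 0]
group_b_pred = [0, 1, 1, 1, 1, 1, 0, 0]

ratio_a = treatment_equality_ratio(group_a_true, group_a_pred)  # 2 FN / 2 FP
ratio_b = treatment_equality_ratio(group_b_true, group_b_pred)  # 1 FN / 3 FP
```

Here group_a's errors are evenly split, while group_b's errors are mostly false positives, so the two groups would sit at different distances from the diagonal.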
Complementary Metrics Radar
Multi-dimensional view of 6 complementary fairness metrics.
Metrics Included:
- Conditional Acceptance: PPV (Positive Predictive Value) parity
- Conditional Rejection: NPV (Negative Predictive Value) parity
- Precision Difference: Precision gap between groups
- Accuracy Difference: Overall accuracy gap
- Treatment Equality: FN/FP ratio balance (exclusive)
- Entropy Index: Individual fairness via generalized entropy (exclusive)
Data Distribution Analysis
Visualization of data distributions for protected attributes and target variable.
Protected Attributes Distribution
Sample distribution across demographic groups - identifies representation issues.
Minimum Representation Threshold:
Groups with less than 2% representation are typically excluded from fairness analysis
due to statistical instability (EEOC "Flip-Flop Rule").
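The representation check described above can be sketched as follows. The function names and sample data are hypothetical, and the 2% cutoff is passed in as a parameter because the appropriate value may depend on the analysis.

```python
from collections import Counter

def group_shares(values):
    """Fraction of samples that fall into each group."""
    counts = Counter(values)
    total = len(values)
    return {group: count / total for group, count in counts.items()}

def split_by_representation(values, min_share=0.02):
    """Return (kept, excluded) group sets based on a minimum sample share."""
    shares = group_shares(values)
    kept = {group for group, share in shares.items() if share >= min_share}
    return kept, set(shares) - kept

# Hypothetical protected-attribute column with one very small group.
values = ["a"] * 60 + ["b"] * 39 + ["c"] * 1
kept, excluded = split_by_representation(values)
```

Group "c" makes up 1% of the sample, so it falls below the threshold and is reported separately rather than scored.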
Target Variable Distribution
Distribution of outcomes (classes) in the dataset.
Threshold Analysis
Impact of decision thresholds on fairness metrics.
{% endif %}
{% if warnings|length > 0 or critical_issues|length > 0 %}
{% if critical_issues|length > 0 %}
Critical Issues
| Issue | Severity | Description |
{% endif %}
{% if warnings|length > 0 %}
Warnings
| Warning | Severity | Description |
{% endif %}
Understanding Fairness Testing
This section provides comprehensive information about fairness metrics, legal frameworks,
and interpretation guidelines to help you understand and act on the results presented in this report.
Legal Framework
EEOC 80% Rule (Four-Fifths Rule)
The Equal Employment Opportunity Commission (EEOC) established the 80% rule
as a practical measure to identify adverse impact in employment decisions. Under this rule:
- Definition: The selection rate for any protected group should be at least 80% (4/5ths)
of the selection rate for the group with the highest selection rate.
- Legal Basis: Uniform Guidelines on Employee Selection Procedures (1978)
- Application: If the ratio falls below 0.8, it may indicate adverse impact
and potential violation of civil rights laws (Title VII of the Civil Rights Act of 1964).
- Enforcement: Used by EEOC, Department of Labor (DOL), and courts to assess
discrimination in hiring, promotions, and other employment decisions.
Other Legal References
- Title VII (Civil Rights Act, 1964): Prohibits employment discrimination based on
race, color, religion, sex, or national origin.
- Age Discrimination in Employment Act (ADEA, 1967): Protects workers 40 years and older
from age-based discrimination.
- Equal Credit Opportunity Act (ECOA, 1974): Prohibits discrimination in credit decisions.
- Fair Housing Act (1968): Prohibits discrimination in housing-related decisions.
Legal Disclaimer
This report is intended for informational and technical purposes only. It does not constitute
legal advice. For legal compliance questions, consult with qualified legal counsel specializing
in employment law and civil rights.
Fairness Metric Categories
1. Pre-Training Metrics (Data Fairness)
These metrics assess fairness in the training data before model training:
- Class Balance (BCL): Measures whether protected groups have similar sample sizes.
Large imbalances can lead to biased models that perform poorly on underrepresented groups.
- Concept Balance (BCO): Checks if protected groups have similar positive outcome rates
in the training data. Imbalance here can cause the model to learn biased patterns.
- KL Divergence: Asymmetric measure (0 to ∞) of how much one probability distribution
differs from another. Higher values indicate greater distributional differences.
- JS Divergence: Symmetric, bounded version of KL divergence (0 to 1 when computed with
base-2 logarithms). Values closer to 1 indicate more dissimilar distributions between groups.
2. Post-Training Metrics (Model Fairness)
These metrics evaluate fairness in model predictions after training:
Group Fairness Metrics:
- Statistical Parity (Demographic Parity): Requires equal positive prediction rates
across groups. Measured as the difference between group rates (0 = perfect parity).
- Disparate Impact: Ratio of selection rates between groups (0 to ∞). Values below 0.8
fail the EEOC 80% rule and may indicate adverse impact. Value of 1.0 = perfect equality.
- Equal Opportunity: Requires equal True Positive Rates (recall) across groups.
Ensures qualified individuals have equal chances regardless of protected attributes.
- Equalized Odds: Requires equal TPR and FPR across groups. Stricter than equal opportunity,
ensuring fairness for both positive and negative outcomes.
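The group fairness metrics above can all be derived from per-group rates. The sketch below is illustrative only: the names and data are hypothetical, and summarizing equalized odds as the larger of the TPR and FPR gaps is one common convention, assumed here rather than confirmed for DeepBridge.

```python
def group_rates(y_true, y_pred):
    """Selection rate, TPR, and FPR for one group (binary 0/1 labels).

    Assumes the group contains at least one actual positive and one
    actual negative; otherwise a rate would be undefined.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "selection_rate": (tp + fp) / len(y_true),
        "tpr": tp / (tp + fn),
        "fpr": fp / (fp + tn),
    }

# Hypothetical data for a reference group and a comparison group.
reference = group_rates([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 1, 0, 0, 0])
comparison = group_rates([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 0])

statistical_parity_diff = comparison["selection_rate"] - reference["selection_rate"]
disparate_impact = comparison["selection_rate"] / reference["selection_rate"]
equal_opportunity_diff = comparison["tpr"] - reference["tpr"]
# Equalized odds summary (assumed convention): the larger of the two gaps.
equalized_odds_gap = max(
    abs(comparison["tpr"] - reference["tpr"]),
    abs(comparison["fpr"] - reference["fpr"]),
)
```

With this data the comparison group is selected half as often as the reference group, so the disparate impact ratio is 0.5 and the statistical parity difference is -0.25.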
Predictive Parity Metrics:
- Positive Predictive Value (PPV) Parity: Precision should be equal across groups.
Ensures that positive predictions are equally reliable for all groups.
- Negative Predictive Value (NPV) Parity: Negative predictions should be equally
reliable across groups.
- False Positive Rate Difference: Measures disparity in false alarm rates. Important
for avoiding unfair false accusations.
- False Negative Rate Difference: Measures disparity in missed opportunities. Important
for ensuring qualified candidates aren't overlooked.
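The four quantities in this list come straight from each group's confusion matrix. A minimal sketch follows, with hypothetical counts and a hypothetical helper name. It assumes every denominator is non-zero.

```python
def predictive_rates(tp, fp, fn, tn):
    """PPV, NPV, FPR, and FNR from one group's confusion-matrix counts."""
    return {
        "ppv": tp / (tp + fp),  # share of positive predictions that are correct
        "npv": tn / (tn + fn),  # share of negative predictions that are correct
        "fpr": fp / (fp + tn),  # share of actual negatives flagged positive
        "fnr": fn / (fn + tp),  # share of actual positives missed
    }

# Hypothetical confusion-matrix counts for two groups.
group_a = predictive_rates(tp=30, fp=10, fn=10, tn=50)
group_b = predictive_rates(tp=20, fp=20, fn=5, tn=55)

# Signed gap per metric: comparison group minus reference group.
gaps = {name: group_b[name] - group_a[name] for name in group_a}
```

A gap near zero on one metric does not imply a gap near zero on the others; here the FNR gap is small while the PPV gap is large.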
Performance Disparity Metrics:
- Accuracy Difference: Measures overall prediction accuracy disparity across groups.
- Precision Difference: Measures how precision varies across protected groups.
3. Complementary Metrics (DeepBridge Exclusive)
Advanced metrics for nuanced fairness assessment:
- Treatment Equality: Compares the ratio of False Negatives to False Positives across
groups. Ensures errors are distributed equally, not just overall accuracy.
- Entropy Index: Measures individual-level fairness using generalized entropy.
Captures within-group disparities that group-level metrics might miss.
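For orientation, the sketch below shows one widely used formulation of a generalized entropy index for classifiers, in which each individual's "benefit" is the prediction minus the true label plus one. This formulation, the alpha=2 default, and the function name are assumptions; the report does not state how DeepBridge computes its Entropy Index.

```python
def generalized_entropy_index(y_true, y_pred, alpha=2):
    """Generalized entropy of per-individual benefits (alpha not 0 or 1).

    Benefit is 1 for a correct prediction, 2 for a false positive, and
    0 for a false negative. Assumes the mean benefit is non-zero.
    """
    benefits = [p - t + 1 for t, p in zip(y_true, y_pred)]
    n = len(benefits)
    mean = sum(benefits) / n
    total = sum((b / mean) ** alpha - 1 for b in benefits)
    return total / (n * alpha * (alpha - 1))

# Hypothetical labels and predictions: one false positive, one false negative.
y_true = [1, 0, 1, 0]
y_pred = [1, 1, 0, 0]

index = generalized_entropy_index(y_true, y_pred)
perfect = generalized_entropy_index(y_true, y_true)  # all benefits equal
```

When every prediction is correct all benefits are equal and the index is 0; unevenly distributed errors push it higher.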
How to Interpret Results
Thresholds and Severity Levels
Severity Classification:
- ✓ Pass: Metric values indicate fair treatment across groups.
- ⚠ Warning: Moderate disparities detected. Monitor and investigate further.
- ✗ Critical: Significant disparities that likely violate
fairness standards and may have legal implications.
Common Thresholds
| Metric | Pass (✓) | Warning (⚠) | Critical (✗) |
|---|---|---|---|
| Disparate Impact Ratio | ≥ 0.8 | 0.7 - 0.8 | < 0.7 |
| Statistical Parity Difference | abs(diff) ≤ 0.1 | 0.1 < abs(diff) ≤ 0.2 | abs(diff) > 0.2 |
| Equal Opportunity Difference | abs(diff) ≤ 0.1 | 0.1 < abs(diff) ≤ 0.15 | abs(diff) > 0.15 |
| Accuracy Difference | abs(diff) ≤ 0.05 | 0.05 < abs(diff) ≤ 0.1 | abs(diff) > 0.1 |
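The difference-based rows of the thresholds table can be applied mechanically. The following is a sketch with hypothetical names. The cutoffs are copied from the table, and the disparate impact ratio is left out because it is a ratio rather than a difference.

```python
# (pass_limit, warning_limit) on the absolute difference, from the table above.
THRESHOLDS = {
    "statistical_parity_difference": (0.1, 0.2),
    "equal_opportunity_difference": (0.1, 0.15),
    "accuracy_difference": (0.05, 0.1),
}

def classify_difference(metric, value):
    """Return 'pass', 'warning', or 'critical' for a signed difference."""
    pass_limit, warning_limit = THRESHOLDS[metric]
    magnitude = abs(value)
    if magnitude <= pass_limit:
        return "pass"
    if magnitude <= warning_limit:
        return "warning"
    return "critical"

# Hypothetical metric values for one protected attribute.
results = {
    "statistical_parity_difference": classify_difference("statistical_parity_difference", -0.15),
    "equal_opportunity_difference": classify_difference("equal_opportunity_difference", 0.16),
    "accuracy_difference": classify_difference("accuracy_difference", 0.05),
}
```

The sign of the difference is ignored, so a gap of -0.15 is treated the same as +0.15.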
Interpretation Steps
- Review Overall Assessment: Check the Fairness Score and overall assessment at the top
of the report.
- Identify Critical Issues: Navigate to the Issues tab to see violations requiring
immediate attention.
- Analyze Disparate Impact: This is your primary legal compliance indicator.
Values below 0.8 require investigation and potential corrective action.
- Examine Group-Specific Metrics: Look at each protected attribute (race, gender, age)
separately to identify which groups are affected.
- Consider Context: Some metrics may conflict (e.g., demographic parity vs. equal opportunity).
Choose metrics aligned with your use case and ethical priorities.
- Review Threshold Analysis: If available, examine how different decision thresholds
affect fairness vs. performance trade-offs.
Recommendations for Addressing Bias
Pre-Training Interventions (Data-Level)
- Re-sampling: Oversample minority groups or undersample majority groups to balance representation.
- Re-weighting: Assign higher weights to underrepresented groups during training.
- Data Augmentation: Generate synthetic samples for minority groups using techniques like SMOTE.
- Feature Engineering: Remove or transform features that encode protected attributes
or their proxies.
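As one possible reading of the re-weighting item above, the sketch below gives each sample a weight inversely proportional to its group's frequency, so that every group contributes the same total weight. The function name and data are hypothetical, and other weighting schemes exist.

```python
from collections import Counter

def balanced_group_weights(groups):
    """Per-sample weights n / (k * count_g), so each group's weights sum to n / k."""
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return [n / (k * counts[g]) for g in groups]

# Hypothetical protected-attribute column: group "b" is underrepresented.
groups = ["a"] * 8 + ["b"] * 2
weights = balanced_group_weights(groups)
```

The resulting list could be supplied wherever a training routine accepts per-sample weights. Samples from the smaller group receive weight 2.5 and samples from the larger group 0.625, while the total weight stays equal to the number of samples.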
In-Training Interventions (Algorithm-Level)
- Adversarial Debiasing: Train models to make predictions that an adversary cannot
use to predict protected attributes.
- Fairness Constraints: Add fairness metrics as constraints or regularization terms
in the loss function.
- Prejudice Remover: Add a regularization term that penalizes models for learning
associations with protected attributes.
Post-Training Interventions (Outcome-Level)
- Threshold Optimization: Use different decision thresholds for different groups
to equalize outcomes (use with caution - may not be legal in all contexts).
- Calibration: Adjust prediction scores to ensure equal calibration across groups.
- Reject Option Classification: For predictions close to the decision boundary, where the
model is least certain, favor the disadvantaged group or defer the decision to human review.
Important Considerations:
- Some interventions may reduce overall model performance. Document trade-offs carefully.
- Group-specific thresholds may be illegal in certain jurisdictions and use cases.
- Always consult legal counsel before implementing fairness interventions in production systems.
- Document all decisions and trade-offs for audit and compliance purposes.
Glossary
Key Terms
- Protected Attribute: A characteristic protected by law from discrimination
(e.g., race, gender, age).
- Adverse Impact: A substantially different rate of selection that works to the
disadvantage of a protected group.
- Selection Rate: The proportion of applicants or candidates selected for a positive outcome.
- True Positive Rate (TPR): Proportion of actual positives correctly identified
(also called Recall or Sensitivity).
- False Positive Rate (FPR): Proportion of actual negatives incorrectly identified
as positive.
- False Negative Rate (FNR): Proportion of actual positives incorrectly identified
as negative.
- Positive Predictive Value (PPV): Proportion of positive predictions that are correct
(also called Precision).
- Negative Predictive Value (NPV): Proportion of negative predictions that are correct.
- Group Fairness: Fairness defined at the group level - requires similar outcomes
for different demographic groups.
- Individual Fairness: Fairness defined at the individual level - requires similar
individuals to receive similar outcomes.
Acronyms
- EEOC: Equal Employment Opportunity Commission
- ADEA: Age Discrimination in Employment Act
- ECOA: Equal Credit Opportunity Act
- BCL: Class Balance (Balance Class Label)
- BCO: Concept Balance (Balance Concept)
- KL: Kullback-Leibler (divergence)
- JS: Jensen-Shannon (divergence)
- TPR: True Positive Rate
- FPR: False Positive Rate
- FNR: False Negative Rate
- PPV: Positive Predictive Value
- NPV: Negative Predictive Value
References & Further Reading
Academic Papers
- Barocas, S., Hardt, M., & Narayanan, A. (2019). Fairness and Machine Learning.
fairmlbook.org
- Hardt, M., Price, E., & Srebro, N. (2016). "Equality of Opportunity in Supervised Learning."
NIPS 2016.
- Chouldechova, A. (2017). "Fair prediction with disparate impact: A study of bias in recidivism
prediction instruments." Big Data, 5(2).
- Feldman, M., et al. (2015). "Certifying and removing disparate impact." KDD 2015.
Legal & Regulatory Resources
- EEOC Uniform Guidelines on Employee Selection Procedures (1978)
- Title VII of the Civil Rights Act of 1964
- Age Discrimination in Employment Act (ADEA) of 1967
- Equal Credit Opportunity Act (ECOA) of 1974
- EU General Data Protection Regulation (GDPR) - Articles on Automated Decision-Making
Technical Resources
- AI Fairness 360 (IBM) - Open source toolkit: aif360.mybluemix.net
- Fairlearn (Microsoft) - Open source toolkit: fairlearn.org
- Google's What-If Tool - Model fairness visualization
- DeepBridge Documentation - Advanced fairness testing features