Performance Analysis Report
| Type: | {{ model_type|default('Unknown') }} |
| Features: | {{ features|default([], true)|length }} |
| Primary Metric: | {{ metric|default('Score')|upper }} |
| Critical Features: | {{ feature_subset|default([], true)|length }} |
| Alternative Models: | {{ report_data.alternative_models|default({}, true)|length }} |
| Perturbation Levels: | {% if report_data and report_data.raw and report_data.raw.by_level %}{{ report_data.raw.by_level|length }}{% else %}0{% endif %} |
| Iterations Per Level: | {{ iterations|default(10) }} |
| Max Impact Level: | {% set max_level = {'level': '0', 'impact': 0} %} {% if report_data and report_data.raw and report_data.raw.by_level %} {% for level, level_data in report_data.raw.by_level.items() %} {% if level_data.overall_result and level_data.overall_result.all_features and level_data.overall_result.all_features.impact and level_data.overall_result.all_features.impact > max_level.impact %} {% set _ = max_level.update({'level': level, 'impact': level_data.overall_result.all_features.impact}) %} {% endif %} {% endfor %} {% endif %} {{ max_level.impact|default(0)|safe_multiply(100)|safe_round(2) }}% at {{ max_level.level|default('N/A') }} |
| Feature Subset Impact: | {% if report_data and report_data.feature_subset_max_impact and report_data.feature_subset_max_impact.value > 0 %} {{ report_data.feature_subset_max_impact.value|default(0)|safe_multiply(100)|safe_round(2) }}% at {{ report_data.feature_subset_max_impact.level|default('N/A') }} {% else %} {% set max_subset_impact = {'level': '0', 'impact': 0} %} {% if report_data and report_data.raw and report_data.raw.by_level %} {% for level, level_data in report_data.raw.by_level.items() %} {% if level_data.overall_result and level_data.overall_result.feature_subset and level_data.overall_result.feature_subset.impact and level_data.overall_result.feature_subset.impact > max_subset_impact.impact %} {% set _ = max_subset_impact.update({'level': level, 'impact': level_data.overall_result.feature_subset.impact}) %} {% endif %} {% endfor %} {% endif %} {% if max_subset_impact.impact > 0 %} {{ max_subset_impact.impact|default(0)|safe_multiply(100)|safe_round(2) }}% at {{ max_subset_impact.level|default('N/A') }} {% else %} 0.00% {% endif %} {% endif %} |
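The two "max impact" rows above scan every perturbation level and keep the largest positive impact. As a rough illustration, the same scan could be done in Python before rendering, so the template only prints the result. This is a minimal sketch, not the report generator's actual code: the function name `max_impact_level` and the sample data are hypothetical, and the dictionary layout is assumed to mirror `report_data.raw.by_level`.

```python
def max_impact_level(by_level, group="all_features"):
    """Return (level, impact) for the largest positive impact.

    Falls back to ("0", 0) when no level reports a positive impact,
    matching the initial value used by the template rows.
    """
    best_level, best_impact = "0", 0
    for level, level_data in (by_level or {}).items():
        overall = (level_data or {}).get("overall_result") or {}
        impact = (overall.get(group) or {}).get("impact")
        # Missing or zero impacts are skipped, as in the template condition.
        if impact and impact > best_impact:
            best_level, best_impact = level, impact
    return best_level, best_impact


# Hypothetical sample shaped like report_data.raw.by_level
low = {"all_features": {"impact": 0.02}, "feature_subset": {"impact": 0.01}}
high = {"all_features": {"impact": 0.05}}
sample = {"0.1": {"overall_result": low}, "0.2": {"overall_result": high}}

level, impact = max_impact_level(sample)
print(f"{impact * 100:.2f}% at {level}")
```

Passing `group="feature_subset"` gives the value for the Feature Subset Impact row under the same assumptions.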
| Generation Time | {{ timestamp }} |
|---|---|
| Feature Subset | {{ feature_subset_display }} |
| Metric | {{ metric }} |
| Report Type | Static (non-interactive) |
| Model | Base {{ metric|capitalize }} | Robustness Score | Avg. Impact | {% if report_data.metrics_details %} {% for metric_name in report_data.metrics_details|sort %}{{ metric_name|title }} | {% endfor %} {% else %} {% set primary_metrics = report_data.metrics %} {% if primary_metrics %} {% for metric_name, metric_value in primary_metrics.items() %} {% if metric_name != "base_score" and metric_name != "robustness_score" %}{{ metric_name|title }} | {% endif %} {% endfor %} {% endif %} {% endif %}
|---|---|---|---|---|---|
| {{ model_name }} | {{ "%.4f"|format(base_score) }} | {{ "%.4f"|format(robustness_score) }} | {{ "%.4f"|format(raw_impact) }} | {% if report_data.metrics_details %} {% for metric_name in report_data.metrics_details|sort %}{{ "%.4f"|format(report_data.metrics_details[metric_name]|default(0)) }} | {% endfor %} {% else %} {% set primary_metrics = report_data.metrics %} {% if primary_metrics %} {% for metric_name, metric_value in primary_metrics.items() %} {% if metric_name != "base_score" and metric_name != "robustness_score" %}{{ "%.4f"|format(metric_value|default(0)) }} | {% endif %} {% endfor %} {% endif %} {% endif %}
| {{ model_name }} | {{ "%.4f"|format(model_data.base_score|default(0)) }} | {{ "%.4f"|format(model_data.get('robustness_score', 1.0 - model_data.get('raw_impact', 0))) }} | {{ "%.4f"|format(model_data.get('raw_impact', 0)) }} | {% if report_data.metrics_details %} {% for metric_name in report_data.metrics_details|sort %}{% if model_data.metrics_details and metric_name in model_data.metrics_details %} {{ "%.4f"|format(model_data.metrics_details[metric_name]|default(0)) }} {% elif model_data.metrics and metric_name in model_data.metrics %} {{ "%.4f"|format(model_data.metrics[metric_name]|default(0)) }} {% else %} - {% endif %} | {% endfor %} {% else %} {% set primary_metrics = report_data.metrics %} {% if primary_metrics %} {% for metric_name, metric_value in primary_metrics.items() %} {% if metric_name != "base_score" and metric_name != "robustness_score" %}{% if model_data.metrics and metric_name in model_data.metrics %} {{ "%.4f"|format(model_data.metrics[metric_name]|default(0)) }} {% else %} - {% endif %} | {% endif %} {% endfor %} {% endif %} {% endif %}
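For alternative models, the row above falls back to `1.0 - raw_impact` when no `robustness_score` is stored. The sketch below restates that fallback in Python, assuming `model_data` is a plain dictionary; the helper name `robustness_score_for` is hypothetical.

```python
def robustness_score_for(model_data):
    """Use the stored robustness score, else derive it as 1 - raw_impact."""
    raw_impact = model_data.get("raw_impact", 0)
    return model_data.get("robustness_score", 1.0 - raw_impact)


# Hypothetical alternative-model entries
stored = {"robustness_score": 0.9, "raw_impact": 0.5}
derived = {"raw_impact": 0.25}

for name, data in (("stored", stored), ("derived", derived)):
    print(name, f"{robustness_score_for(data):.4f}")
```

Note that a stored score wins even if it disagrees with `raw_impact`, which is what the template expression does as well.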
Shows the model's average performance at different perturbation levels. The red dotted line represents the base score without perturbations.
{% if charts.overview_chart %}{# overview chart #}{% endif %}
Compares the impact of perturbing all features with the impact of perturbing only the subset of critical features.
Visualizes the worst-case performance at each perturbation level, showing the most adverse scenarios for the model.
{% if charts.worst_performance_chart %}{# worst-case performance chart #}{% endif %}
Compares the performance of different models across perturbation levels, helping to identify which model is more robust.
{% if charts.comparison_chart %}{# model comparison chart #}{% endif %}
Compares the feature importance declared by the model with the importance based on robustness analysis, highlighting discrepancies.
{% if charts.feature_comparison_chart %}{# feature comparison chart #}{% endif %}
| Feature | Robustness Impact | {% if has_model_feature_importance %}Model Importance | Difference | {% endif %}
|---|---|{% if has_model_feature_importance %}---|---|{% endif %}
| {{ feature }} | {{ "%.4f"|format(importance) }} | {% if has_model_feature_importance %}{{ "%.4f"|format(model_feature_importance.get(feature, 0)|float) }} | {% if model_feature_importance.get(feature) is not none %} {% set diff = (importance|float - model_feature_importance.get(feature, 0)|float) %}{{ "%.4f"|format((diff|abs_value)) }} | {% else %}N/A | {% endif %} {% endif %}
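The Difference column is the absolute gap between the robustness-based importance and the model's own importance, or N/A when the model reports nothing for that feature. A minimal Python sketch of the same row logic follows; the function name `importance_rows` and the sample values are hypothetical.

```python
def importance_rows(robustness_importance, model_importance):
    """Build (feature, robustness, model, difference) rows as formatted strings."""
    rows = []
    for feature, importance in robustness_importance.items():
        model_value = model_importance.get(feature)
        if model_value is None:
            # No model importance available: show 0 and mark the gap as N/A.
            model_text, diff_text = f"{0.0:.4f}", "N/A"
        else:
            model_text = f"{float(model_value):.4f}"
            diff_text = f"{abs(float(importance) - float(model_value)):.4f}"
        rows.append((feature, f"{float(importance):.4f}", model_text, diff_text))
    return rows


# Hypothetical importances
rows = importance_rows({"age": 0.30, "income": 0.10}, {"age": 0.25})
for row in rows:
    print(" | ".join(row))
```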
Shows the impact on model performance when each feature is perturbed individually. Red bars indicate performance degradation (feature is important for robustness), while green bars indicate performance improvement or no impact.
Compares the effectiveness of raw (Gaussian noise) vs quantile-based perturbation methods. The filled area highlights performance differences between methods, helping identify which perturbation approach is more suitable for this model.
Compares the robustness impact when perturbing all features versus only a selected subset. This analysis helps identify whether focusing on specific features provides similar insights with reduced testing time, and shows the relative sensitivity of the selected features.
Summarizes the distribution of performance scores at each perturbation level. Annotations give the mean (μ), standard deviation (σ), sample count (n), and coverage percentage (cov) for each level. Gradient colors and statistical overlays make shifts in the distribution across levels easier to see.
Matrix of performance distributions across multiple models and perturbation levels. Each cell combines a violin plot (density distribution) with a boxplot (quartile statistics). Annotations give the mean (μ), standard deviation (σ), and sample count (n), and baseline references allow direct comparison across models and perturbation levels.
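The μ, σ, and n annotations can be reproduced from the per-iteration scores of one perturbation level. The sketch below uses only the Python standard library. It assumes the sample standard deviation is intended (the report does not say which variant it uses) and leaves out the coverage value, whose definition is not given here. The name `summarize` and the scores are hypothetical.

```python
import statistics


def summarize(scores):
    """Return mean, sample standard deviation, and count for one level."""
    n = len(scores)
    mean = statistics.fmean(scores)
    # stdev needs at least two points; report 0.0 for a single score.
    std = statistics.stdev(scores) if n > 1 else 0.0
    return {"mean": mean, "std": std, "n": n}


# Hypothetical scores from three iterations at one perturbation level
summary = summarize([0.8, 0.9, 1.0])
print(f"mean={summary['mean']:.4f} std={summary['std']:.4f} n={summary['n']}")
```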
Shows the complete distribution of performance scores using violin plots (density), boxplots (quartiles), and individual points. The red diamond indicates the base score.
{% if charts.boxplot_chart %}{# boxplot chart #}{% endif %}
| Perturbation Level | Average {{ metric|capitalize }} | Worst {{ metric|capitalize }} | Impact |
|---|---|---|---|
| {{ level }} | {{ "%.4f"|format(level_data.overall_result.all_features.mean_score) }} | {{ "%.4f"|format(level_data.overall_result.all_features.worst_score) }} | {{ "%.4f"|format(base_score - level_data.overall_result.all_features.mean_score) }} |
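In the table above, the Impact column is the base score minus the mean score at that level, so a positive value means performance dropped. A small Python sketch of the same calculation follows, assuming the nested layout used by the template row; `level_rows` and the sample values are hypothetical.

```python
def level_rows(base_score, by_level):
    """Compute mean, worst, and impact (base minus mean) per perturbation level."""
    rows = []
    for level, level_data in by_level.items():
        stats = level_data["overall_result"]["all_features"]
        rows.append(
            {
                "level": level,
                "mean": stats["mean_score"],
                "worst": stats["worst_score"],
                "impact": base_score - stats["mean_score"],
            }
        )
    return rows


# Hypothetical data for a single perturbation level
stats_01 = {"mean_score": 0.85, "worst_score": 0.80}
table = level_rows(0.90, {"0.1": {"overall_result": {"all_features": stats_01}}})
for row in table:
    print(f"{row['level']} | {row['mean']:.4f} | {row['worst']:.4f} | {row['impact']:.4f}")
```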
| Model | Base {{ metric|capitalize }} | Robustness Score | Average Impact | Model Type |
|---|---|---|---|---|
| {{ model_name }} | {{ "%.4f"|format(base_score) }} | {{ "%.4f"|format(robustness_score) }} | {{ "%.4f"|format(raw_impact) }} | {{ model_type }} |
| {{ model_name }} | {{ "%.4f"|format(model_data.base_score|default(0)) }} | {{ "%.4f"|format(model_data.get('robustness_score', 1.0 - model_data.get('raw_impact', 0))) }} | {{ "%.4f"|format(model_data.get('raw_impact', 0)) }} | {{ model_data.get('model_type', 'Unknown') }} |