🔬 batch-detective Report

Generated: {{ timestamp }} UTC  |  Tool v{{ version }}  |  {{ n_samples }} samples

{% if anonymized %}

โš ๏ธ Sample IDs have been anonymized.

{% endif %}

Executive Summary

{{ executive_summary | safe }}

Next Steps

{{ next_steps | safe }}

QC Flags

{% if qc_flagged_samples %}

Flagged samples ({{ qc_flagged_samples | length }}):

Sample | Library Outlier | Dominant Gene | Dom. Gene % | Zero Count % | Mahal. Outlier
{% for row in qc_flagged_samples %}
{{ row.sample_id }} | {{ '⚠️' if row.lib_outlier_flag else '✓' }} | {{ row.dominant_gene }} | {{ row.dominant_gene_pct | round(1) }}% | {{ row.zero_count_pct | round(1) }}% | {{ '⚠️' if row.mahal_outlier_flag else '✓' }}
{% endfor %}
{% else %}

✅ No samples flagged for QC issues.

{% endif %}

Data Overview

Parameter | Value
Samples analyzed | {{ n_samples }}
Genes in input | {{ n_genes_input }}
Genes analyzed (after filtering) | {{ n_genes_analyzed }}
Normalization | {{ normalization_method }}
n_variable_genes | {{ n_variable_genes }}
n_pcs | {{ n_pcs }}
min_cpm | {{ min_cpm }}
Covariates tested | {{ covariates_tested | join(', ') }}
{% if covariates_skipped %}
Covariates skipped | {{ covariates_skipped | join(', ') }}
{% endif %}
{% if condition_on %}
Conditioned on | {{ condition_on | join(', ') }}
{% endif %}

Library Size Distribution

{% if plots.library_sizes %} Library sizes {% endif %}

PCA Scree Plot

{% if plots.scree_plot %} Scree plot {% endif %}

ICC per Covariate (Headline Metric)

ICC values are computed using a one-way random effects model [ICC(1,1)]. This model treats batch labels as randomly sampled from a population of batches. If your batches are fixed (specific dates, specific laboratories), ICC values may slightly underestimate batch effect magnitude.
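For readers who want to check a single value by hand, the following is a minimal sketch of a one-way random effects ICC(1,1) computed from ANOVA mean squares. It is an illustration under stated assumptions, not necessarily how this tool computes it: the function name `icc_1_1` and its arguments are hypothetical, the input is assumed to be one gene's normalized expression values with one batch label per sample, and unequal group sizes are handled with the usual adjusted group size k0.

```python
from collections import defaultdict


def icc_1_1(values, groups):
    """One-way random effects ICC(1,1) for one gene (illustrative sketch).

    values: expression values, one per sample.
    groups: batch labels, one per sample, same order as values.
    """
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(float(value))

    a = len(by_group)                                   # number of groups
    n_total = sum(len(vs) for vs in by_group.values())  # number of samples
    if a < 2 or n_total <= a:
        raise ValueError("need at least 2 groups and more samples than groups")

    grand_mean = sum(sum(vs) for vs in by_group.values()) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for vs in by_group.values():
        group_mean = sum(vs) / len(vs)
        ss_between += len(vs) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in vs)

    ms_between = ss_between / (a - 1)
    ms_within = ss_within / (n_total - a)

    # Adjusted group size; equals the common group size when groups are balanced.
    sum_sq_sizes = sum(len(vs) ** 2 for vs in by_group.values())
    k0 = (n_total - sum_sq_sizes / n_total) / (a - 1)

    denom = ms_between + (k0 - 1) * ms_within
    if denom == 0:
        return float("nan")  # all values identical: ICC is undefined
    return (ms_between - ms_within) / denom
```

With this estimator, values that differ only between batches give an ICC of 1, and negative estimates are possible when within-batch variance exceeds between-batch variance.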

{% if plots.icc_barplot %} ICC barplot {% endif %}
{% if icc_table_rows %}
Covariate | Label | Groups | Median ICC | IQR | 95% CI | Tier | % Genes ≥ Moderate
{% for row in icc_table_rows %}
{{ row.covariate }} | {{ row.label }} | {{ row.n_groups }} {% if low_power %}⚠️{% endif %} | {{ row.median_icc | round(3) }} | {{ row.iqr_lower | round(3) }} – {{ row.iqr_upper | round(3) }} | {{ row.ci_lower_95 | round(3) }} – {{ row.ci_upper_95 | round(3) }} | {{ row.icc_tier }} | {{ (row.prop_genes_moderate_plus * 100) | round(1) }}%
{% endfor %}
{% else %}

No ICC data computed (no categorical covariates with sufficient samples).

{% endif %}

Interpretation (Koo & Li, 2016): <0.10 negligible | 0.10–0.30 mild | 0.30–0.60 moderate | >0.60 strong
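The tier cut-offs above can be written as a small lookup. This is only a sketch: the function name `icc_tier` is hypothetical, and the handling of values that fall exactly on a boundary (0.10 and 0.30 go to the higher tier, 0.60 stays moderate) is an assumption, since the interpretation line does not specify it.

```python
def icc_tier(icc):
    """Map a median ICC to a tier label (boundary handling is an assumption)."""
    if icc < 0.10:
        return "negligible"
    if icc < 0.30:
        return "mild"
    if icc <= 0.60:
        return "moderate"
    return "strong"
```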

{% if top_genes_data %}

Top Batch-Associated Genes

{% for cov_name, genes_rows in top_genes_data.items() %}

{{ cov_name }}

These are the genes most strongly associated with {{ cov_name }}. If these are housekeeping genes (GAPDH, ACTB, RPL*, RPS*), the batch effect may primarily affect highly expressed genes.

Gene | ICC | Mean Expr (log CPM) | Expression Rank | Housekeeping
{% for row in genes_rows %}
{{ row.gene_id }} | {{ row.gene_icc | round(3) }} | {{ row.mean_expression_log_cpm | round(2) }} | {{ row.expression_rank_by_mean }} | {{ '✓' if row.is_housekeeping else '' }}
{% endfor %}
{% if plots['gene_icc_scatter_' + cov_name] is defined %} Gene ICC scatter {% endif %}
{% endfor %}
{% endif %}

PC-Metadata Association Heatmap

{% if plots.association_heatmap %} Association heatmap {% endif %}

Association Table

Note: FDR correction is applied within each covariate independently. With {{ n_covariates_tested }} covariates tested, the probability of at least one false positive increases. Interpret borderline results (q=0.03–0.10) in context of effect sizes (ICC, η²). {% if condition_on %} Results condition on: {{ condition_on | join(', ') }}. {% endif %}
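To make the per-covariate correction concrete, here is a minimal sketch of Benjamini-Hochberg adjustment applied separately to each covariate's p-values. The function names and the dictionary input layout are hypothetical and chosen for illustration; this is not necessarily the code path the tool uses.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all larger ranks (keeps q monotone).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def bh_within_covariate(pvals_by_covariate):
    """Apply BH separately to each covariate's list of per-PC p-values."""
    return {cov: benjamini_hochberg(ps) for cov, ps in pvals_by_covariate.items()}
```

Because each covariate is corrected on its own, the number of tests m is the number of PCs tested for that covariate, not the total across covariates, which is why borderline q-values deserve extra caution when many covariates are tested.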

{% if assoc_table_rows %}
Covariate | Type | PC | PC Var% | Effect Size | Type | p-value | q-value (BH) | Significant q<0.05 | 95% CI
{% for row in assoc_table_rows %}
{{ row.covariate }} | {{ row.covariate_type }} | PC{{ row.pc }} | {{ (row.pc_variance_explained * 100) | round(1) }}% | {% if low_power %}⚠️{% endif %} {{ row.effect_size | round(3) if row.effect_size is not none else '—' }} | {{ row.effect_size_type }} | {{ row.pval_raw | round(4) if row.pval_raw is not none else '—' }} | {{ row.pval_adjusted_bh | round(4) if row.pval_adjusted_bh is not none else '—' }} | {{ '✓' if row.significant_q05 else '' }} | {{ '[' + (row.ci_lower|round(3)|string) + ', ' + (row.ci_upper|round(3)|string) + ']' if row.ci_lower is not none else '—' }}
{% endfor %}
{% else %}

No association results computed.

{% endif %}

PCA Scatter Plots

{% for key, b64 in plots.items() %} {% if key.startswith('pca_') %}

{{ key.replace('_', ' ') }}

{{ key }}
{% endif %} {% endfor %}

Sample Distance Heatmap

{% if plots.sample_distance_heatmap %} Distance heatmap {% endif %}

Outlier Report

{% if outlier_rows %}

{{ outlier_rows | length }} sample(s) flagged as potential outliers:

Sample | Detection Method | Mahalanobis Distance{% for col in outlier_meta_cols %} | {{ col }}{% endfor %}
{% for row in outlier_rows %}
{{ row.sample_id }} | {{ row.detection_method }} | {{ row.mahal_distance | round(2) if row.mahal_distance == row.mahal_distance else '—' }}{% for col in outlier_meta_cols %} | {{ row[col] if col in row else '—' }}{% endfor %}
{% endfor %}
{% else %}

✅ No outlier samples detected.

{% endif %}

Statistical Methods & Limitations

{{ methods_text | safe }}