Refactor group analysis CLI: replace contrast_column/values with formula-based model + pairwise contrasts#147
Draft
…l/pairwise/within
- Remove --contrast_column/--contrast_values CLI args from snakebids.yml
- Add --model (statsmodels formula), --pairwise (factor for pairwise comparisons),
and --within (strata factors) CLI args to snakebids.yml
- Update Snakefile: remove old validation, add pairwise contrast label generation
by reading participants.tsv at planning time; update wildcard constraints;
update all_group_stats and all_group_stats_coloc target rules to use
pairwise_contrast wildcard and remove old contrast-filtered outputs
- Update groupstats.smk: add contrast={pairwise_contrast} entity to
perform_group_stats, create_stats_heatmap, and map_groupstats_to_template_nii
rules; remove concat_subj_parquet_contrast, group_counts_per_voxel_contrast,
group_coloc_counts_per_voxel_contrast, concat_subj_segstats_contrast, and
map_groupavg_segstats_to_template_nii rules
- Rewrite perform_group_stats.py: fit single global OLS model per region/metric
using statsmodels, compute pairwise contrast t-stat/pval/cohensd via model
covariance matrix with marginal mean predictions at specified strata values
- Delete concat_subj_parquet_contrast.py and concat_subj_segstats_contrast.py
Agent-Logs-Url: https://github.com/khanlab/SPIMquant/sessions/5c9691e9-6355-4c0d-85e1-aa07f5c127e3
Co-authored-by: akhanf <11492701+akhanf@users.noreply.github.com>
… log exceptions
Agent-Logs-Url: https://github.com/khanlab/SPIMquant/sessions/5c9691e9-6355-4c0d-85e1-aa07f5c127e3
Co-authored-by: akhanf <11492701+akhanf@users.noreply.github.com>
… function
Agent-Logs-Url: https://github.com/khanlab/SPIMquant/sessions/5c9691e9-6355-4c0d-85e1-aa07f5c127e3
Co-authored-by: akhanf <11492701+akhanf@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Refactor group analysis CLI to support arbitrary stats models" to "Refactor group analysis CLI: replace contrast_column/values with formula-based model + pairwise contrasts" on Apr 5, 2026
The old group-level CLI was limited to two-group t-tests via --contrast_column/--contrast_values. This replaces it with a statsmodels formula-based approach supporting arbitrary models, multiple pairwise factors, and stratified contrasts.

New CLI

    pixi run spimquant /bids /out group \
      --model "metric ~ C(treatment) * C(genotype) * C(sex) + age" \
      --pairwise treatment \
      --within genotype sex \
      --cores all

| Old | New |
|---|---|
| --contrast_column treatment | --model "metric ~ C(treatment) + age" |
| --contrast_values control drug | --pairwise treatment |
| | --within genotype sex |

Statistical approach
- --within strata fix specific factor values for marginal mean evaluation; the model is still fit on all data

Output naming
Contrast identity is encoded in a contrast BIDS entity.

Pairwise contrast labels are enumerated at Snakemake planning time by reading participants.tsv, so all outputs are statically deterministic.

Key changes
- snakebids.yml: Replaced --contrast_column/--contrast_values with --model, --pairwise, --within
- Snakefile: Reads participants.tsv at DAG-planning time to enumerate contrast labels; updated wildcard_constraints and the all_group_stats/all_group_stats_coloc target rules
- groupstats.smk: Added contrast={pairwise_contrast} entity to perform_group_stats, create_stats_heatmap, map_groupstats_to_template_nii; removed concat_subj_parquet_contrast, group_counts_per_voxel_contrast, group_coloc_counts_per_voxel_contrast, concat_subj_segstats_contrast, map_groupavg_segstats_to_template_nii
- perform_group_stats.py: Rewritten to use statsmodels OLS with a patsy design matrix for the contrast SE; replaces the direct scipy.stats.ttest_ind call
- Deleted: concat_subj_parquet_contrast.py, concat_subj_segstats_contrast.py

Breaking change: the old --contrast_column/--contrast_values arguments are removed entirely.
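For reviewers, a rough sketch of how pairwise contrast labels could be enumerated from participants.tsv at planning time, as described under Output naming. The function name `enumerate_contrast_labels` and the label format (factor names and levels concatenated, with "vs" between the two compared levels) are hypothetical; the actual label scheme used by the Snakefile is not shown in this description.

```python
import io
from itertools import combinations, product

import pandas as pd


def enumerate_contrast_labels(participants, pairwise, within=()):
    """One label per (stratum, unordered level pair).

    Hypothetical helper and label format, for illustration only.
    """
    levels = sorted(participants[pairwise].dropna().astype(str).unique())
    pairs = list(combinations(levels, 2))
    strata_levels = [
        sorted(participants[f].dropna().astype(str).unique()) for f in within
    ]
    labels = []
    # product() over an empty list yields a single empty stratum,
    # so omitting `within` gives one label per level pair.
    for stratum in product(*strata_levels):
        prefix = "".join(f"{f}{v}" for f, v in zip(within, stratum))
        for a, b in pairs:
            labels.append(f"{prefix}{pairwise}{a}vs{b}")
    return labels


# Stand-in for a participants.tsv file (tab-separated, one row per subject).
tsv = io.StringIO(
    "participant_id\ttreatment\tgenotype\tsex\n"
    "sub-01\tcontrol\twt\tF\n"
    "sub-02\tdrug\twt\tM\n"
    "sub-03\tcontrol\tko\tM\n"
    "sub-04\tdrug\tko\tF\n"
)
participants = pd.read_csv(tsv, sep="\t")

labels = enumerate_contrast_labels(
    participants, "treatment", within=["genotype", "sex"]
)
for label in labels:
    print(label)
```

Because the list depends only on the contents of participants.tsv, it can be computed before the DAG is built and used both in a wildcard constraint and in the target rules. Note that this sketch enumerates every combination of strata levels, including cells that contain no subjects.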