Add autotrack — autonomous MOT tracker optimization loop [rebase&merge]#346
Draft
- experiments/program.md: autoresearch contract covering the research question, HOTA≥60 target, hard boundaries, and 7 research starting points (Kalman P/R init, two-threshold association, velocity attenuation, etc.)
- experiments/optimize_tracking.py: Optuna-based metric runner; n_trials=1 evaluates defaults; multi-core via multiprocessing+SQLite; agent updates search space as architecture evolves
- experiments/README.md: motivation, approach, target analysis (HOTA ceiling derivation), pre-flight checks, references
- pyproject.toml: add `optimize` dependency group (optuna[rdb], fire)
--- Co-authored-by: Claude Code <noreply@anthropic.com>
autotrack/optimize_tracking.py
- --det-tag TAG CLI arg: overrides the directory suffix for any custom detector without touching _DET_SOURCE_TO_TAG; _validate_args and _resolve_sequences both accept it
- Multiprocessing progress bar: replaced pool.starmap with starmap_async + a polling loop that loads the SQLite study every 2 s and feeds a Rich Progress bar showing completed trials and live best HOTA (mirrors the existing single-worker callback approach)
- Module docstring updated with --det-tag usage example

autotrack/README.md
- Fixed cd experiments → cd autotrack; old --tracker sort --fast → positional syntax
- YOLO section replaced with YOLOX section (correct weights filename)
- RF-DETR section added as a standalone step
- New Custom detections section: dir layout, MOT format, --det-tag usage
- Pre-flight checks table updated (removed API key row, fixed commands)
- Fixed /optimize campaign experiments/ → autotrack/
- Fixed broken Files table row for optimize_tracking.py

autotrack/program.md
- generate_detections.py added to scope_files
- Weights filename corrected (yolox_x.pth → bytetrack_x_mot17.pth.tar)
- RF-DETR and custom detector quickstart notes added below pre-flight table
--- Co-authored-by: Claude Code <noreply@anthropic.com>
- generate_detections.py: remove YOLOX backend (loader, predictor, frame processing); add YOLO-World via inference-models with center→top-left coord conversion; rename rfdetr-l → rfdetr/l to match yolo_world/l slash notation
- optimize_tracking.py: swap yolox→yoloworld in _DET_SOURCE_TO_TAG; extract _run_parallel_study; fix multiline ternaries to if/else; use setattr() for dynamic Kalman attrs (mypy); pass >3 args as kwargs
- best_config.json: drop broken yolox entry (HOTA=7.7); add real Optuna results for yoloworld, rfdetr, dpm across all three trackers
- pyproject.toml: remove YOLOX git source + no-build-isolation; add inference-models>=0.19.0
--- Co-authored-by: Claude Code <noreply@anthropic.com>
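The center→top-left conversion mentioned in this commit can be sketched as follows. This is a minimal illustration, not the code in `generate_detections.py`: the function names are hypothetical, and the row layout assumes the standard 10-column MOTChallenge `det.txt` format with `-1` for the ID and the three unused world coordinates.

```python
def center_to_top_left(cx, cy, w, h):
    """Convert a center-format box (cx, cy, w, h) to top-left format (x, y, w, h)."""
    return (cx - w / 2.0, cy - h / 2.0, w, h)


def to_mot_det_row(frame, box_cxcywh, conf):
    """Format one detection as a MOT-style det.txt row (hypothetical helper)."""
    x, y, w, h = center_to_top_left(*box_cxcywh)
    # Columns: frame, id, bb_left, bb_top, bb_width, bb_height, conf, x, y, z
    return f"{frame},-1,{x:.2f},{y:.2f},{w:.2f},{h:.2f},{conf:.4f},-1,-1,-1"


print(to_mot_det_row(1, (50, 40, 20, 10), 0.9))
```

Skipping this conversion shifts every box by half its width and height, which may be enough to collapse IoU-based association.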
- search_space.json: expand 16 boundary-hugging parameters across all three trackers (lost_track_buffer, track_activation_threshold, minimum_iou_threshold, high_conf_det_threshold, q_scale/r_scale/p_scale, velocity_decay, q_miss_alpha, max_interpolation_gap, p_reset_threshold, direction_consistency_weight); add log=true to lost_track_buffer (all trackers) and minimum_iou_threshold (all trackers)
- optimize_tracking.py: pass log= to suggest_int so log-scale int parameters are respected
- best_config.json: bytetrack/rfdetr updated to HOTA 45.08 from new run
- uv.lock: regenerated after yolox removal
--- Co-authored-by: Claude Code <noreply@anthropic.com>
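A sketch of how a `log` flag from a search-space entry might be forwarded to Optuna follows. The spec keys (`type`, `low`, `high`, `log`) are assumptions about the shape of `search_space.json`, and `suggest_param` is a hypothetical helper; only `Trial.suggest_int` and `Trial.suggest_float` with their `log` keyword are real Optuna API.

```python
def suggest_param(trial, name, spec):
    """Draw one parameter from an Optuna trial according to a JSON-style spec.

    `spec` is assumed to look like {"type": "int", "low": 10, "high": 120, "log": true}.
    """
    log = bool(spec.get("log", False))
    if spec["type"] == "int":
        # Optuna requires low >= 1 and the default step of 1 when log=True.
        return trial.suggest_int(name, spec["low"], spec["high"], log=log)
    if spec["type"] == "float":
        return trial.suggest_float(name, spec["low"], spec["high"], log=log)
    raise ValueError(f"unsupported parameter type: {spec['type']!r}")
```

Inside an Optuna objective this would be called once per entry, for example `suggest_param(trial, "lost_track_buffer", space["lost_track_buffer"])`. Dropping the `log=` argument for integers silently falls back to uniform sampling, which is the behaviour this commit corrects.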
…mation (ORU)
- Add oru_enabled parameter to ByteTrackKalmanBoxTracker: on re-detection after occlusion, replay virtual predict+update cycles along linearly interpolated trajectory to re-estimate velocity
- Expose oru_enabled in optimize_tracking.py _build_tracker and _define_search_space
- Add oru_enabled to default_config.json and search_space.json
--- Co-authored-by: Claude Code <noreply@anthropic.com>
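The replay idea described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the `ByteTrackKalmanBoxTracker` implementation: boxes are `(x, y, w, h)` tuples, `predict` and `update` stand in for the filter's own methods, and any state rollback the real code performs before the replay is omitted.

```python
def virtual_observations(last_box, new_box, gap):
    """Linearly interpolate boxes for the frames strictly between two observations.

    gap is the frame distance between them (gap == 1 means consecutive frames,
    so no virtual observations are produced).
    """
    if gap < 1:
        raise ValueError("gap must be at least 1")
    boxes = []
    for k in range(1, gap):
        t = k / gap
        boxes.append(tuple(a + t * (b - a) for a, b in zip(last_box, new_box)))
    return boxes


def replay(predict, update, last_box, new_box, gap):
    """Run one predict+update cycle per virtual box, then one for the real re-detection."""
    for box in virtual_observations(last_box, new_box, gap):
        predict()
        update(box)
    predict()
    update(new_box)


print(virtual_observations((0, 0, 10, 10), (4, 8, 10, 10), 4))
```

Feeding the filter a straight-line path across the gap replaces the velocity it drifted to during occlusion with one consistent with where the object actually reappeared.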
…0.05)
- Add stage2_iou_threshold=0.05 param to ByteTrackTracker; stage-1 keeps minimum_iou_threshold=0.1
- Lower stage-2 threshold recovers more low-confidence detections without breaking high-conf stage
- Expose to Optuna via search_space.json; add to default_config.json and optimize_tracking.py
--- Co-authored-by: OpenAI Codex <codex@openai.com>
…larity
- Add iou_age_weight=0.03: scale stage-1 IoU similarity by 1/(1+w*lost_frames) for each track
- Biases Hungarian assignment toward recently-seen tracks; reduces stale-prediction false matches
- iou_age_weight=0.03 is active at default params; Optuna range [0.0, 0.2] log-scale
--- Co-authored-by: Claude Code <noreply@anthropic.com>
- Apply age discount only to cost matrix (not threshold check): raw IoU used for min-threshold gate, discount only biases solver assignment toward active tracks
- Tighten Optuna search range [0.0, 0.2] -> [0.0, 0.1]
- Fix pre-existing bug: optimize_tracking.py final re-eval now applies _apply_kalman_patch
--- Co-authored-by: Claude Code <noreply@anthropic.com>
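The split between a discounted solver matrix and a raw-IoU gate can be illustrated with a small example. The helper names are hypothetical, the matrices are plain nested lists, and a brute-force search over permutations stands in for the Hungarian solver (usable only for tiny square inputs); the IoU values are made up for illustration.

```python
from itertools import permutations


def discounted_similarity(iou, lost_frames, weight=0.03):
    """Scale each track's IoU row by 1 / (1 + weight * lost_frames[track])."""
    return [[v / (1.0 + weight * lost) for v in row] for row, lost in zip(iou, lost_frames)]


def best_assignment(sim):
    """Maximum-similarity assignment by brute force (stand-in for a Hungarian solver)."""
    n = len(sim)
    best = max(permutations(range(n)), key=lambda p: sum(sim[t][p[t]] for t in range(n)))
    return [(t, best[t]) for t in range(n)]


def gate(raw_iou, matches, min_iou):
    """Keep only matches whose raw, undiscounted IoU clears the threshold."""
    return [(t, d) for t, d in matches if raw_iou[t][d] >= min_iou]


raw = [[0.50, 0.30], [0.55, 0.30]]  # rows: tracks, columns: detections
lost = [0, 10]                      # track 1 has been lost for 10 frames
solver = discounted_similarity(raw, lost, weight=0.03)
print(best_assignment(raw))                        # assignment without the discount
print(gate(raw, best_assignment(solver), 0.25))    # discounted ranking, raw gate
```

In this example the discount flips the assignment toward the active track, while the stale track's second-choice match (raw IoU 0.30, discounted to about 0.23) still passes a 0.25 gate because the gate looks at the raw value.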
Apply Optuna-found parameter values as new defaults: lost_track_buffer 30→62, track_activation_threshold 0.7→0.314, q_scale 0.01→0.00246, r_scale 0.1→0.292, p_scale 1.0→7.34, velocity_decay 0.95→0.817, q_miss_alpha 0.1→0.461, max_interpolation_gap 20→30, p_reset_threshold 5→13; HOTA 56.781→57.424 (+1.13%) --- Co-authored-by: Claude Code <noreply@anthropic.com>
Pull request overview
This PR introduces the new autotrack/ workflow for autonomous + Optuna-based optimization of MOT17 trackers, and updates core tracker internals to support additional post-processing and association/Kalman behaviors that the optimization loop can tune and validate.
Changes:
- Added `autotrack/` tooling: Optuna runner (`optimize_tracking.py`), detection generation (`generate_detections.py`), visualization utilities, and configuration/artifact files (`default_config.json`, `search_space.json`, `best_config.json`, `program.md`).
- Extended ByteTrack and SORT utilities with new association / Kalman mechanics and MOT-gap interpolation.
- Added an `optimize` dependency group and adjusted repo formatting/ignore configs to support the new workflow.
Reviewed changes
Copilot reviewed 17 out of 19 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| trackers/core/sort/utils.py | Adds MOT-format short-gap interpolation helper used by autotrack evaluation output. |
| trackers/core/bytetrack/tracker.py | Adds stage-2 IoU threshold and IoU age discount for stage-1 ranking; updates association gating logic. |
| trackers/core/bytetrack/kalman.py | Adds velocity decay, miss-noise inflation, P-reset, and ORU mechanics to ByteTrack Kalman tracker. |
| README.md | Badge formatting change (single-line). |
| pyproject.toml | Adds optimize dependency group and uv git source for onnx-simplifier. |
| docs/trackers/ocsort.md | Reflowed paragraph formatting. |
| docs/trackers/comparison.md | Reflowed admonition formatting. |
| CODE_OF_CONDUCT.md | Reflowed paragraph formatting. |
| autotrack/visualize_detections.py | New utility to render MOT detections on frames. |
| autotrack/search_space.json | New Optuna parameter search space definitions per tracker. |
| autotrack/README.md | New documentation for the autotrack workflow and benchmarks. |
| autotrack/program.md | New campaign contract/spec for the autonomous optimization loop. |
| autotrack/optimize_tracking.py | New Optuna study runner + evaluation harness using trackers.eval. |
| autotrack/generate_detections.py | New script to generate MOT17 detections via RF-DETR / YOLO-World backends. |
| autotrack/default_config.json | New baseline/default parameter set for --n-trials 1 runs. |
| autotrack/best_config.json | New committed “best known” tuned configs used for warm-starting/guarding. |
| .pre-commit-config.yaml | mdformat configured with --wrap=no (drives markdown reflow behavior). |
| .gitignore | Adjusts ignores (including .python-version) and adds autotrack output/cache patterns. |
07f7488 to
a62024a
Compare
…ecovery
Short occlusions (1-4 frames) are handled well by velocity decay alone; ORU trajectory replay is beneficial only for longer gaps where velocity has drifted. HOTA 57.424→57.813 (+0.686%), IDF1 69.573→70.009
--- Co-authored-by: Claude Code <noreply@anthropic.com>
- bytetrack/sdp Optuna result: 58.753 (was 56.115 before i10-i11)
- New optimal params include oru_threshold=14, q_scale/r_scale/p_scale all ~10x lower
--- Co-authored-by: Claude Code <noreply@anthropic.com>
- q_scale 0.00246→0.000202, r_scale 0.292→0.0441, p_scale 7.34→0.731 (tighter Kalman: trust measurements more)
- oru_threshold 5→14, velocity_decay 0.817→0.774, q_miss_alpha 0.461→0.282
- stage2_iou_threshold 0.05→0.233, lost_track_buffer 62→52, p_reset_threshold 13→26
- HOTA 57.813→58.753 (+1.30%)
--- Co-authored-by: Claude Code <noreply@anthropic.com>
- Confidence boost in Hungarian cost: solver_iou *= (1 + w * conf[det])
- Neutral at all tested defaults (0.0–0.5); added to Optuna search space [0.0, 1.0]
- IDSW improved 297→293 at w=0.3 but HOTA regressed; w=0.1 exactly neutral
--- Co-authored-by: Claude Code <noreply@anthropic.com>
- Mature-track-only stage-2: only tracks with >= N updates participate in low-conf recovery
- Neutral at N=0,1; regresses at N>=2 because ghost exclusion hurts legitimate young tracks
- Added to Optuna search space [0, 5] for future joint optimisation
--- Co-authored-by: Claude Code <noreply@anthropic.com>
699e62f to
1bc1138
Compare
…disabled)
- Add _giou_matrix() helper and giou_blend param to ByteTrackTracker stage-1 cost
- giou_blend=0.0 default keeps metric at 58.753 (best found 0.32 gave +0.092%, below 0.1% threshold)
- Add giou_blend to search_space.json [0.0, 1.0] and optimize_tracking.py wiring
- Fix best_config.json trailing newline
--- Co-authored-by: Claude Code <noreply@anthropic.com>
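For orientation, a per-pair version of a GIoU blend might look like the sketch below. The commit does not state the blend formula, so the linear mix `(1 - giou_blend) * IoU + giou_blend * GIoU` is an assumption, as are the function names and the `(x1, y1, x2, y2)` box format; the GIoU definition itself (IoU minus the fraction of the enclosing box not covered by the union) is the standard one.

```python
def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    if union <= 0 or hull <= 0:
        return 0.0, 0.0  # degenerate boxes
    iou = inter / union
    return iou, iou - (hull - union) / hull


def blended_similarity(a, b, giou_blend=0.0):
    """Assumed linear blend; giou_blend=0.0 reduces to plain IoU."""
    iou, giou = iou_and_giou(a, b)
    return (1.0 - giou_blend) * iou + giou_blend * giou


print(iou_and_giou((0, 0, 2, 2), (1, 1, 3, 3)))
print(blended_similarity((0, 0, 1, 1), (2, 0, 3, 1), giou_blend=0.42))
```

Unlike IoU, GIoU stays informative for non-overlapping boxes (it goes negative as they move apart), which is one plausible reason a blend could help rank near-miss candidates.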
…earch)
- 1000-trial Optuna search over expanded search space (new: conf_cost_weight, stage2_min_updates, giou_blend)
- HOTA 58.753→58.862 (+0.185%), IDSW 297→269 (-9.4%)
- Key changes: high_conf_det_threshold 0.608→0.795, oru_threshold 14→0, Kalman looser (q_scale/r_scale ~14x), minimum_consecutive_frames 2→1, stage2_min_updates 5, giou_blend 0.396, conf_cost_weight 0.170
--- Co-authored-by: Claude Code <noreply@anthropic.com>
- HOTA 58.862→58.961 (+0.168%), IDSW 269→266, IDF1 71.365→71.730
- Optuna search was capped at stage2_min_updates≤5; manual scan found peak at 12 (cliff at 14+)
- Widen search_space.json high: 5→15 so future guard runs can explore the full range
--- Co-authored-by: Claude Code <noreply@anthropic.com>
--- Co-authored-by: Claude Code <noreply@anthropic.com>
- HOTA 58.961→59.031 (+0.119%), IDSW 266→262, IDF1 71.730→71.852
- max_interpolation_gap 45→48 (Optuna undershoot, true peak at 48)
- giou_blend 0.3963→0.42 (refined from 0.396 Optuna result)
- velocity_decay 0.827→0.82 (slight tightening of decay)
--- Co-authored-by: Claude Code <noreply@anthropic.com>
… vel=0.82) --- Co-authored-by: Claude Code <noreply@anthropic.com>
- HOTA 59.031→59.092 (+0.103%), IDSW 262→259, IDF1 71.852→71.993
- minimum_iou_threshold 0.1545→0.146 (Optuna undershoot near discontinuity)
- p_scale 1.756→2.5, q_scale 0.002819→0.003 (Kalman covariance fine-tuning)
--- Co-authored-by: Claude Code <noreply@anthropic.com>
…5, q_scale=0.003)
- bytetrack/sdp HOTA 59.031→59.092 (+0.103%): i19 default params confirmed
- Guard passed: bytetrack 59.031 (-0.000%), sort -0.000%, ocsort -0.208% (all within 0.5% threshold)
--- Co-authored-by: Claude Code <noreply@anthropic.com>
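The guard check quoted in this commit amounts to a relative-change test per tracker. A minimal sketch follows; the function names are hypothetical, and whether the real guard uses an inclusive or strict comparison at exactly 0.5% is not stated in the PR, so the inclusive form here is an assumption.

```python
def relative_change_pct(before, after):
    """Relative metric change in percent, e.g. HOTA 59.031 -> 59.092."""
    return 100.0 * (after - before) / before


def guard_passes(changes_pct, max_regression_pct=0.5):
    """True if no tracker regressed by more than max_regression_pct percent."""
    return all(delta >= -max_regression_pct for delta in changes_pct.values())


print(round(relative_change_pct(59.031, 59.092), 3))
print(guard_passes({"bytetrack": 0.0, "sort": 0.0, "ocsort": -0.208}))
```

The point of the guard is that a change tuned on one tracker (here ByteTrack) must not quietly degrade the others that share the same utilities.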
adbf00a to
acb15b0
Compare
- Add Journal › ByteTrack section: 10-row experiment table (kept iterations), collapsed descriptions block, code features table, failed experiments list, key lesson
- Fill SDP + autotrack + Optuna row: HOTA 59.092, IDF1 71.993, MOTA 66.977, IDSW 259
--- Co-authored-by: Claude Code <noreply@anthropic.com>
15fce38 to
cf03dd4
Compare
Summary
MOT tracker quality depends on two largely independent axes: algorithm design and hyperparameter tuning. Most published improvements conflate them: a well-tuned weaker algorithm routinely beats a poorly tuned stronger one, making it hard to isolate what actually matters. This PR separates the axes by adding `autotrack/`, an autonomous optimization loop for SORT, ByteTrack, and OC-SORT on MOT17.

The goal is both practical (better trackers, reproducible tuning) and scientific (the experiment log, including every reverted change, is itself a research artifact).
Approach
Three progressive layers build on each other:
Layer 1: SOTA trackers with solid defaults. The existing `trackers/core/` implementations of SORT, ByteTrack, and OC-SORT are already competitive out of the box. This layer is the foundation; `autotrack/` does not replace it.

Layer 2: Optuna extracts the best from the existing parameter surface. `optimize_tracking.py` runs an Optuna study over the tracker's exposed hyperparameters (Kalman noise scales, confidence thresholds, buffer sizes). No code changes, pure tuning. FRCNN results gain 1–2.5 HOTA points; SDP gains 2–4 points. This layer alone is useful as a standalone tuning tool and can be adopted without running the agent loop.

Layer 3: autotrack goes beyond tuning by making algorithmic improvements. This is the novel contribution. An autonomous agent iterates over structural code changes (state representation, association strategy, camera motion compensation, Kalman mechanics), measures HOTA at fixed default parameters after each change, keeps improvements, and reverts regressions. Optuna acts as a second-pass validator after each kept change to confirm the improvement is real and not a tuning artifact. The iteration log is JSONL and captures every attempt, kept or reverted.
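The keep-or-revert loop with a JSONL log can be sketched as below. Everything here is illustrative: the record fields, the function names, and the `measure` callables are hypothetical, and the 0.1% keep threshold is borrowed from a commit message in this PR rather than from `program.md`.

```python
import json


def log_iteration(log_path, record):
    """Append one iteration record as a single JSON line."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


def run_loop(candidates, baseline_hota, log_path, min_gain_pct=0.1):
    """Try each candidate change in order and keep it only if HOTA improves enough.

    candidates: list of (name, measure) pairs, where measure() returns the HOTA
    obtained with that change applied on top of everything kept so far.
    """
    best = baseline_hota
    for name, measure in candidates:
        hota = measure()
        gain_pct = 100.0 * (hota - best) / best
        kept = gain_pct >= min_gain_pct
        log_iteration(
            log_path,
            {"change": name, "hota": hota, "gain_pct": round(gain_pct, 3), "kept": kept},
        )
        if kept:
            best = hota  # the change stays; later candidates build on it
        # otherwise the change is reverted and the baseline is unchanged
    return best
```

Logging before the keep decision takes effect means reverted attempts are recorded alongside kept ones, matching the stated goal that the log covers every attempt.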
Two tools govern the loop:

- `optimize_tracking.py --n-trials 1`
- `optimize_tracking.py --n-trials N`: … `best_config.json`, validates tuned ceiling

The agent is explicitly permitted to update `optimize_tracking.py` as the tracker architecture evolves: adding parameters that newly exist, removing ones absorbed into the implementation, and tightening search ranges as knowledge accumulates.

Benchmarks
MOT17-val, full 7-sequence eval. `Defaults` = fixed params from `default_config.json`, no tuning. `+Optuna` = n=500 trials. `+autotrack + Optuna` = in progress.

FRCNN public detections (bundled, no GPU)

SDP public detections (bundled, no GPU)
Estimated ceiling with code improvements + Optuna on FRCNN: ~61.9 HOTA (vs ~56.0 for tuning alone), derived from the DetA/AssA decomposition — DetA is bounded by the detector (~0.57–0.62 for FRCNN), but AssA has substantial headroom from ~0.55 to ~0.65 via better association logic.
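The ceiling estimate can be reproduced approximately from the decomposition, since HOTA at a given localization threshold is the geometric mean of DetA and AssA. The sketch below applies that relation to single summary values, which is an approximation (the reported HOTA averages over thresholds). The DetA inputs 0.57 and 0.59 are illustrative picks from the quoted range; the PR does not say which values were used.

```python
import math


def hota_from_components(det_a, ass_a):
    """Approximate HOTA on a 0-100 scale from DetA and AssA on a 0-1 scale."""
    return 100.0 * math.sqrt(det_a * ass_a)


# Tuning alone: AssA stays near 0.55.
print(round(hota_from_components(0.57, 0.55), 1))
# Better association logic: AssA near 0.65, DetA still detector-bound.
print(round(hota_from_components(0.59, 0.65), 1))
```

Because DetA is capped by the detector, almost all of the gap between the two estimates comes from the AssA term, which is the part the agent's association changes can influence.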
Hard guarantees
Three invariants are enforced by `program.md` and cannot be relaxed by the agent: … `det/det.txt`. `gt/gt.txt` is never accessed at inference time. … `trackers.eval` only. `trackers/eval/` is out of scope for agent edits. The metric computation is identical across all iterations; the agent cannot move the goalposts.

Quick start
To run the autonomous agent loop, point any coding agent at `program.md`:

    claude > Read program.md and start the experiment loop.

References