fix: execute review: blocks on type: string step outputs (#350) by nhorton · Pull Request #351 · Unsupervisedcom/deepwork

nhorton · 2026-04-08T23:56:30Z

Summary

Fixes Bug: string-type output review blocks are silently not executed #350: review: blocks on type: string step outputs were silently ignored because the review pipeline only matched ReviewRule patterns against file paths.
String outputs with review blocks now produce synthetic ReviewTask objects carrying the value on a new ReviewTask.inline_content field, rendered into the instruction file as a "Content to Review" section.
Pass-caching still works: the review_id content hash mixes in the inline value, so distinct values get distinct cache keys (cache invalidates on change, persists when unchanged).

Why this matters

The job_yml DeepSchema actively directs authors toward type: string for transient inter-step data (no-temporary-files-for-transient-data). Authors following that guidance and also declaring a review: block on the string output were getting a misconfigured-but-silent quality gate. This fix closes that loophole so narrative summaries, counts, and status reports declared on string outputs are actually reviewed.

What changed

ReviewTask.inline_content: str | None field
build_string_output_review_tasks() in quality_gate.py, wired into run_quality_gate
Shared _build_preamble() helper (deduplicates common_job_info + inputs logic between file and string paths)
compute_review_id / _paths_component / _content_hash honor inline_content
build_instruction_file omits "Files to Review" for inline tasks and emits "Content to Review"
format_for_claude task names read review of inline content instead of review of 0 files (follow-up commit)
New requirements: JOBS-REQ-004.8, REVIEW-REQ-005.1.8, REVIEW-REQ-009.1.7
19 new tests (unit + end-to-end via run_quality_gate, plus formatter rendering)
Bespoke smoke-test job at .deepwork/jobs/test_job/ exercising all four review-level combinations (file_path + output-ref review, string + output-ref review, string + arg-level review, string + no review)

Test plan

uv run pytest — 1250 passed, 1 skipped, 1 xfailed
uv run ruff check on changed files — clean
uv run mypy on changed source files — clean
New end-to-end tests verify: string review produces instructions, value is inlined in the instruction file, pass-caching honors unchanged values, cache invalidates on value change
Manual smoke test via .deepwork/jobs/test_job/test_review_system workflow — produced 3 review tasks (file_path + 2 string outputs) plus the control string output correctly skipped. Verified "Content to Review" sections contain the actual string values and distinct review_id hashes for the two inline tasks.
/deepwork review on the branch — 23 review tasks all passed cleanly (python_code_review, test_file_quality, requirements_file DeepSchema, CHANGELOG DeepSchema, python_lint, requirements_traceability, etc.)

🤖 Generated with Claude Code

Previously, review: blocks declared on type: string outputs were silently dropped because the review pipeline only matched ReviewRule patterns against file paths. Authors got no warning and the review appeared correctly configured. This was especially problematic given that the job_yml DeepSchema directs authors toward type: string for transient inter-step data. String outputs with review blocks now produce synthetic ReviewTask objects with the value carried on a new ReviewTask.inline_content field. The value is rendered into the instruction file as a "Content to Review" section and mixed into the review_id content hash so each distinct value gets its own pass-cache key (cache invalidates on value change, persists when unchanged). Adds: - ReviewTask.inline_content field - build_string_output_review_tasks() in quality_gate, wired into run_quality_gate - Shared _build_preamble() helper to deduplicate common_job_info+inputs logic - Requirements: JOBS-REQ-004.8, REVIEW-REQ-005.1.8, REVIEW-REQ-009.1.7 - 18 new tests (unit + end-to-end via run_quality_gate) Fixes #350 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…iles") Follow-up to the issue #350 string-output review fix. The formatter's _task_name() used len(files_to_review) to describe task scope, which fell through to the multi-file branch for inline-content tasks and produced "review of 0 files" — cosmetic wart visible in the parallel-task listing. Also adds a bespoke test_job (.deepwork/jobs/test_job/) that exercises the full quality gate end-to-end with four outputs covering every review-level combination: file_path + output-ref review, string + output-ref review (NEW), string + arg-level review (NEW, inherited), and string + no review (control). Used as a manual smoke test when verifying the fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

CI runs `ruff format --check`, which my local `uv run ruff check` calls did not exercise (check runs the linter, not the formatter). Two lines I had manually broken up were re-joined by ruff format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…o-fix) Previously `make lint` ran `ruff format` and `ruff check --fix` — both in auto-fix mode — so it silently mutated files instead of reporting issues. CI, on the other hand, runs `ruff format --check` and `ruff check` in check-only mode, which is what actually fails a PR. That asymmetry meant a clean local `make lint` gave no guarantee that CI would pass: running it would just reformat files in-place, and the developer might miss the uncommitted changes. Splits into two targets: - `make lint` now mirrors CI (ruff format --check, ruff check, mypy). A clean run guarantees the CI Lint job will pass. - `make lint-fix` is the previous behavior (auto-fix formatter + linter) for convenient local cleanup. Caught while working on #350 — my local `ruff check` calls did not run the formatter at all, and even `make lint` would have only silently reformatted the files, leaving a surprise CI failure. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

nhorton and others added 4 commits April 8, 2026 17:54

nhorton added this pull request to the merge queue Apr 9, 2026

Merged via the queue into main with commit ce50d05 Apr 9, 2026
5 checks passed

nhorton deleted the fix/string-output-reviews-issue-350 branch April 9, 2026 00:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: execute review: blocks on type: string step outputs (#350)#351

fix: execute review: blocks on type: string step outputs (#350)#351
nhorton merged 4 commits intomainfrom
fix/string-output-reviews-issue-350

nhorton commented Apr 8, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nhorton commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why this matters

What changed

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

nhorton commented Apr 8, 2026 •

edited

Loading