Add CLAUDE.md and initial .deepreview rule suite #2
Adds project guidance for future Claude Code instances and configures four DeepWork review rules so this repo dogfoods the action it ships:
- prompt_best_practices: reviews CLAUDE.md and prompts/review.txt against Anthropic prompt-engineering best practices.
- update_action_surface_docs: keeps README.md and CLAUDE.md in sync with action.yml, prompts/review.txt, scripts/post-review-comments.py, and the example workflow. Instructions enumerate 12 high-risk drift points (inputs table, end-to-end flow steps, embedded example workflow, security claims, caching path, comment grouping, no-commit guarantees, state-files JSON schema, bot-identity literal, repo layout, etc).
- python_code_review: reviews scripts/post-review-comments.py against conventions extracted from the existing code (.deepwork/review/python_conventions.md), with required DRY and comment-accuracy checks.
- suggest_new_reviews: meta-rule that proposes new review rules per change.

Also gitignores .deepwork/tmp/ since it's restored from GitHub Actions cache at runtime and regenerated by every /review run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Expand the "Versioning the action" section to explicitly document that v1 is a floating major-version tag that always points at the latest commit on main (standard GitHub Actions convention, e.g. actions/checkout@v4). Flags that release automation is planned and that, until then, v1 must be force-moved manually on every merge to main.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
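The manual force-move of the floating tag amounts to two git invocations. A minimal sketch in Python, assuming the remote is named `origin`; the helper name `floating_tag_commands` is hypothetical and not part of this repo, and it only builds the argument lists rather than running git:

```python
def floating_tag_commands(tag: str, commit: str, remote: str = "origin") -> list[list[str]]:
    """Return the git invocations that repoint a floating tag at `commit`.

    Hypothetical helper for illustration; it builds argv lists only and
    never executes git.
    """
    return [
        # -f lets git overwrite the existing local tag.
        ["git", "tag", "-f", tag, commit],
        # A forced push is needed because the remote tag already exists.
        ["git", "push", "--force", remote, f"refs/tags/{tag}"],
    ]


if __name__ == "__main__":
    for argv in floating_tag_commands("v1", "HEAD"):
        print(" ".join(argv))
```

Keeping the commands as data makes the sequence easy to review before anything is pushed; a release workflow could later run the same two steps automatically.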
The previous action.yml was wired to anthropics/claude-code-base-action@beta with plugin_marketplaces/plugins/claude_args inputs that never existed on that action, in any published release. At runtime those inputs were silently dropped, so the DeepWork plugin was never installed, --model and --dangerously-skip-permissions were never applied, and Claude was running the built-in /review slash command on default Sonnet with default permissions. The "review" was plain Claude freelancing with no .deepreview rule awareness.

Switch to anthropics/claude-code-action@v1, which:
- has real plugins: and plugin_marketplaces: inputs (newline-separated lists) documented in its action.yml
- has claude_args: for passing --model / --max-turns / etc.
- commits and pushes file edits back to the PR branch natively via use_commit_signing: false
- provides mcp__github_inline_comment__create_inline_comment as a native MCP tool, replacing the custom post-review-comments.py
- supports track_progress: true for a live progress comment on the PR
- has a floating v1 tag maintained by upstream

As a result this PR deletes a lot:
- scripts/post-review-comments.py (202 lines of diff-and-post logic)
- the Install uv composite step (dead code; nothing used uv)
- the Fetch base branch for git diff step (claude-code-action handles diffing itself)
- the Prepare review run step (no more /tmp/deepwork_changes.json contract)
- the Commit and push changes step (claude-code-action commits natively)
- the Post inline PR review comments step (native MCP tool)

prompts/review.txt loses the /tmp/deepwork_changes.json tracking schema and gains instructions to use the native inline-comment MCP tool with confirmed: true.

.github/workflows/example.yml simplifies: shallow fetch-depth: 1, id-token: write permission for OIDC, drops the misconfigured `if: github.actor != 'deepwork-action[bot]'` self-trigger guard (GITHUB_TOKEN-pushed commits don't retrigger workflows per GitHub's built-in rule; the guard was both wrong and unnecessary).

CLAUDE.md and README.md are rewritten to describe the new thin architecture. The .deepreview rule update_action_surface_docs has its drift-check list updated to pin the new contract (plugins, plugin_marketplaces, bot_name, claude_args, etc.) instead of the deleted contract (load_changes_by_file, /tmp/deepwork_changes.json schema, etc).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
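The silently dropped inputs described in this commit are the kind of drift a small pre-flight check could catch. The sketch below compares the keys a workflow passes under `with:` against the inputs an action declares. The function name and the declared-input set in the usage lines are illustrative assumptions, not the real input list of either upstream action:

```python
def unknown_inputs(declared: set[str], supplied: dict[str, str]) -> list[str]:
    """Return supplied `with:` keys that the action does not declare.

    A non-empty result flags values that would never reach the action,
    which is the failure mode this commit message describes.
    """
    return sorted(key for key in supplied if key not in declared)


if __name__ == "__main__":
    # Hypothetical declared inputs, for illustration only.
    declared = {"prompt", "anthropic_api_key"}
    supplied = {
        "prompt": "/review",
        "plugins": "deepwork",
        "plugin_marketplaces": "example/marketplace",
        "claude_args": "--max-turns 100",
    }
    print(unknown_inputs(declared, supplied))
```

In practice the declared set would come from the `inputs:` mapping of the pinned action's action.yml, and the supplied mapping from the step's `with:` block.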
7ae4379 to a10cd5f
The previous self-review loop couldn't test action.yml edits: the only workflow (.github/workflows/example.yml) pinned `uses: Unsupervisedcom/deepwork-action@v1`, which resolves to the v1 tag on main, not the PR branch. Every PR just kept running the old broken base-action composite from main, and committed a fresh output.txt execution log to the PR branch via the old commit step.

Split the two roles:
- examples/deepwork-review.yml: pristine copy-pasteable reference for external consumers. Pins @v1. Lives outside .github/workflows/ so GitHub doesn't auto-execute it (it's documentation, not CI).
- .github/workflows/self-review.yml: this repo's own CI. `uses: ./` so the PR branch's own action.yml is exercised. This is how action.yml edits actually get tested before being tagged.

README now links to examples/deepwork-review.yml. CLAUDE.md Repository Layout and Testing Changes sections are updated to explain the split. .deepreview's update_action_surface_docs rule now tracks examples/deepwork-review.yml instead of the old .github path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
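The distinction between the two workflow files comes down to how the `uses:` value is resolved. A minimal sketch in Python, assuming the job has already checked out the PR branch; the function name is hypothetical:

```python
def runs_pr_branch_code(uses: str) -> bool:
    """True when a step's `uses:` value points at the checked-out workspace.

    A value starting with "./" is a local path, so the action definition
    comes from whatever commit the job checked out. An owner/repo@ref
    value is fetched at the named ref instead.
    """
    return uses.startswith("./")


if __name__ == "__main__":
    for ref in ("./", "Unsupervisedcom/deepwork-action@v1"):
        print(ref, runs_pr_branch_code(ref))
```

A check like this could guard the self-review workflow against being switched back to a pinned reference by accident.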
fa59066 to b751619
Claude encountered an error —— View job DeepWork Review
First real run of claude-code-action@v1 from the PR branch got blocked on both permissions and turn budget:
- The upstream action configures a read-only allowedTools list for pull_request events (Glob, Grep, LS, Read, git add/commit/push, update_claude_comment, CI tools). No Edit/Write/MultiEdit/Task or mcp__github_inline_comment__create_inline_comment. Claude tried to edit files, hit permission_denials_count: 2, and burned turns fighting denials. Pass --dangerously-skip-permissions via claude_args to bypass the allowlist; this now actually works because the real claude-code-action (unlike the base action) accepts claude_args.
- Hit error_max_turns at 51 turns. DeepWork /review dispatches parallel sub-agent Tasks for each rule, and each reviewer eats turn budget. Default max_turns=50 is tight for multi-rule repos. Bump to 100.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
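Both fixes in this commit end up as flags inside the single `claude_args` string. A minimal sketch of assembling that string in Python, using only the flags named in this PR (`--max-turns`, `--dangerously-skip-permissions`, `--model`); the helper name and its parameters are hypothetical:

```python
import shlex
from typing import Optional


def build_claude_args(
    max_turns: int,
    skip_permissions: bool = False,
    model: Optional[str] = None,
) -> str:
    """Assemble a claude_args string from the flags named in this PR."""
    args = ["--max-turns", str(max_turns)]
    if skip_permissions:
        args.append("--dangerously-skip-permissions")
    if model is not None:
        args += ["--model", model]
    # shlex.join quotes any value that contains shell-special characters.
    return shlex.join(args)


if __name__ == "__main__":
    print(build_claude_args(100, skip_permissions=True))
```

Building the string from named parameters keeps the turn budget and the permission bypass visible as separate decisions rather than burying them in one literal.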
Claude finished @nhorton's task in 8m 55s —— View job DeepWork Review
Changes applied across 2 commits:
Review results by rule:
- prompts/review.txt: Add false-positive escape valve to rule 2; add two-cycle convergence limit to rule 3; add rule 5 prohibiting git write commands; add example inline comment format; strengthen commits/pushes section
- CLAUDE.md: Document --dangerously-skip-permissions in end-to-end flow; update prompt contract to reflect strengthened rules
- .deepwork/review/python_conventions.md: Remove stale references to deleted scripts/post-review-comments.py
- .deepreview: Fix stale "change-tracking JSON contract" reference; add tool-unavailable fallback to suggest_new_reviews
- .gitignore: Generalize scripts/__pycache__/ to __pycache__/

Co-authored-by: Noah Horton <undefined@users.noreply.github.com>
The previous commit added a 5th guarantee to CLAUDE.md's prompt contract section but did not update the .deepreview drift check that counts them.

Co-authored-by: Noah Horton <undefined@users.noreply.github.com>
The first successful claude-code-action@v1 run made substantive autofix commits but never invoked mcp__github_inline_comment__create_inline_comment (or if it did, the comments never appeared on the PR). The upstream action's SDK wrapper hides Claude's tool calls by default (show_full_output=false) and buffers/classifies inline comments before posting them (classify_inline_comments=true), so we can't tell which path is at fault from the log alone.

Flip both while we diagnose:
- show_full_output: true. Exposes Claude SDK tool calls in the runner log. WARNING: may expose secrets in tool output; only safe in trusted repos. Revert to false once the inline-comment path is working.
- classify_inline_comments: false. Post every inline comment Claude creates instead of buffering and classifying real-vs-probe. This rules out the classifier dropping them silently.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Claude finished @nhorton's task in 7m 50s —— View job DeepWork Review
Changes applied:
Review results by rule:
The previous prompt rule 5 ("NEVER run git write commands — the wrapping
GitHub Action handles all VCS operations") was based on a wrong mental
model of anthropics/claude-code-action@v1. That action does NOT
auto-commit; it pre-allows a small set of git tools and the upstream
system prompt expects Claude to commit and push itself. Our rule 5 told
Claude not to, so the most recent run made file edits, reported them as
"applied" in the tracking comment, and never persisted them to the branch.
Reverse it: Claude MUST commit and push using
- git add <files>
- git commit -m "<message>"
- /home/runner/work/_actions/anthropics/claude-code-action/v1/scripts/git-push.sh origin HEAD
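The three steps listed above can be sketched as a fixed command sequence. This Python helper is hypothetical (it is not in the repo) and only builds the argument lists; the push script path is copied verbatim from this commit message and is assumed to exist only on the GitHub-hosted runner:

```python
GIT_PUSH_SCRIPT = (
    "/home/runner/work/_actions/anthropics/claude-code-action/v1/scripts/git-push.sh"
)


def persist_commands(files: list[str], message: str) -> list[list[str]]:
    """Return the add/commit/push invocations in the order the prompt requires.

    Hypothetical helper; it builds argv lists and never runs them.
    """
    if not files:
        # Without staged files the commit step would have nothing to record.
        raise ValueError("nothing to commit: no files given")
    return [
        ["git", "add", *files],
        ["git", "commit", "-m", message],
        [GIT_PUSH_SCRIPT, "origin", "HEAD"],
    ]


if __name__ == "__main__":
    for argv in persist_commands(["CLAUDE.md"], "Apply review fixes"):
        print(argv)
```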
Other changes in the same prompt rewrite:
- Drop the entire "Posting inline PR comments" section. The upstream
claude-code-action@v1 system prompt explicitly forbids creating new
PR comments on pull_request events ("Never create new comments. Only
update the existing comment using mcp__github_comment__update_claude_comment")
— our prompt was fighting it. The mcp__github_inline_comment server
isn't loaded in this action's context anyway. Lean into the tracking
comment as the sole output surface.
- Add a new "Reporting your work" section instructing Claude to use the
upstream-managed tracking comment for per-rule findings and commit SHAs.
action.yml: revert show_full_output and classify_inline_comments debug
flags to upstream defaults. show_full_output=true was leaking secrets
into public runner logs.
README.md and CLAUDE.md: rewrite "Review Comments"/"How It Works" to
describe the tracking-comment-only output surface and Claude's commit
responsibility. Add a "Known issues" section to CLAUDE.md documenting
the DeepWork plugin MCP server failing to start in the action's runner
(plugin installs successfully but MCP server reports status: failed).
Also document the PR file restoration security feature.
.deepreview: update update_action_surface_docs drift checks #6 and #12
to reflect the new contract (single tracking comment, no inline; Claude
commits via git tools, no "never run git" rule).
.claude/settings.json: commit the plugin-enabled settings file generated
by /plugin so the claude-code-action runner sees enabledPlugins after
the file restoration step. Speculative MCP-loading fix; backed up by
a research agent dispatched in parallel.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Capture the full investigation of the DeepWork plugin MCP server failing to start inside anthropics/claude-code-action@v1, including:
- Symptom (plugin install reports success; mcp_servers init payload reports plugin:deepwork:deepwork status: failed; silent, no error message; same plugin works fine outside CI).
- Why slash commands still work (/review is a skill file, no MCP needed) vs. what's missing (get_configured_reviews, mark_review_as_passed, start_workflow, DeepSchema validation, the workflow state machine).
- Root-cause hypotheses ranked by probability:
  1. 70%: PR file restoration wipes plugin MCP registration
  2. 60%: No automatic plugin → session MCP merge path
  3. 30%: MCP_TIMEOUT/MCP_TOOL_TIMEOUT empty env vars
- Three open upstream issues that match our exact symptoms:
  - anthropics/claude-code-action#813 (silent MCP failures)
  - anthropics/claude-code-action#1004 (--mcp-config silently dropped)
  - anthropics/claude-code-action#95 (no plugin → session MCP merge path)
- Definitive diagnostic experiment to confirm root cause #1.
- Speculative fix logic for the .claude/settings.json file added in the rule-5 reversal commit (enables plugin at project scope; only effective for PRs opened after the file lands on main, because PR file restoration pulls from origin/main).
- BLOCKING status: PR parked as draft until upstream fixes land.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
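Hypothesis 3 rests on the difference between an unset variable and one set to an empty string. The Python sketch below illustrates that general class of bug only; how Claude Code actually reads MCP_TIMEOUT is not stated in this PR, and the default value and function names here are hypothetical:

```python
DEFAULT_TIMEOUT_MS = 30_000  # hypothetical default, for illustration only


def parse_timeout_naive(env: dict[str, str], name: str) -> int:
    """Treats only a missing key as unset; an empty string reaches int()."""
    return int(env.get(name, DEFAULT_TIMEOUT_MS))


def parse_timeout_safe(env: dict[str, str], name: str) -> int:
    """Treats a missing key and an empty or blank value the same way."""
    raw = env.get(name, "").strip()
    return int(raw) if raw else DEFAULT_TIMEOUT_MS


if __name__ == "__main__":
    env = {"MCP_TIMEOUT": ""}  # set but empty, as in the hypothesis
    print(parse_timeout_safe(env, "MCP_TIMEOUT"))
    try:
        parse_timeout_naive(env, "MCP_TIMEOUT")
    except ValueError as exc:
        print("naive parse failed:", exc)
```

If the diagnostic experiment points at this hypothesis, the corresponding check is whether the runner exports the variables as empty strings rather than leaving them unset.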
I'll analyze this and get back to you.
Parking this PR as draft: blocked on upstream MCP issues

The action ran end-to-end successfully in cycles 4 and 5 of this PR (commit

Symptom

…but the Claude Code session init payload reports the plugin's MCP server as failed, while the two GitHub MCPs that ship with

Silent failure: no stack trace, no `command not found`, no `could not start`. The same plugin works fine outside CI.

What still works vs what's missing

So the CI reviews running today are a degraded form: file-edit-based, no quality gates, no DeepSchema validation, no orchestrated workflow. They produce useful autofixes but skip all the structural-integrity guarantees that are the whole point of DeepWork.

Root-cause hypotheses (ranked by probability)

Open upstream issues that match exactly

Other findings worth knowing about

Status

PR converted to draft until at least one of the upstream issues is resolved (or until we run the diagnostic experiment described in `CLAUDE.md` and submit a targeted upstream PR). Two runs (`24168751413`, `24168943402`) were cancelled because they would have hit the same MCP failure; no point burning more cycles until there's a fix to test. Full investigation details, the diagnostic experiment, and the rationale for the speculative fix are in the Known issues section of CLAUDE.md.

Summary
- `CLAUDE.md` with end-to-end flow documentation, the three state files crossing process boundaries (action.yml ↔ prompts/review.txt ↔ post-review-comments.py), the self-trigger guard, and how to test changes, so future Claude Code instances ramp up fast in this repo.
- `.deepreview` with four review rules so this repo dogfoods the action it ships:
  - `prompt_best_practices`: reviews `CLAUDE.md` and `prompts/review.txt` against Anthropic prompt-engineering best practices.
  - `update_action_surface_docs`: keeps `README.md` and `CLAUDE.md` in sync with `action.yml`, `prompts/review.txt`, `scripts/post-review-comments.py`, and `.github/workflows/example.yml`. Instructions enumerate 12 concrete drift points rather than vague "look for inaccuracies".
  - `python_code_review`: reviews Python files against `.deepwork/review/python_conventions.md` (extracted from `post-review-comments.py`'s observable patterns), with required DRY and comment-accuracy checks.
  - `suggest_new_reviews`: meta-rule that proposes new review rules per changeset.
- `.deepwork/tmp/` (restored from GitHub Actions cache at runtime, regenerated by every `/review` run; not source).

Design notes
- `python_conventions.md` reflects patterns observable in the existing code rather than a generic style guide.

Known issue surfaced
The DeepSchema PostToolUse hook for `**/.deepreview` reports `File is not valid JSON: Expecting value: line 1 column 1 (char 0)` on every Write/Edit. The file is valid YAML and conforms to `deepreview_schema.json`; the validator appears to be calling `json.loads()` on a YAML file. False positive, file lands on disk correctly. Reported separately.

Test plan
- `update_action_surface_docs` triggers on changes to any of its 6 match patterns
- `python_code_review` triggers on `scripts/post-review-comments.py` changes
- `suggest_new_reviews` produces an empty (or near-empty) suggestion list on this initial run
- `.deepwork/tmp/` is correctly excluded from the commit history going forward

🤖 Generated with Claude Code
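The validator symptom described under "Known issue surfaced" can be reproduced in isolation: feeding YAML text to Python's `json.loads()` yields exactly the reported message. A minimal sketch, in which the YAML content is a hypothetical stand-in for the real `.deepreview` file and the helper name is illustrative:

```python
import json

# Hypothetical stand-in for the real .deepreview contents.
DEEPREVIEW_YAML = "rules:\n  - name: prompt_best_practices\n"


def json_error_for(text: str) -> str:
    """Return the json.loads() error message for `text`, or "" if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return str(exc)
    return ""


if __name__ == "__main__":
    print(json_error_for(DEEPREVIEW_YAML))
    # → Expecting value: line 1 column 1 (char 0)
```

An empty string produces the same message, so the error text alone does not distinguish "YAML parsed as JSON" from "file read as empty"; that may be worth ruling out when the issue is reported upstream.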