📊 Copilot Token Usage Report 2026-04-04 #1657
Overview
Period: 2026-04-04T07:26Z to 2026-04-04T10:18Z
Runs analyzed: 4 Secret Digger (Copilot) runs (all had token data); 4 other Copilot-engine runs (no artifacts)
Total tokens: 752K across instrumented workflows
Estimated total cost: $1.16 (Copilot endpoint rates: $2.50/M input, $10/M output)
⚠️ Recurring failure pattern: The retry-loop failure in Secret Digger (Copilot) seen in the previous report has recurred. Run §23975973080 shows the identical stuck pattern: requests 2–7 all return `output_tokens = 2` at identical `input_tokens = 35,237`, costing 3.5× a normal run ($0.62 vs. $0.18). This is now a confirmed recurring issue.
📉 Documentation Maintainer absent: No `doc-maintainer` runs observed this period (previously $3.91/run). Total cost is significantly lower as a result.
Workflow Summary
| Workflow | Runs | Requests | Total Tokens | Est. Cost | Cache Rate | I/O Ratio | Top Model |
|---|---|---|---|---|---|---|---|
| Secret Digger (Copilot) | 4 | 15 | 752K | $1.16 | 33.6% | 498:1 | sonnet-4.6 |
| Total | 4 | 15 | 752K | $1.16 | 33.6% | 498:1 | |
🔍 Optimization Opportunities
- Secret Digger (Copilot): recurring retry loop (confirmed bug)
  - Second occurrence in two consecutive daily reports
  - Run §23975973080 (failure): 7 requests vs. normal 2
  - Requests 2–7: all `output_tokens = 2` and identical `input_tokens = 35,237`, a stuck loop producing no progress
  - Wasted cost: $0.62 vs. $0.18 normal (3.5× overhead); by request 3 the cache is fully warm (`cache_read = 35,236`, one token short of `input_tokens`), yet output doesn't improve
  - Recommendation: Add a max-iterations guard or detect when consecutive outputs are identical/minimal; escalate to workflow maintainer
- Cache hit rate plateau at ~30% for normal success runs
  - All 3 success runs show a 29.7% cache hit rate: consistent but not improving
  - Request 1 of each run starts cold (0 cache reads), request 2 hits ~29.5K cached tokens
  - The first request always incurs full input cost; caching only kicks in from request 2 onward
  - Recommendation: Evaluate whether the static system prompt prefix can be pre-warmed, or structure prompts to maximize the cached prefix length
- Cache write tokens remain 0 (provider limitation)
  - Consistent across all runs: the Copilot inference endpoint doesn't report `cache_write_tokens` separately
  - Cost estimates may be slightly low (missing cache write billing)
  - Status: Known limitation, no action needed
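The guard recommended for the retry loop could look roughly like the sketch below. It is not the workflow's actual code: the `Usage` record, the `is_stuck` and `run_with_guard` helpers, and the thresholds (`window=3`, `min_output=5`, `max_iterations=10`) are all hypothetical and would need tuning against real runs.

```python
from collections.abc import Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Token counts for one request (hypothetical record mirroring the report's fields)."""
    input_tokens: int
    output_tokens: int


def is_stuck(history: list[Usage], window: int = 3, min_output: int = 5) -> bool:
    """True when the last `window` requests share one input size and barely produce output."""
    if len(history) < window:
        return False
    recent = history[-window:]
    same_input = len({u.input_tokens for u in recent}) == 1
    tiny_output = all(u.output_tokens <= min_output for u in recent)
    return same_input and tiny_output


def run_with_guard(requests: Iterable[Usage], max_iterations: int = 10) -> tuple[int, str]:
    """Consume usage records until the run ends, stalls, or hits the iteration cap.

    Returns (requests_made, reason).
    """
    history: list[Usage] = []
    for usage in requests:
        if len(history) >= max_iterations:
            return len(history), "max_iterations"
        history.append(usage)
        if is_stuck(history):
            return len(history), "stuck"
    return len(history), "completed"


# Token counts of the failure run §23975973080 as listed in this report.
failure_run = [Usage(34_767, 405)] + [Usage(35_237, 2)] * 6
print(run_with_guard(failure_run))  # (4, 'stuck')
```

With these assumed thresholds the failure run would stop after request 4 instead of request 7. A smaller window stops sooner but raises the risk of aborting a run that legitimately emits a short reply.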
Per-Run Details
Secret Digger (Copilot)
| Run | Conclusion | Requests | Input | Output | Cache Read | Total | Cache Rate | Cost |
|---|---|---|---|---|---|---|---|---|
| §23976844295 | ✅ success | 2 | 69,845 | 277 | 29,501 | 99,623 | 29.7% | $0.18 |
| §23975973080 | ❌ failure | 7 | 246,189 | 417 | 205,677 | 452,283 | 45.5% | $0.62 |
| §23975039884 | ✅ success | 2 | 69,977 | 417 | 29,495 | 99,889 | 29.7% | $0.18 |
| §23974187599 | ✅ success | 2 | 69,980 | 445 | 29,499 | 99,924 | 29.7% | $0.18 |
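The Cost and Cache Rate columns above can be reproduced from the token counts. The report does not spell out its formulas, so the two below are inferred: cost as input and output tokens priced at the Overview rates (with cache reads not priced separately), and cache rate as `cache_read / (input + cache_read)`. Treat this as a consistency check under those assumptions, not as the analyzer's actual code.

```python
# Rates quoted in the Overview, converted to USD per token.
INPUT_RATE = 2.50 / 1_000_000
OUTPUT_RATE = 10.00 / 1_000_000

# (input, output, cache_read) per run, copied from the table above.
RUNS = {
    "23976844295": (69_845, 277, 29_501),
    "23975973080": (246_189, 417, 205_677),
    "23975039884": (69_977, 417, 29_495),
    "23974187599": (69_980, 445, 29_499),
}


def cost(input_tokens: int, output_tokens: int) -> float:
    """Assumed formula: cache reads are not priced separately."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE


def cache_rate(input_tokens: int, cache_read: int) -> float:
    """Assumed formula: share of cache reads in input + cache_read."""
    return cache_read / (input_tokens + cache_read)


for run_id, (inp, out, cached) in RUNS.items():
    print(f"{run_id}  ${cost(inp, out):.2f}  {cache_rate(inp, cached):.1%}")

rates = [cache_rate(inp, cached) for inp, _, cached in RUNS.values()]
mean_rate = sum(rates) / len(rates)  # unweighted mean of per-run rates
pooled_rate = cache_rate(
    sum(inp for inp, _, _ in RUNS.values()),
    sum(cached for _, _, cached in RUNS.values()),
)
total_cost = sum(cost(inp, out) for inp, out, _ in RUNS.values())
print(f"total ${total_cost:.2f}, mean {mean_rate:.1%}, pooled {pooled_rate:.1%}")
```

Under these assumptions the per-run figures match the table, and the 33.6% summary cache rate matches the unweighted mean of the four per-run rates. Pooling tokens across runs instead would give about 39.2%, because the failure run carries most of the cached tokens.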
Normal run profile (avg of 3 success runs):
- Requests: 2 (request 1 cold, request 2 warm)
- Tokens: ~99.8K (70K input, ~380 output, ~29.5K cache_read)
- Avg latency: ~6,300ms/request
- Cache hit rate: 29.7% (second request only)
- Cost per run: ~$0.18
Failure run analysis (7 requests):
- Request 1: 34,767 input, 405 output, 0 cache_read — normal start
- Requests 2–7: 35,237 input, 2 output, 29,497–35,236 cache_read — stuck loop
- Cache fully warm by request 3 but model output frozen at 2 tokens
- Additional 5 wasted calls: ~$0.44 overhead
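The ~$0.44 overhead follows from the per-request figures above. The short check below assumes the same pricing as the Overview and that cache reads are not billed separately; the variable names are illustrative only.

```python
# Rates quoted in the Overview, converted to USD per token.
INPUT_RATE = 2.50 / 1_000_000
OUTPUT_RATE = 10.00 / 1_000_000

# Per-request token counts from the failure run analysis.
first_request = (34_767, 405)   # (input, output) for request 1
stuck_request = (35_237, 2)     # (input, output) for each of requests 2-7
stuck_count = 6

# The per-request counts add up to the run totals in the Per-Run Details table.
total_input = first_request[0] + stuck_count * stuck_request[0]
total_output = first_request[1] + stuck_count * stuck_request[1]
assert (total_input, total_output) == (246_189, 417)

# A normal run makes 2 requests, so 5 of the 7 requests are extra.
extra_requests = 7 - 2
stuck_cost = stuck_request[0] * INPUT_RATE + stuck_request[1] * OUTPUT_RATE
overhead = extra_requests * stuck_cost
print(f"per stuck request: ${stuck_cost:.4f}, overhead: ${overhead:.2f}")
```

The result agrees with the difference between the failure run and a normal run ($0.62 minus $0.18).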
Workflows Without Token Data
The following Copilot-engine workflows ran but produced no agent-artifacts with token data:
| Workflow | Run | Conclusion | Reason |
|---|---|---|---|
| Smoke Copilot | §23974551292 | ❌ failure | No agent-artifacts uploaded (likely failed before AWF started) |
| Smoke Services | §23974551273 | ❌ failure | No agent-artifacts uploaded |
| Build Test | §23974551276 | ❌ failure | No agent-artifacts uploaded |
| Agentic Maintenance | §23975583864 | ✅ success | No agent-artifacts (may not use --enable-api-proxy) |
Note: The three failures (Smoke Copilot, Smoke Services, Build Test) all triggered from the same PR push at 07:48Z and appear to have failed at the AWF/infrastructure level rather than consuming LLM tokens.
Historical Trend
| Period | Workflow | Runs | Total Tokens | Cost | Cache Rate | Notes |
|---|---|---|---|---|---|---|
| 2026-04-04 | Secret Digger (Copilot) | 4 | 752K | $1.16 | 33.6% | 1 failure (retry loop) |
| 2026-04-03 | Secret Digger (Copilot) | 5 | 850K | $1.33 | 38.1% | 1 failure (retry loop) |
| 2026-04-03 | Documentation Maintainer | 1 | 3,003K | $3.91 | 48.7% | First appearance |
| 2026-04-03 | Total | 6 | 3,853K | $5.24 | 46.0% | |
Trend observation: Without the Documentation Maintainer running today, total cost dropped from $5.24 to $1.16 (-78%). The Secret Digger (Copilot) per-run cost is stable at ~$0.18 for successful runs. The retry-loop failure pattern has now occurred in two consecutive daily reports — this warrants immediate attention.
Previous Report
📊 Copilot Token Usage Report 2026-04-03
References:
- §23976844295 — Latest Secret Digger (Copilot) success
- §23975973080 — Failure run with retry loop
- §23976915564 — This report's workflow run
Generated by Daily Copilot Token Usage Analyzer