[copilot-cli-research] Copilot CLI Deep Research - 2026-05-02 #29682

2026-05-02T04:56:29Z

github-actions[bot]
Bot May 2, 2026

Analysis Date: 2026-05-02
Repository: github/gh-aw
Scope: 207 total workflows, ~116 using Copilot engine (89 simple form + 22 + 5 object form; ~56%)
Previous Run: §25134300030 (2026-04-29)

📊 Executive Summary

Research Topic: Copilot CLI Optimization Opportunities — 5th Trend-Tracked Analysis
Key Findings: (1) startup-timeout and tool-timeout remain at 0% for 12+ consecutive runs, the most persistent critical gap; (2) max-continuations is still near zero (~2 workflows) despite being Copilot-exclusive; (3) 5 of the 11 custom agent files are defined but unused; (4) mcp-scripts counts have fluctuated between runs; (5) over half of all workflows now use Copilot, but engine.harness and engine.api-target remain completely unused (engine.bare appears in only ~8 workflows).
Primary Recommendation: Enable tools.startup-timeout and tools.timeout in long-running workflows. At two configuration lines per workflow, they provide cheap resilience against MCP server startup failures and runaway tool executions.

Copilot remains the dominant engine at ~56% of workflows (up from 43% under the previous simple-count methodology). The repository continues to grow (207 total workflows, +2 from the last run), and safe-outputs adoption is strong. However, several Copilot-exclusive capabilities remain completely untapped after 12+ analysis cycles, which suggests documentation gaps or lack of visibility rather than intentional avoidance.

The most impactful untapped feature continues to be tool and startup timeouts: each requires a single configuration line, and together they protect against the most common class of workflow hangs. The second most impactful gap is max-continuations for complex long-running tasks that currently fail when the agent runs out of turns.

Critical Findings

🔴 High Priority Issues

1. Zero startup-timeout Usage (12th Consecutive Run)
Not a single Copilot workflow uses tools.startup-timeout to guard against MCP server initialization failures. When an MCP server fails to start, the job hangs until the timeout-minutes deadline, wasting compute and runner time on every failure.

2. Zero tool-timeout Usage (12th Consecutive Run)
Similarly, no workflow sets tools.timeout to cap individual tool call duration. This means a single slow tool call (e.g., a network-fetching MCP tool under degraded conditions) can exhaust the entire job timeout.

3. max-continuations Near-Zero (Copilot-Exclusive Feature)
Only ~2 workflows use max-continuations despite it being the only way to handle tasks that require more work than a single agent session allows. Many complex daily/weekly workflows (code metrics, security analysis, architecture review) could benefit from autopilot multi-run mode.

🟡 Medium Priority Opportunities

4. 5 Custom Agent Files Defined but Never Deployed
.github/agents/ contains 11 agent files; 6 are being used (awf, technical-doc-writer, contribution-checker, agentic-workflows, adr-writer, developer.instructions), but 5 remain completely idle:

grumpy-reviewer.agent.md
w3c-specification-writer.agent.md
create-safe-output-type.agent.md
custom-engine-implementation.agent.md
interactive-agent-designer.agent.md

5. mcp-scripts Adoption Unstable
Previous runs showed 6 workflows using mcp-scripts; this run's simple ^mcp-scripts: pattern matched 0. The discrepancy suggests a difference in counting methodology rather than a real drop, and the tables in this report keep the ~6 figure. The feature is powerful for inline custom tools but underutilized relative to its capabilities.

6. ~37 Workflows Without strict: true
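Among Copilot workflows, 37/116 (~32%) lack strict: true. Counts vary by method: the inventory below reports ~53-79 workflows with strict: true, so between ~37 and ~63 lack it. While not all workflows need it, those that use GitHub tools or handle PR/issue content benefit from the additional security controls.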

1️⃣ Current State Analysis

Copilot CLI Capabilities Inventory

Engine Configuration (engine: object)

| Field | Description | Usage |
| --- | --- | --- |
| `engine.id: copilot` | Object form for extended config | ~22 workflows |
| `engine.version` | Pin specific CLI version | ~0 workflows |
| `engine.model` | Override model selection | ~10 workflows |
| `engine.agent` | Custom agent file from `.github/agents/` | ~13 workflows |
| `engine.args` | Additional CLI arguments | ~0 direct uses |
| `engine.env` | Custom environment variables | ~10 workflows |
| `engine.bare` | Disable custom instructions (`--no-custom-instructions`) | ~8 workflows |
| `engine.harness` | Custom Node.js harness script | 0 workflows |
| `engine.api-target` | Custom API endpoint (GHEC/GHES) | 0 workflows |
| `engine.mcp.session-timeout` | MCP gateway session duration | 0 workflows |
| `engine.mcp.tool-timeout` | MCP gateway tool call timeout | 0 workflows |

Execution & Autonomy

| Feature | Description | Usage |
| --- | --- | --- |
| `max-continuations` | Autopilot multi-run (Copilot-only) | ~2 workflows |
| `sandbox.agent: awf` | Network firewall sandbox | ~11 workflows |
| `sandbox.agent: srt` | Secure runtime sandbox | 0 workflows |

Tool & MCP Configuration

| Feature | Description | Usage |
| --- | --- | --- |
| `tools.startup-timeout` | MCP server startup deadline | 0 workflows |
| `tools.timeout` | Per-tool call deadline | 0 workflows |
| `mcp-scripts` | Inline custom MCP tools (JS/shell/Python) | ~6 workflows |
| `tools.cache-memory` | Persistent cross-run memory | ~19-79 workflows |

Safety & Quality

| Feature | Description | Usage |
| --- | --- | --- |
| `strict: true` | Enhanced security mode | ~53-79 workflows |
| `features.copilot-requests` | Token usage tracking | ~38-43 workflows |
| `network.allowed` | Network access control | ~45 workflows |

Usage Statistics (Current Run)

Total Workflows: 207 markdown files
Copilot Workflows: ~116 (~56% — simple engine: copilot = 89, plus id: copilot object form = 22+)
Claude Workflows: ~47
Codex Workflows: ~10
Default Engine (copilot): ~37 workflows without explicit engine declaration

Most Used Configurations:

timeout-minutes: 30 — 23 workflows (most common)
timeout-minutes: 20 — 20 workflows
tools.github mode: gh-proxy — 56 workflows
toolsets: [default] — 14 workflows (over-provisioned)
imports — 70 workflows

Timeout Distribution:

5 min: 7, 10 min: 15, 15 min: 15, 20 min: 20, 25 min: 2, 30 min: 23, 45 min: 6, 60 min: 3
2 workflows have NO timeout set (risky)

2️⃣ Feature Usage Matrix

| Feature Category | Available Features | Used | Not Used | Usage Rate |
| --- | --- | --- | --- | --- |
| CLI Flags (auto) | `--add-dir`, `--autopilot`, `--no-custom-instructions`, `--disable-builtin-mcps` | via config | n/a | Auto-generated |
| Engine Config | id, version, model, agent, args, env, bare, harness, api-target | 5/9 | harness, api-target, version, args | ~56% |
| MCP Timeouts | startup-timeout, tool-timeout, session-timeout | 0/3 | All | 0% |
| max-continuations | 1 feature | ~2 | ~114 | 2% |
| Sandbox | awf, srt | 1/2 (awf) | srt | 50% |
| Custom Agents | 11 files | 6 | 5 unused | 55% |
| mcp-scripts | 1 feature | ~6 | ~110 | ~5% |
| strict | 1 feature | ~53 | ~63 | 46% |
| copilot-requests | 1 feature | ~38 | ~78 | 33% |
| network control | allowed list | ~45 | ~71 | 39% |

3️⃣ Missed Opportunities

🔴 High Priority

Opportunity 1: MCP Tool & Startup Timeouts

What: tools.startup-timeout caps MCP server initialization time; tools.timeout caps individual tool call duration
Why It Matters: Without these, a single failing MCP server or slow tool call can hang a job until timeout-minutes expires, wasting minutes of runner time and causing false workflow failures
Where: Any workflow using MCP servers (github:, brave:, playwright:, etc.) — that's nearly all copilot workflows
How to Implement: Add to frontmatter:

```yaml
tools:
  startup-timeout: 30s   # fail fast if MCP server doesn't start in 30s
  timeout: 2m            # cap each tool call at 2 minutes
  github:
    toolsets: [default]
```

Impact: Dramatically improved failure visibility and reduced wasted runner minutes
Effort: 2-line change per workflow

Opportunity 2: `max-continuations` for Complex Long Tasks

What: Enables Copilot autopilot mode — the agent runs multiple consecutive sessions, each picking up where the last left off
Why It Matters: Tasks like deep architecture analysis, security review, or weekly reports may legitimately need more turns than a single session allows. Without this, the agent silently truncates work.
Where: architecture-guardian.md, security-review.md, daily-code-metrics.md, copilot-opt.md, any workflow with timeout-minutes: 45+
How to Implement:

```yaml
max-continuations: 3   # up to 3 consecutive autopilot runs
timeout-minutes: 45    # total budget across all continuations
```

Impact: Complex tasks complete fully instead of being silently truncated

🟡 Medium Priority

Opportunity 3: Deploy Unused Custom Agent Files

5 custom agent files in .github/agents/ are not referenced by any workflow in this run:

| Agent File | Potential Use Case |
| --- | --- |
| `grumpy-reviewer.agent.md` | Adversarial PR review workflow |
| `w3c-specification-writer.agent.md` | Spec generation workflows |
| `create-safe-output-type.agent.md` | Scaffolding automation |
| `custom-engine-implementation.agent.md` | Engine testing/validation |
| `interactive-agent-designer.agent.md` | Workflow design assistance |

These represent significant invested work with zero return. Either create workflows to use them or prune them to reduce maintenance burden.
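
One way to put a dormant agent to work is to reference it from a workflow's engine block. The fragment below is a minimal sketch: engine.agent is the field listed in the capabilities inventory, but the value format shown (a bare name resolving to .github/agents/grumpy-reviewer.agent.md) is an assumption and should be checked against the engine documentation.

```yaml
engine:
  id: copilot
  # Assumption: the agent is referenced by name, without the .agent.md suffix.
  agent: grumpy-reviewer
```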

Opportunity 4: Missing `strict: true` on 37 Workflows

Workflows accessing repository content, PRs, or issues without strict: true are more vulnerable to prompt injection. Recommend adding strict: true to all workflows that:

Use tools.github with write-capable toolsets
Process untrusted input (issue bodies, PR descriptions)
Use safe-outputs that can create/modify issues/PRs

Key workflows currently missing strict mode: agent-performance-analyzer.md, breaking-change-checker.md, code-scanning-fixer.md, dead-code-remover.md

Opportunity 5: `features.copilot-requests: true` for All Copilot Workflows

Only ~33% of Copilot workflows track token consumption via features.copilot-requests: true. This metric is essential for cost attribution and for understanding which workflows consume the most Copilot resources. It should be added to all Copilot workflows as a baseline.
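
In frontmatter, the dotted name features.copilot-requests presumably expands to a nested key. A minimal sketch, assuming that expansion:

```yaml
features:
  copilot-requests: true   # track token usage for this workflow
```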

Opportunity 6: GitHub Toolset Scoping Improvements

Many workflows use toolsets: [default] which grants broad GitHub access. Workflows should use the minimum required toolset:

Read-only issue analysis → toolsets: [issues]
PR workflows → toolsets: [pull_requests]
Repo browsing only → toolsets: [repos]
Creating issues/comments → toolsets: [issues] (safe-outputs handles writes)

Overly broad toolsets increase the blast radius of a compromised workflow.
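
As a sketch of the scoping described above, a workflow that only triages issues might narrow its GitHub tools as follows. The toolset names are the ones used in this report; which one a given workflow needs has to be confirmed against the tools it actually calls.

```yaml
tools:
  github:
    toolsets: [issues]   # previously [default]
```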

🟢 Low Priority

Opportunity 7: Engine Version Pinning for Critical Workflows

Production workflows (copilot-opt.md, security-review.md, daily analysis workflows) could benefit from pinning the Copilot CLI version to ensure reproducible behavior after CLI updates.

```yaml
engine:
  id: copilot
  version: "0.0.422"   # pin for reproducibility
```

Opportunity 8: Custom Harness Script (`engine.harness`)

For power users: engine.harness allows replacing the built-in Node.js harness with a custom script. Zero usage currently. Could be valuable for:

Adding pre/post execution hooks
Custom retry logic beyond the default harness
A/B testing harness behavior

Opportunity 9: Two Missing Timeouts

2 workflows have no timeout-minutes set at all. Every workflow should set an explicit timeout:

Review and confirm which workflows these are
Add conservative timeouts (30-45 min for analysis workflows)

4️⃣ Specific Workflow Recommendations

`architecture-guardian.md`

Current: timeout-minutes: 20, no max-continuations
Recommended: Add max-continuations: 2 for deep analysis runs, add tools.startup-timeout: 30s
Expected: Complete architecture scans on large commit batches
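
Applied to the frontmatter, this recommendation might look like the sketch below. Key placement follows the snippets earlier in this report, and the rest of the workflow's frontmatter is omitted.

```yaml
timeout-minutes: 20      # current value, unchanged
max-continuations: 2     # up to 2 consecutive autopilot runs
tools:
  startup-timeout: 30s   # fail fast if an MCP server does not start
```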

`copilot-opt.md`

Current: Good base config (strict, network, github tools), timeout-minutes: 30
Recommended: Add features.copilot-requests: true, add tools.timeout: 3m
Expected: Better cost tracking for the workflow that optimizes others
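
A possible frontmatter fragment for these two additions, assuming the new keys are merged into the existing tools block rather than replacing it:

```yaml
timeout-minutes: 30        # current value, unchanged
features:
  copilot-requests: true   # token usage tracking
tools:
  timeout: 3m              # cap each tool call at 3 minutes
```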

`security-review.md`

Current: Uses sandbox AWF but missing tools.startup-timeout
Recommended: Add tools.startup-timeout: 45s, add max-continuations: 2
Expected: More complete security reviews without silent truncation
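
Sketched as frontmatter, with the existing sandbox setting shown only for context (its exact form in security-review.md is assumed from the capabilities inventory):

```yaml
sandbox:
  agent: awf             # already in place per this report
max-continuations: 2     # allow a second autopilot run for long reviews
tools:
  startup-timeout: 45s   # MCP server startup deadline
```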

`daily-code-metrics.md`

Current: No strict mode, no copilot-requests tracking
Recommended: Add strict: true, features.copilot-requests: true, tools.startup-timeout: 30s
Expected: Improved security posture and cost visibility
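
The three additions could be combined as in this sketch; any existing tools entries in daily-code-metrics.md would stay alongside the new key.

```yaml
strict: true               # enhanced security mode
features:
  copilot-requests: true   # token usage tracking
tools:
  startup-timeout: 30s     # MCP server startup deadline
```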

5️⃣ Trends & Insights

Historical Trends (5 Runs)

| Feature | Run 1 (Apr 20) | Run 2 (Apr 21) | Run 3 (Apr 25) | Run 4 (Apr 29) | Run 5 (May 2) |
| --- | --- | --- | --- | --- | --- |
| Total Workflows | ~197 | ~197 | ~202 | 205 | 207 |
| Copilot Workflows | ~111 | ~111 | ~91+ | ~110 | ~116 |
| `startup-timeout` | 0 | 0 | 0 | 0 | 0 (12th run) |
| `tool-timeout` | 0 | 0 | 0 | 0 | 0 (12th run) |
| `max-continuations` | ~2 | ~2 | ~2 | ~2 | ~2 |
| `mcp-scripts` | 1 | 1 | 6 | 6 | ~6 |
| `engine.bare` | ~8 | ~8 | ~8 | ~8 | ~8 |
| `engine.model` | ~10 | ~10 | ~10 | ~10 | ~10 |
| `cache-memory` | n/a | n/a | ~49 | ~79 | ~19-79* |
| `sandbox.awf` | ~11 | ~11 | ~11 | ~17 | ~11 |
| Custom agent files used | 5 | 7 | 10 | 6 | 6 |

*cache-memory count varies by counting methodology (direct cache-memory: vs. all forms)

Key Trend: startup-timeout and tool-timeout are now the longest-standing unaddressed gap — 12 consecutive analysis cycles with 0% adoption. This is a clear signal for a targeted nudge or documentation improvement.

Positive Trend: mcp-scripts adoption jumped from 1 to 6 between Runs 2 and 3, suggesting the feature is becoming better known. cache-memory and repo-memory usage are both growing.

Concerning: 5 custom agent files remain unused in this run, unchanged from Run 4 (6 of 11 files in use in both runs).

6️⃣ Best Practice Guidelines

Always set tool timeouts: Add tools.startup-timeout: 30s and tools.timeout: 2m to every workflow using MCP servers. This is the single highest-ROI configuration change.
Match toolsets to actual needs: Use toolsets: [issues] instead of toolsets: [default] when you only need issue access. Principle of least privilege applies to GitHub tool access too.
Add features.copilot-requests: true to all Copilot workflows: Token visibility is free — there's no reason not to enable it on every workflow.
Use max-continuations for complex tasks: Any workflow with timeout-minutes: 45+ that doesn't use max-continuations is likely silently truncating work. Add max-continuations: 2-3 to allow completion.
Prune or activate dormant agent files: If a custom agent file hasn't been used after 5+ analysis cycles, either create a workflow to use it or remove it to reduce cognitive overhead.
Enable strict: true on all workflows that touch untrusted content: Default to secure; opt out only when there's a specific reason.

7️⃣ Action Items

Immediate Actions (this week):

Add tools.startup-timeout: 30s to the 10 most critical Copilot workflows
Add features.copilot-requests: true to all workflows missing it (~78 workflows)
Review 5 dormant agent files and either create workflows or remove them

Short-term (this month):

Add max-continuations: 2-3 to complex analysis workflows (architecture-guardian, security-review, copilot-opt)
Audit 37 workflows missing strict: true and add where appropriate
Review GitHub toolset scoping — replace [default] with minimal required toolsets

Long-term (this quarter):

Evaluate engine.harness for workflows needing custom execution behavior
Consider engine.version pinning for critical production workflows
Explore mcp-scripts for custom read-only API integrations

📚 References

Copilot Engine Documentation
MCP Scripts Documentation
Custom agent files: .github/agents/ (11 files)
Previous research: /tmp/gh-aw/repo-memory/default/copilot-research-notes.md

Research Methodology

This analysis was conducted by:

Scanning all .github/workflows/*.md files using grep for engine declarations (engine: copilot and id: copilot)
Counting feature usage via pattern matching across the combined set of ~116 Copilot workflows
Reviewing pkg/workflow/copilot_engine*.go files for available CLI features
Cross-referencing with docs/src/content/docs/reference/engines.md for documented features
Comparing against 4 previous analysis runs stored in repo-memory

Note on counts: Simple-form count (engine: copilot) = 89; object-form count (id: copilot) = 22 + 5, as in the Scope line; total = ~116. Previous runs also counted default-engine workflows (no engine declaration), bringing totals to ~110-116. Some feature counts (bare, model, mcp-scripts) are best-effort grep-based and may vary ±2.

Generated by Copilot CLI Deep Research Agent (Run: §25243992723)

expires on May 3, 2026, 4:56 AM UTC

2026-05-03T04:59:15Z

github-actions[bot]
Bot May 3, 2026
Author

This discussion has been marked as outdated by Copilot CLI Deep Research Agent.

A newer discussion is available at Discussion #29874.
