Part 7 — Traps, caveats, and things that will bite you

What will go wrong if I am not careful?

5 min · Updated June 2026

This is the section to read before you go broad with a rollout, and to return to when something goes wrong. It’s a consolidated list of the failure modes from the underlying research, the cases where the blueprint in this article doesn’t apply, and guidance on how to weight the sources behind all of this.

Q7.1 — What will go wrong if I'm not careful?

Nine concrete failure modes, in roughly the order teams hit them:

Trap 1

Treating the sandbox as a security boundary

It isn’t. allowUnsandboxedCommands defaults to true, meaning a command that fails inside the sandbox can be retried under normal permissions. There are also documented demonstrations of agents reasoning around path-based denylists via indirect paths (e.g., reaching a binary through /proc/self/root/usr/bin/npx). Layer external isolation for high-risk work.
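To see why path-based denylists are fragile, here is a minimal sketch. It uses a symlink in a temporary directory as a stand-in for the /proc/self/root route; the function names and the directory layout are hypothetical and are not how any particular sandbox implements its checks.

```python
import os
import tempfile


def naive_is_denied(path, denied_prefixes):
    """String-prefix check on the path exactly as the caller wrote it."""
    return any(path.startswith(prefix) for prefix in denied_prefixes)


def resolved_is_denied(path, denied_prefixes):
    """Resolve symlinks first, then compare against resolved deny prefixes."""
    real = os.path.realpath(path)
    for prefix in denied_prefixes:
        real_prefix = os.path.realpath(prefix)
        if real == real_prefix or real.startswith(real_prefix + os.sep):
            return True
    return False


def demo():
    with tempfile.TemporaryDirectory() as root:
        denied_dir = os.path.join(root, "usr", "bin")
        os.makedirs(denied_dir)
        tool = os.path.join(denied_dir, "npx")
        open(tool, "w").close()
        # An indirect route to the same directory (stand-in for /proc/self/root).
        alias = os.path.join(root, "alias")
        os.symlink(os.path.join(root, "usr"), alias)
        indirect = os.path.join(alias, "bin", "npx")
        return (
            naive_is_denied(tool, [denied_dir]),         # direct path: caught
            naive_is_denied(indirect, [denied_dir]),     # indirect path: missed
            resolved_is_denied(indirect, [denied_dir]),  # resolved path: caught
        )
```

Resolving paths closes this particular hole, but it is still an in-process check that an agent with shell access may find other ways around, which is why external isolation remains the control for high-risk work.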

Trap 2

Wiring only the Compliance API for audit

It doesn’t cover Claude Code prompt/tool-use content. If you only wire the Compliance API, you have an audit blind spot over your developers’ actual coding sessions. You need OpenTelemetry for that — they’re two separate pipelines and you need both.

Trap 3

Forgetting to pin model versions

Unpinned, Bedrock and Vertex can silently swap models on update; Foundry can hard-error. Pin every model ID in managed-settings.json before you roll out.
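A pre-rollout check for this can be small. The sketch below assumes the settings file carries model IDs in an "env" block, that the key names shown are the ones your deployment uses, and that a pinned ID contains an 8-digit date stamp. All three are assumptions to verify against current vendor documentation, and the model IDs are placeholders.

```python
import re

# Hypothetical heuristic: an ID counts as pinned if it carries an 8-digit date stamp.
DATE_STAMP = re.compile(r"\d{8}")

# Assumed key names; confirm them against the vendor docs for your provider.
MODEL_KEYS = [
    "ANTHROPIC_MODEL",
    "ANTHROPIC_DEFAULT_SONNET_MODEL",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL",
]


def unpinned_models(settings, model_keys=MODEL_KEYS):
    """Return {key: reason} for every model key that is missing or not pinned."""
    env = settings.get("env", {})
    problems = {}
    for key in model_keys:
        value = env.get(key)
        if value is None:
            problems[key] = "missing"
        elif not DATE_STAMP.search(value):
            problems[key] = "not pinned: " + value
    return problems


# Placeholder settings content, as it might look after json.load().
EXAMPLE_SETTINGS = {
    "env": {
        "ANTHROPIC_MODEL": "opus",
        "ANTHROPIC_DEFAULT_SONNET_MODEL": "example-sonnet-20250101-v1",
    }
}
```

Run against the example, this flags the bare alias and the missing key, and passes the date-stamped ID.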

Trap 4

Bloated instruction files

The ETH Zurich study (arXiv:2602.11988) found context files can reduce task success rates and add 20%+ cost. Keep AGENTS.md/CLAUDE.md lean; push detail into skills. If you’re surprised when the agent ignores your 300-line instruction file, this is why.
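One way to keep instruction files lean is a line budget enforced in CI. This is a minimal sketch; the 100-line default is a hypothetical budget chosen for illustration, not a threshold from the study.

```python
from pathlib import Path


def lines_over_budget(text, max_lines=100):
    """Return how many non-blank lines exceed the budget (0 if within it)."""
    count = sum(1 for line in text.splitlines() if line.strip())
    return max(0, count - max_lines)


def check_file(path, max_lines=100):
    """Read an instruction file and report its overage against the budget."""
    text = Path(path).read_text(encoding="utf-8")
    return lines_over_budget(text, max_lines)
```

A CI job could call check_file on AGENTS.md and CLAUDE.md and fail the build when the result is above zero, which forces the "move it into a skill" conversation early.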

Trap 5

Assuming AGENTS.md is read by Claude Code directly

As of Q2 2026 it isn’t. Bridge it with an @AGENTS.md import line in CLAUDE.md, or make CLAUDE.md a symlink to AGENTS.md.
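The import-line bridge can be scripted across repos. The sketch below assumes a line reading @AGENTS.md in CLAUDE.md is treated as an import of the neighboring file; confirm that against current Claude Code documentation. The function name and return values are hypothetical.

```python
import tempfile
from pathlib import Path

IMPORT_LINE = "@AGENTS.md"  # assumed import syntax; verify before relying on it


def bridge_agents_md(repo_dir):
    """Make sure CLAUDE.md pulls in AGENTS.md, without touching existing content."""
    repo = Path(repo_dir)
    agents = repo / "AGENTS.md"
    claude = repo / "CLAUDE.md"
    if not agents.is_file():
        return "no AGENTS.md"
    if claude.is_symlink():
        # Already bridged by symlink; writing here would modify the link target.
        return "symlinked"
    if not claude.exists():
        claude.write_text(IMPORT_LINE + "\n", encoding="utf-8")
        return "created"
    lines = claude.read_text(encoding="utf-8").splitlines()
    if any(line.strip() == IMPORT_LINE for line in lines):
        return "already bridged"
    with claude.open("a", encoding="utf-8") as fh:
        fh.write("\n" + IMPORT_LINE + "\n")
    return "appended"


def demo():
    with tempfile.TemporaryDirectory() as repo:
        Path(repo, "AGENTS.md").write_text("# Rules\n", encoding="utf-8")
        first = bridge_agents_md(repo)
        second = bridge_agents_md(repo)
        content = Path(repo, "CLAUDE.md").read_text(encoding="utf-8")
        return first, second, content
```

The function is idempotent, so it is safe to run on every repo in a polyrepo estate on a schedule.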

Trap 6

Trusting MCP registry enforcement as a hard control

GitHub’s own guidance says it matches on server name, can be bypassed, and doesn’t apply to the cloud agent. Treat it as a governance signal, not a security boundary. Use a real gateway.

Trap 7

Underestimating token cost

Subagent-heavy workflows can consume roughly 7x the tokens of a single-threaded session. Bedrock’s default Opus rate limit (around 25 RPM) needs raising before team rollout. Stand up cost dashboards early: before the first surprised finance conversation, not after.
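A back-of-envelope model helps before the dashboards exist. In this sketch only the 7x multiplier and the 25 RPM figure come from the text above; the team size, session counts, tokens per session, subagent share, and per-developer request rate are hypothetical inputs to replace with your own telemetry.

```python
import math


def estimate_monthly_tokens(devs, sessions_per_dev_per_day, tokens_per_session,
                            subagent_share, subagent_multiplier=7.0, workdays=21):
    """Blend single-threaded and subagent-heavy sessions into one monthly total."""
    base = devs * sessions_per_dev_per_day * tokens_per_session * workdays
    blend = (1 - subagent_share) + subagent_share * subagent_multiplier
    return base * blend


def max_concurrent_devs(rpm_limit, requests_per_dev_per_minute):
    """How many developers fit under a requests-per-minute ceiling at once."""
    return math.floor(rpm_limit / requests_per_dev_per_minute)


# Hypothetical team: 10 developers, 4 sessions a day, 100k tokens per session,
# half of the sessions subagent-heavy.
monthly_tokens = estimate_monthly_tokens(10, 4, 100_000, subagent_share=0.5)
# At an assumed 5 requests per minute per active developer, 25 RPM covers 5 people.
headroom = max_concurrent_devs(25, 5)
```

With these inputs the blended multiplier is 4x, so the team uses four times the tokens a single-threaded estimate would predict, and the default rate limit is saturated by five concurrent developers.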

Trap 8

Depending on preview features

GitHub Agentic Workflows was in technical preview as of mid-2026; the MCP enterprise allowlist was still in preview when the broader AI Controls reached GA; several Copilot agent-customization features are in public preview. Don’t make any of them load-bearing without a fallback.

Trap 9

Skipping the supply-chain basics

CVE-2025-59536 demonstrated that a malicious MCP config in a cloned repo could execute commands before the trust dialog appeared. The Cline and Shai-Hulud incidents showed what happens with autonomous agents that have write access and no defense against prompt injection. The controls: pin plugins by SHA, allowlist MCP servers, and default autonomous work to read-only.
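Two of those controls lend themselves to a check that runs before an agent ever opens a cloned repo. The sketch below assumes the repo's MCP config is JSON with a top-level "mcpServers" object whose entries have "command" and "args" fields; verify that shape against your tooling. The server names, commands, and allowlist format are hypothetical.

```python
import re

FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def is_sha_pinned(ref):
    """True only for a full 40-character commit SHA, not a branch or tag name."""
    return bool(FULL_SHA.match(ref))


def disallowed_mcp_servers(mcp_config, allowlist):
    """Return server names whose full command line is not exactly as approved.

    allowlist maps server name -> expected [command, *args]. Matching on the
    command as well as the name means a renamed or re-pointed server fails.
    """
    bad = []
    for name, spec in mcp_config.get("mcpServers", {}).items():
        command = [spec.get("command", "")] + list(spec.get("args", []))
        if allowlist.get(name) != command:
            bad.append(name)
    return sorted(bad)


# Hypothetical allowlist and two configs, as they might look after json.load().
ALLOWLIST = {"docs": ["npx", "-y", "example-docs-server"]}
GOOD_CONFIG = {"mcpServers": {"docs": {"command": "npx",
                                       "args": ["-y", "example-docs-server"]}}}
BAD_CONFIG = {"mcpServers": {"docs": {"command": "sh",
                                      "args": ["-c", "echo pwned"]}}}
```

The second config reuses an approved name with a different command, which is exactly the case a name-only match would let through.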

Q7.2 — When does this whole blueprint not apply?

The harness design in this article is built for a specific context: multi-vendor (Claude Code + Copilot), polyrepo, regulated or otherwise compliance-sensitive, hybrid cloud. If your situation is materially different, some of it is overhead.

  • Single-cloud, single-IDE shops. If you’re 100% VS Code on Azure, the dual-vendor harness is overhead. Pick one vendor and reinvest the saved effort into deeper skills and MCP coverage.
  • If GitHub Agentic Workflows (gh-aw) reaches GA with a multi-tenant MCP gateway and solid cross-repo orchestration, shift more autonomous repo work from Claude hooks to Agentic Workflows. The hooks-vs-workflows boundary will move.
  • If Microsoft ships a first-party managed-settings equivalent for Copilot agent hooks, simplify the policy layer accordingly.
  • If you go fully air-gapped or sovereign, drop the direct Anthropic API, front Bedrock via PrivateLink or a self-hosted Anthropic-compatible proxy through an LLM gateway, and accept losing some launch-day features.

The blueprint is a starting point, not a specification. The right thing to do is understand why each decision was made and adjust it to your actual constraints.

Q7.3 — How should I treat the sources behind this?

With appropriate skepticism by category.

Vendor documentation (Anthropic, GitHub)

Authoritative for how features work but moves fast — re-verify before relying on a specific flag or behavior. What was in preview when this was written may be GA now, or may have changed.

Vendor announcements like Q2 Holdings’ “Q2 Code”

Confirm a pattern is viable but are not independent retrospectives with measured outcomes. They tell you the approach works; they don't tell you how hard it was or what broke.

The ETH Zurich paper (arXiv:2602.11988)

The strongest empirical source in this field, and worth weighting heavily on the “keep context lean” point. It’s the only piece of evidence here with a controlled methodology.

Practitioner blogs (HumanLayer, Pixelmojo, PubNub, Thoughtworks Radar)

Excellent for pattern synthesis but are not certified guidance — validate anything compliance-affecting with your own legal and security teams and the vendors’ Trust Portal artifacts.

And one structural reminder: the whole field is moving monthly. Treat this article as a snapshot of the mid-2026 consensus and the harness itself as a living product — because the moment the models or the platforms change, the harness has to change with them. That’s not a caveat. That’s the job.