Part 2 — Claude Code, in depth
How do I actually configure and control Claude Code in a real org?
12 min · Updated June 2026
You understand what the harness is. Now you need to build the Claude Code side of it. This section walks through every major configuration surface — from the simplest instruction file to enterprise policy enforcement and audit pipelines.
Every Claude Code template from this page in one download. Mid-2026 snapshot — copy and edit, don't run as-is.
Q2.1 — How should I structure CLAUDE.md, and how big should it be?
CLAUDE.md is Claude Code’s instruction memory. It’s hierarchical (Claude reads files from the enterprise/global level down to the current directory), and it supports @import syntax to pull in other files.
Keep it lean: roughly 200–300 lines, ideally under 200. This isn’t a style preference; frontier models reliably follow on the order of 150–200 instructions, and Claude Code’s own system prompt already consumes about 50 of those. Every line you add competes for that budget.
What belongs in it:
- A one-paragraph project overview
- Tech stack and versions
- Build, test, and run commands (the things Claude can't guess)
- Conventions Claude wouldn't infer from the code
- Project structure / where things live
What does not belong: anything your linter or formatter already enforces, and anything situational (that goes in a skill).
# Acme Payments Service
Backend service handling card authorization and settlement.
Part of the Acme polyrepo platform — see @AGENTS.md for org-wide conventions.
## Stack
- Go 1.23, PostgreSQL 16, gRPC, deployed to EKS
## Commands
- Build: `make build`
- Test: `make test` (must pass before any commit)
- Lint: `make lint` (CI-enforced; do not restate rules here)
- Migrate: `make db-migrate`
## Conventions Claude can't infer
- All money is int64 minor units. Never use floats for currency.
- Every external call goes through `internal/clients/` — never inline an HTTP client.
- New endpoints require an entry in `docs/api-changelog.md`.
## Where things live
- Business logic: `internal/domain/`
- HTTP/gRPC handlers: `internal/transport/`
- DB access: `internal/store/`
Note the @AGENTS.md import: this is how you make CLAUDE.md a thin Claude-specific layer over a shared, vendor-neutral file (covered in Part 5).
Golden template with the ≤200-line budget annotated. Every placeholder labeled — replace them; don't add more lines than you remove.
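The line budget is easy to check mechanically. Below is a minimal sketch of a CI-style check, assuming that only non-blank lines count toward the budget and that @import targets are not followed; both choices, and the function names, are illustrative rather than anything Claude Code defines.

```python
from pathlib import Path

LINE_BUDGET = 200  # the "ideally under 200" target from this section


def count_instruction_lines(text: str) -> int:
    """Count non-blank lines (assumption: blank lines do not use budget)."""
    return sum(1 for line in text.splitlines() if line.strip())


def check_budget(text: str, budget: int = LINE_BUDGET) -> tuple[bool, int]:
    """Return (within_budget, line_count) for a CLAUDE.md body."""
    count = count_instruction_lines(text)
    return count <= budget, count


def check_file(path: str = "CLAUDE.md") -> tuple[bool, int]:
    """Convenience wrapper for CI; it does not follow @import lines."""
    return check_budget(Path(path).read_text(encoding="utf-8"))
```

Failing the build when `check_file()` returns `False` keeps the file from growing quietly past the budget.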
Q2.2 — What are subagents and how do I build one?
Subagents are specialized personas Claude Code can delegate to. Each runs with its own scoped tools, its own model, and its own system prompt — and crucially, its own context window. They live in .claude/agents/*.md as Markdown files with YAML frontmatter.
The 2026 shift is that subagents got a real runtime. Two patterns matter:
- Pipelines: instead of one mega-prompt, you chain agents, for example pm-spec → architect-review → implementer-tester. Each stage has a narrow job.
- Context forking: a noisy agent (say, one running a flaky test loop) works in a forked context and returns only a clean summary to the orchestrator. The orchestrator never sees the 4,000 lines of test output.
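To make context forking concrete, here is a toy model of the idea, not Claude Code's runtime: a context is just a list of text entries, a stage runs against a copy, and only its summary is appended to the parent. `Context`, `delegate`, and `run_flaky_tests` are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Stand-in for a context window: a plain list of text entries."""
    entries: list[str] = field(default_factory=list)

    def fork(self) -> "Context":
        # The fork starts from a copy, so later writes never reach the parent.
        return Context(entries=list(self.entries))


def run_flaky_tests(ctx: Context) -> str:
    """Hypothetical noisy stage: fills its own context, returns a summary."""
    for i in range(4000):
        ctx.entries.append(f"test output line {i}")
    return "summary: test loop finished after 4000 lines of output"


def delegate(parent: Context, stage) -> None:
    """Run a stage in a forked context; keep only its summary in the parent."""
    child = parent.fork()
    summary = stage(child)
    parent.entries.append(summary)
```

After `delegate(orchestrator, run_flaky_tests)`, the orchestrator's context has grown by one entry, while the 4,000 noisy lines stay in the discarded child.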
---
name: security-reviewer
description: Reviews diffs for security issues — injection, authz gaps,
  secret exposure, unsafe deserialization. Invoke after any change to
  auth, payment, or data-access code.
tools: [Read, Grep, Glob]
disallowedTools: [Bash, Edit, Write]
model: sonnet
skills: [secure-coding-checklist, pii-handling]
---
You are a security reviewer. You never modify code — you only read and report.
For each finding, output: severity, file:line, the risk, and a concrete fix.
Pull the OWASP-mapped checklist from the `secure-coding-checklist` skill.
If you find a hard-coded secret, mark it CRITICAL and stop.
Things worth knowing: tools is an allowlist and disallowedTools a denylist; model can be haiku/sonnet/opus/inherit; skills preloads skills; effort tunes how hard it works. You can force one model across all subagents, for cost or compliance ceilings, with the CLAUDE_CODE_SUBAGENT_MODEL environment variable. Be aware: subagent-heavy workflows can consume around 7x the tokens of a single-threaded session.
Five-stage starter pack. The guide shows one; these are the full set — security reviewer, code reviewer, architect, implementer-tester, docs writer. Each has a narrow job and minimal tools.
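A small lint can catch the most common frontmatter mistakes before a subagent ships. The sketch below assumes the frontmatter has already been parsed into a dict by a YAML parser of your choice, and that `name` and `description` are required; `lint_subagent` is a hypothetical helper, not part of Claude Code.

```python
ALLOWED_MODELS = {"haiku", "sonnet", "opus", "inherit"}


def lint_subagent(frontmatter: dict) -> list[str]:
    """Return human-readable problems found in one subagent definition."""
    problems = []
    # Assumption: name and description are the required fields.
    for key in ("name", "description"):
        if not frontmatter.get(key):
            problems.append(f"missing required field: {key}")
    # A tool on both the allowlist and the denylist is almost surely a mistake.
    allowed = set(frontmatter.get("tools", []))
    denied = set(frontmatter.get("disallowedTools", []))
    overlap = sorted(allowed & denied)
    if overlap:
        problems.append(f"tools both allowed and disallowed: {overlap}")
    model = frontmatter.get("model", "inherit")
    if model not in ALLOWED_MODELS:
        problems.append(f"unknown model: {model}")
    return problems
```

An empty list means the definition passed; anything else is worth failing a pre-merge check on.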
Q2.3 — Skills vs CLAUDE.md vs subagents — when do I use which?
This trips everyone up. The clean distinction:
- CLAUDE.md = always-on facts about this project. Loaded every session. Keep it small.
- Skill = a task playbook loaded on demand. “How we do database migrations,” “our PII handling rules,” “the incident runbook.” Discovered via its description, activated when relevant.
- Subagent = a worker with a role. It’s an actor, not a document. A subagent often uses skills.
---
name: regulatory-logging
description: Rules for audit logging in payment flows. Use whenever code
  creates, modifies, or deletes a financial transaction, or touches the
  audit_events table.
---
# Regulatory Logging
Every state change to a transaction MUST emit an audit event.
## Required fields
- actor_id, actor_type (human|service|agent)
- before_state, after_state (redacted per references/redaction-rules.md)
- correlation_id, timestamp (UTC, RFC3339)
## How to emit
Call `audit.Emit(ctx, event)` — never write to `audit_events` directly.
## Retention
7 years. Never write code that deletes from `audit_events`.
The description is doing real work: it’s how the agent decides whether to load the skill, so write it with explicit trigger conditions (“use whenever...”). Skills follow an open standard (agentskills.io) adopted by 26+ platforms, which is exactly why the same SKILL.md works on Copilot too.
Regulatory-logging skill template showing the three-tier progressive-disclosure structure — SKILL.md trigger, references/, scripts/, and assets/ for bulky material that costs zero tokens until opened.
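Because the description drives activation, it can be worth linting for an explicit trigger phrase. The following is a rough heuristic sketch; the phrase list is an assumption made for illustration, not something the skills standard specifies.

```python
import re

# Assumption: a trigger condition reads like "use when...", "invoke after...".
TRIGGER_PATTERN = re.compile(
    r"\b(use|invoke|apply)\s+(when|whenever|after|before|for)\b",
    re.IGNORECASE,
)


def has_trigger(description: str) -> bool:
    """True if the description states when the skill should be loaded."""
    return bool(TRIGGER_PATTERN.search(description))
```

The regulatory-logging description above passes because of its "Use whenever" clause; a bare "Audit logging rules." would be flagged.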
Q2.4 — What are plugins and marketplaces, and do I need my own?
A plugin bundles skills, slash commands, subagents, hooks, and MCP server definitions into one installable unit, described by .claude-plugin/plugin.json. A marketplace is a catalog of plugins, described by .claude-plugin/marketplace.json, that you can host on GitHub, GitLab, npm, or locally.
For an enterprise: yes, you want a private marketplace. It becomes your internal app store for AI capabilities. The pattern is a single platform repo holding a marketplace manifest that references your internal plugins. For plugins under active development you can omit version to track the latest commit; for stability, pin to a commit SHA, which is exactly what Anthropic’s official marketplace does after automated validation.
You lock developers to your marketplace through managed settings (next question).
Q2.5 — How do hooks work, and how do they enforce reviews and standards?
Hooks are the control layer: the difference between rules that are advisory and rules that are enforced. Without hooks, CLAUDE.md rules are suggestions; with hooks, they’re gates.
A hook is a handler that fires on a lifecycle event. By mid-2026 Claude Code exposed 21 events and 4 handler types:
- Events include PreToolUse, PostToolUse, SessionStart, SessionEnd, UserPromptSubmit, Stop, SubagentStart, SubagentStop, PreCompact, PermissionRequest, and more.
- Handler types are command (run a shell script), http (POST to a remote validator or audit endpoint), prompt (a single-turn LLM evaluation), and agent (spawn a subagent for deep verification).
The key behavioral fact: PreToolUse is the only event that can block — if its handler exits with code 2, the tool call is denied. SessionStart, UserPromptSubmit, and PreCompact can inject context.
A practical configuration doing three jobs — blocking edits to protected paths, linting after every edit, and shipping an audit trail:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command",
            "command": ".acme/hooks/protect-paths.sh",
            "description": "Exit 2 if the target path is in /infra or /compliance" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "make lint typecheck" }
        ]
      }
    ],
    "SessionStart": [
      { "hooks": [
          { "type": "http",
            "url": "https://audit.acme.internal/claude/session",
            "headers": { "Authorization": "Bearer ${ACME_AUDIT_TOKEN}" } }
        ]}
    ]
  }
}
This is also how you wire mandatory review: a PreToolUse agent hook can spawn your security-reviewer subagent before a sensitive edit is allowed. Hooks apply recursively into subagents via SubagentStart/SubagentStop. Well-tuned hooks run in well under 200ms each.
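The path-protection hook is small enough to sketch. This version is written in Python rather than shell, and it assumes the hook receives a JSON payload on stdin with a `cwd` field and the target file under `tool_input.file_path`; treat that payload shape, and the reading of /infra and /compliance as top-level project directories, as assumptions to verify against your Claude Code version.

```python
import json
import posixpath
import sys

PROTECTED_DIRS = ("infra", "compliance")  # top-level directories to guard


def is_protected(file_path: str, project_root: str) -> bool:
    """True if file_path resolves to somewhere under a protected directory."""
    root = posixpath.normpath(project_root)
    # normpath collapses "." and ".." segments; it does not resolve symlinks.
    full = posixpath.normpath(posixpath.join(root, file_path))
    first_segment = posixpath.relpath(full, root).split("/", 1)[0]
    return first_segment in PROTECTED_DIRS


def main(stdin=None, stderr=None) -> int:
    """Return the hook's exit code: 2 denies the tool call, 0 allows it."""
    stdin = stdin or sys.stdin
    stderr = stderr or sys.stderr
    payload = json.load(stdin)
    # Assumed payload fields: "cwd" and "tool_input.file_path".
    target = payload.get("tool_input", {}).get("file_path", "")
    root = payload.get("cwd", ".")
    if target and is_protected(target, root):
        print(f"Blocked: {target} is in a protected path", file=stderr)
        return 2
    return 0

# When installed as the hook command, the script would end with:
#     sys.exit(main())
```

Returning the exit code from `main` instead of exiting inline keeps the decision logic testable without spawning a process.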
Q2.6 — How does managed-settings.json give me real enterprise control?
managed-settings.json is the policy plane. Unlike user or project settings, it cannot be overridden by the developer. You distribute it through MDM: a macOS plist (com.anthropic.claudecode), Windows registry (HKLM\SOFTWARE\Policies\ClaudeCode), or /etc/claude-code/managed-settings.json on Linux. There’s also a managed-settings.d/ drop-in directory so different teams (Security, Platform, FinOps) can own separate policy fragments.
{
  "permissions": {
    "deny": [
      "Read(**/.env)", "Read(**/secrets/**)",
      "Bash(sudo:*)", "Bash(curl:*)"
    ]
  },
  "disableBypassPermissionsMode": "disable",
  "allowManagedMcpServersOnly": true,
  "allowManagedPermissionRulesOnly": true,
  "allowedMcpServers": ["acme-code-intel", "jira", "datadog"],
  "strictKnownMarketplaces": [
    { "hostPattern": "^github\\.acme\\.com$" }
  ],
  "env": {
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "<pinned-bedrock-arn>",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "<pinned-bedrock-arn>"
  }
}
One subtle but critical merge rule: for ordinary scalar fields it’s first-source-wins, but array fields (permissions.allow[], hooks, enabledMcpjsonServers) are concatenated and de-duplicated across all layers. That’s what makes layered governance work: Security’s deny rules and a team’s allow rules combine rather than overwrite each other.
This deployed file is, in practice, the artifact a SOC 2 or ISO 27001 auditor wants to see for AI access-control evidence.
Policy template with the deny-list, model-pinning, and marketplace-lock fields. The array-merge rule for allow/deny/hooks arrays is documented inline — that's what makes layered governance compose correctly.
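The merge rule is easier to reason about with a toy model. The sketch below implements "scalars: first source wins; arrays: concatenate and de-duplicate" over plain dicts, with layers passed in precedence order. Recursing into nested objects such as `permissions` is an assumption of this sketch, and `merge_settings` is an illustrative name, not Claude Code's implementation.

```python
import copy


def merge_settings(layers: list[dict]) -> dict:
    """Merge settings layers, highest-precedence layer first."""
    merged: dict = {}
    for layer in layers:
        for key, value in layer.items():
            if key not in merged:
                # First source to define a key sets it (copy to avoid aliasing).
                merged[key] = copy.deepcopy(value)
            elif isinstance(value, list) and isinstance(merged[key], list):
                # Arrays: concatenate across layers, skipping duplicates.
                for item in value:
                    if item not in merged[key]:
                        merged[key].append(item)
            elif isinstance(value, dict) and isinstance(merged[key], dict):
                # Assumption: nested objects merge recursively by the same rule.
                merged[key] = merge_settings([merged[key], value])
            # Otherwise a scalar is already set: first source wins.
    return merged
```

Merging a Security layer that denies `Bash(sudo:*)` with a team layer that allows a hypothetical `Bash(make:*)` yields both rules, while a scalar such as `allowManagedMcpServersOnly` keeps the Security layer's value.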
Q2.7 — How does sandboxing work, and can I trust it?
Claude Code’s sandbox is OS-level: macOS Seatbelt and Linux bubblewrap (WSL2 works; native Windows doesn’t). Filesystem writes are blocked at the syscall level, and network traffic is funneled through a localhost proxy with a domain allowlist. In Anthropic’s own usage it cut permission prompts by roughly 84%.
But for a regulated org, treat it as defense-in-depth, not a hard boundary. Two reasons. First, allowUnsandboxedCommands defaults to true, meaning a command that fails inside the sandbox can be retried under normal permissions. Second, there are documented demonstrations of agents reasoning around path-based denylists via indirect paths.
So: enable /sandbox for everyday use, set sandbox.failIfUnavailable: true if you want hard-gating, and for genuinely high-risk operations layer external isolation — a Docker microVM, gVisor, or kernel-level controls. Never use --dangerously-skip-permissions outside a disposable container.
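The weakness of path-based denylists is easy to demonstrate. The sketch below compares a naive string-prefix check with one that collapses `.` and `..` segments first; the `/repo/secrets/` entry is a hypothetical example, and neither function reflects how Claude Code matches its own rules.

```python
import posixpath

DENIED_PREFIXES = ("/repo/secrets/",)  # hypothetical denylist entry


def naive_is_denied(path: str) -> bool:
    """String-prefix check: sidestepped by any indirect spelling of the path."""
    return path.startswith(DENIED_PREFIXES)


def normalized_is_denied(path: str) -> bool:
    """Collapse '.' and '..' segments before comparing against the denylist."""
    clean = posixpath.normpath(path)
    # The trailing slash makes the directory itself match its own prefix.
    return (clean + "/").startswith(DENIED_PREFIXES)
```

`/repo/src/../secrets/key.pem` slips past the naive check but not the normalized one. Normalization still does not resolve symlinks, which is one more reason to treat these rules as one layer among several.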
Q2.8 — We're hybrid cloud. How do I deploy Claude Code across AWS, GCP, and Azure?
Claude Code can target multiple inference backends via environment variables:
- AWS: CLAUDE_CODE_USE_BEDROCK=1 plus AWS_REGION
- GCP: CLAUDE_CODE_USE_VERTEX=1 plus ANTHROPIC_VERTEX_PROJECT_ID
- Azure: Microsoft Foundry
Three rules for a regulated hybrid setup:
1. Always pin model IDs. Use ANTHROPIC_DEFAULT_OPUS_MODEL, ANTHROPIC_DEFAULT_SONNET_MODEL, ANTHROPIC_DEFAULT_HAIKU_MODEL set to specific Bedrock inference-profile ARNs / Vertex version names / Foundry deployment names. Unpinned, Bedrock and Vertex may silently fall back on updates; Foundry may hard-error.
2. Use geo-restricted profiles for data residency. Bedrock's geo-restricted CRIS profiles give you EU/US/APAC boundaries.
3. For on-prem or air-gapped, route through an internal LLM gateway (e.g., LiteLLM) or a self-hosted Anthropic-compatible proxy inside your VPC. A well-tuned gateway adds only single-digit milliseconds of latency, but expect to lose some launch-day features in fully sovereign setups.
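Rule 1 can be enforced when the environment is generated. Here is a minimal sketch of a helper that builds the backend variables named above and refuses to proceed if any model pin is missing. `backend_env` and its parameters are hypothetical, and Azure is left out because this section does not name its variables.

```python
PIN_VARS = (
    "ANTHROPIC_DEFAULT_OPUS_MODEL",
    "ANTHROPIC_DEFAULT_SONNET_MODEL",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL",
)


def backend_env(provider: str, region_or_project: str,
                pins: dict[str, str]) -> dict[str, str]:
    """Build environment variables for one backend, refusing unpinned models."""
    missing = [var for var in PIN_VARS if not pins.get(var)]
    if missing:
        raise ValueError(f"unpinned model variables: {missing}")
    if provider == "bedrock":
        env = {"CLAUDE_CODE_USE_BEDROCK": "1", "AWS_REGION": region_or_project}
    elif provider == "vertex":
        env = {"CLAUDE_CODE_USE_VERTEX": "1",
               "ANTHROPIC_VERTEX_PROJECT_ID": region_or_project}
    else:
        raise ValueError(f"unsupported provider in this sketch: {provider}")
    # Copy only the three pin variables so stray keys never leak into the env.
    env.update({var: pins[var] for var in PIN_VARS})
    return env
```

The resulting dict could be written into the `env` block of managed-settings.json so developers cannot drift off the pinned models.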
A real-world reference: Q2 Holdings’ “Q2 Code” (April 2026) delivered Claude Code via Amazon Bedrock as a governed development environment for credit unions. It’s a vendor announcement rather than an independent retrospective, but it confirms the Bedrock-fronted pattern is viable under financial-services controls.
Q2.9 — How do I audit what Claude Code did?
There are two separate pipelines, and you need both.
1. OpenTelemetry — this is your Claude Code audit source. Set CLAUDE_CODE_ENABLE_TELEMETRY=1, OTEL_METRICS_EXPORTER=otlp, OTEL_LOGS_EXPORTER=otlp, and point OTEL_EXPORTER_OTLP_ENDPOINT at a collector that forwards to your SIEM. You get metrics covering token usage, cost, pull request counts, and code edit decisions — plus log events for user prompts, tool decisions, MCP server connections, plugin loads, skill activations, hook executions, and permission mode changes. A 2026 beta adds distributed-tracing spans via CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1.
2. The Compliance API — covers the Claude Platform activity feed (identity, configuration, chat activity) via GET /v1/compliance/activities, with 180-day retention on Anthropic’s side. Schedule a daily pull into your own warehouse for long-term retention.
The critical gotcha: the Compliance API does not cover Claude Code prompt/tool-use content. If you only wire the Compliance API, you have an audit blind spot over your developers’ actual coding sessions. OpenTelemetry is the answer for that. Start your OTel pipeline before you roll out broadly — you want baseline data before you need it for an incident.
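A preflight check helps confirm the OpenTelemetry pipeline is actually wired before rollout. This sketch validates only the variables named in this section; requiring the exporters to equal exactly `otlp` is a simplifying assumption, since other exporter values may be legitimate in your setup, and `telemetry_gaps` is a hypothetical helper.

```python
REQUIRED = {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_LOGS_EXPORTER": "otlp",
}


def telemetry_gaps(env: dict[str, str]) -> list[str]:
    """List what is missing from an environment for the audit pipeline."""
    gaps = [
        f"{name} should be {value!r}"
        for name, value in REQUIRED.items()
        if env.get(name) != value
    ]
    if not env.get("OTEL_EXPORTER_OTLP_ENDPOINT"):
        gaps.append("OTEL_EXPORTER_OTLP_ENDPOINT is not set")
    return gaps
```

Running `telemetry_gaps(dict(os.environ))` (after `import os`) in a SessionStart hook or a CI job gives an early warning that sessions are running without an audit trail.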