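# Hands and Tools: Tool Registration, Scheduling, and the Execution Runtime

Tools are the AI's only interface to the real world. The 40 built-in tool directories, together with the registration, permission-check, and concurrent-scheduling machinery around them, form Claude Code's "system call table". This chapter walks through the full tool lifecycle, from the type definitions in `Tool.ts` to the streaming parallel execution engine.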

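> **📏 How tools are counted**
>
> This book consistently uses **"40 built-in tool directories"** as the authoritative tool count. Here is where that number comes from:
>
> - **43 = raw directory entries**: `ls src/tools/` returns 43 entries, but 3 of them are **not tools**: `shared/` (shared code), `testing/` (test infrastructure), and `utils.ts` (utility functions). Counting them as tools is a classification error.
> - **40 = actual tool directories**: 43 minus the 3 non-tool entries leaves 40 real tool directories (AgentTool, BashTool, FileReadTool, GlobTool, and so on).
> - **24 = top-level requires in tools.ts**: the top-level scope of `tools.ts` explicitly requires only 24 tools. The rest are brought in through feature gates (for example `SuggestBackgroundPRTool`, `MonitorTool`) or dynamic assembly (MCP tools, ToolSearch deferred loading), so the number of tools actually loaded at runtime **varies with configuration**.
>
> **Why 40 rather than 43 or 24?** 40 is a **stable, verifiable figure** (the `ls` count minus the three non-tool entries). 43 is inflated by those non-tool entries, and 24 reflects only static requires, so it undercounts feature-gated tools. **40 is therefore the number that best represents how many built-in tools Claude Code ships as a product.**
>
> The runtime count still fluctuates with feature gates and MCP servers. In a given session Claude might see only 30 tools (because MonitorTool/WorkflowTool/CronTool are not enabled) or 50+ (because MCP servers are connected). As a **product description**, though, "40 built-in tool directories" is the safest statement.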

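> **Source locations**: `src/tools/` (implementations of the 40 built-in tool directories), `src/Tool.ts` (the base tool interface), `src/tools.ts` (the tool registry and entry point of the assembly chain), `src/services/tools/` (execution core: `toolOrchestration.ts` scheduling shell, `toolExecution.ts` execution core, `StreamingToolExecutor.ts` streaming scheduler, `toolHooks.ts` hooks policy layer), `src/utils/toolResultStorage.ts` (persistence of large results)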

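> **🌍 Industry background**: Tool calling (Tool Use / Function Calling) is a **standard paradigm** in the AI agent field, not a Claude Code invention. OpenAI launched Function Calling in June 2023, Google Gemini has a Tool Use API, and frameworks such as LangChain and LlamaIndex ship their own tool-calling abstractions. The real differentiation is not the idea that "an AI can call tools" but the engineering depth Claude Code reaches along three dimensions: **(1)** the fine-grained classification and behavior design of the 40 built-in tool directories (the Edit/Write split, concurrency-safety declarations); **(2)** a ten-step permission check chain whose granularity goes well beyond comparable products (compare Cursor's binary allow/deny); **(3)** the MCP protocol, which lets the tool set grow dynamically instead of being fixed at compile time. This chapter focuses on the design logic behind these three engineering decisions.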

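> 🌍 Community view | @wquguru: "Sharing the same nouns for tools does not mean sharing the same system skeleton."

This sentence pinpoints the key to understanding Claude Code's tool system. LangChain has a "Tool", OpenAI has a "Function", Cursor has an "Action". The nouns look alike, but the registration mechanism, permission model, concurrent scheduling, and lifecycle management behind them are entirely different. Equating Claude Code's `Tool<Input, Output>` with the tool concept of other frameworks is like equating an operating system's system call with an ordinary function call: similar on the surface, very different in structure.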

---

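## Prologue: system calls, the only way a user-space program reaches hardware

In an operating system, a user-space program cannot read or write the disk directly, cannot send network packets directly, and cannot drive the GPU directly. It has to issue a **system call** (syscall) to the kernel: "please read the file `/etc/hosts` for me." The kernel verifies permissions, performs the operation, and returns the result to the program.

The AI in Claude Code works the same way. Claude (the model) cannot read or write your files directly. It runs on a remote server and has no access to your machine. It can do exactly one thing: declare in its reply "I want to call this tool". The **local system** then executes the tool and puts the result back into the conversation history.

Tools are the AI's system calls.

> **🔑 OS analogy:** Tool = service counter. The AI is a citizen running errands, queryLoop is the government service hall, and the tools are the individual counters (a read-file counter, a write-file counter, a run-command counter). `Tool.ts` is the directory of counters in that hall.
>
> 💡 **In plain terms**: tools are like an **employee's skill certificates**: a file-reading certificate (Read), a code-writing certificate (Edit), a search certificate (Grep), a command-execution certificate (Bash). Whatever Claude wants to do, it first has to "show the certificate" and then pass an assessment (the permission check) before it can act. Without a certificate for something, Claude simply cannot do it.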

---

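## 1. The tool's "household register": the Tool interface

Every tool must implement the generic `Tool<Input, Output>` interface (`Tool.ts`). It is the **most depended-upon abstraction** in Claude Code: the 40 built-in tool directories, dynamic MCP tools, and Plugin tools all implement it.

The key fields of the interface: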

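```typescript
interface Tool<Input, Output> {
  // ── Identity ──
  name: string                    // tool name (the model calls the tool by this name)
  aliases?: string[]              // aliases (kept for compatibility after renames)

  // ── Capability description ──
  description(input, opts): Promise<string>  // dynamic description shown to the model
  inputSchema: z.ZodType          // Zod schema for the input parameters

  // ── Execution ──
  // call() returns a Promise (Tool.ts line 379). The AsyncGenerator interface
  // belongs to the runToolUse() executor layer, not to the tool itself.
  call(args, ctx, canUseTool, parent, onProgress): Promise<ToolResult<Output>>

  // ── Permissions ──
  checkPermissions(input, ctx): Promise<PermissionResult>
  isReadOnly(input): boolean      // read-only tools do not need write permission

  // ── Concurrency ──
  isConcurrencySafe(input): boolean  // whether the tool may run in parallel

  // ── Result handling ──
  // Tool.ts has no renderResultForAssistant field. Converting a
  // ToolResult<Output> into a ToolResultBlockParam is done by the standalone
  // function mapToolResultToToolResultBlockParam() in Tool.ts (line 557),
  // not by a method on the Tool object.
  maxResultSizeChars: number      // upper bound on result size

  // ── UI ──
  renderToolUseMessage(input, output): ReactNode  // how the call is shown in the terminal

  // ── Metadata ──
  shouldDefer?: boolean           // deferred loading: schema fetched on demand (Tool.ts line 442; the field is not named isDeferred)
  group?: string                  // tool group
}
```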

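> **⚠ Corrections to this section** (from the 2026-04-22 evening review):
> - `call()` returns `Promise<ToolResult<Output>>`, not `AsyncGenerator` (verified at line 379). `AsyncGenerator` is the interface of the `runToolUse()` executor layer, not of the tool itself.
> - `renderResultForAssistant` **does not exist** in the source. The actual API is the standalone function `mapToolResultToToolResultBlockParam()` (line 557). Earlier mentions of the old name in this chapter (manuscript lines 235/259/403/702) should be read as referring to it.
> - The source field for deferred loading is `shouldDefer` (line 442), not `isDeferred`.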

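**The full interface has 20+ methods and properties.** It is not a simple "function interface" but a complete **tool lifecycle contract**. Everything from "tell the AI what I can do", through "check whether I am allowed to do it", to "how to present the result once done" is defined in the interface.

> 📚 **Course link**: the `inputSchema` (a Zod schema) of `Tool<Input, Output>` is essentially an **interface description language (IDL)**. It plays the same role as gRPC's `.proto` files or the JSON Schema in OpenAPI: it tells the caller what shape the arguments must have. The difference is that a `.proto` file is written for human developers, while this schema is written for an AI model. The design follows industry practice. OpenAI's Function Calling also describes parameters with JSON Schema; Claude Code swaps the validation layer for Zod (stronger typing, friendlier error messages).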

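### Why is description dynamic?

`description()` is not a static string but an **async function**, because the same tool may need a different description in a different context. For example:
- The `BashTool` description includes information about the current operating system (commands differ between macOS and Linux)
- In sandbox mode, tool descriptions carry an extra note about the security restrictions
- Some tools describe different capabilities under different permission modes

Every tool description consumes part of the system prompt's token budget. That is why `ToolSearchTool` (a meta-tool for searching tools) exists: not every description has to live in the system prompt, and the AI can search first and call afterwards. A dynamic description is a reasonable engineering choice in its own right (OpenAI's function calling also lets callers change tool descriptions between requests). What stands out in Claude Code is that **token budget management is done at the level of individual tools**: the combination of deferred loading and ToolSearch is rare among comparable products.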
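
The idea of a context-dependent description can be sketched as follows. This is a minimal illustration, not the code in `BashTool`: the `DescriptionContextSketch` shape, the function name, and the wording of the generated text are all assumptions, and the real `description()` is async and receives richer inputs.

```typescript
// Hypothetical context shape; the real description() receives richer inputs.
interface DescriptionContextSketch {
  platform: "darwin" | "linux" | "win32";
  sandboxed: boolean;
}

// Pure helper: the same tool yields different text in different contexts.
function buildBashDescriptionSketch(ctx: DescriptionContextSketch): string {
  const osLine =
    ctx.platform === "darwin" ? "Host OS: macOS."
    : ctx.platform === "linux" ? "Host OS: Linux."
    : "Host OS: Windows.";
  const parts: string[] = ["Executes a shell command and returns its output.", osLine];
  if (ctx.sandboxed) {
    // Extra restriction note, only present in sandbox mode.
    parts.push("Sandbox is active: some commands are restricted.");
  }
  return parts.join("\n");
}

const macSandboxText = buildBashDescriptionSketch({ platform: "darwin", sandboxed: true });
const linuxPlainText = buildBashDescriptionSketch({ platform: "linux", sandboxed: false });
```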

---

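## 2. Four broad tool categories (representative, not exhaustive)

The 40 built-in tool directories can be split **coarsely** into four functional categories. Each table below lists only the **representative tools** of its category: 6 for file operations, 2 for the execution engine, about 10 for the Agent family, and a dozen or so for extensions and helpers. That adds up to roughly 30 typical entry points, not an exhaustive list of all 40 directories. The remaining tools (such as `ConfigTool`, `LSPTool`, `McpAuthTool`) are triggered by feature gates or specific scenarios and rarely appear in the main flow, so this chapter does not cover them one by one. The full list is in the complete index of Part 3, "Tool Registry".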

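### 2.1 File operations (the AI's "hands")

| Tool | Function | Read-only? | Concurrency-safe? |
|------|------|--------|-----------|
| `Read` | Read file contents | ✅ | ✅ |
| `Edit` | Diff-style replacement edit | ❌ | ❌ |
| `Write` | Full overwrite | ❌ | ❌ |
| `Glob` | Search by file name pattern | ✅ | ✅ |
| `Grep` | Search by content | ✅ | ✅ |
| `NotebookEdit` | Edit Jupyter notebooks | ❌ | ❌ |

**Design rationale**: the split between `Edit` and `Write` is deliberate. `Edit` changes only part of a file (a diff-style replacement), so a mistake has a small blast radius. `Write` overwrites the whole file and is meant for creating new files or complete rewrites. The system prompt explicitly tells the AI to prefer `Edit`.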
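
To make the difference in blast radius concrete, here is a small sketch of the two editing styles over an in-memory string. The function names are hypothetical, and rejecting an `old_string` that matches zero or several locations is an assumption of this sketch rather than a documented rule of the real `Edit` tool.

```typescript
// Outcome of a simulated edit; the field names are illustrative only.
type EditOutcomeSketch =
  | { ok: true; content: string }
  | { ok: false; reason: string };

// Count non-overlapping occurrences of `needle` in `haystack`.
function countOccurrencesSketch(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  return haystack.split(needle).length - 1;
}

// Edit-style change: replace exactly one occurrence of oldString.
// Refusing zero or multiple matches is an assumption of this sketch.
function applyEditSketch(content: string, oldString: string, newString: string): EditOutcomeSketch {
  const hits = countOccurrencesSketch(content, oldString);
  if (hits === 0) return { ok: false, reason: "old_string not found" };
  if (hits > 1) return { ok: false, reason: "old_string matches " + hits + " locations" };
  const at = content.indexOf(oldString);
  return {
    ok: true,
    content: content.slice(0, at) + newString + content.slice(at + oldString.length),
  };
}

// Write-style change: the whole file is replaced; nothing of the old content survives.
function applyWriteSketch(_previous: string, next: string): string {
  return next;
}

const originalFile = "const port = 3000;\nconst host = 'localhost';\n";
const edited = applyEditSketch(originalFile, "3000", "8080");
const ambiguous = applyEditSketch(originalFile, "const", "let");
const rewritten = applyWriteSketch(originalFile, "// empty\n");
```

In this sketch a bad `Edit` fails loudly and leaves the file alone, while a bad `Write` silently discards everything that was there before.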

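### 2.2 Execution engine (the AI's "errand runner")

| Tool | Function | Sandbox? |
|------|------|--------|
| `Bash` | Run shell commands | Per configuration |
| `PowerShell` | Windows commands | Per configuration |

`Bash` is the **most frequently used** tool. It is also the one with the highest security risk: a single `rm -rf /` is enough to cause a disaster. The Bash tool therefore has the most complex permission-check logic, the strictest sandbox restrictions, and the finest-grained command pattern matching (`Bash(git *)` allows every git command and nothing else).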
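
A rule such as `Bash(git *)` can be read as "tool name plus a command pattern". The sketch below shows one naive way to parse and apply such a rule. The parsing format, the helper names, and the glob semantics are assumptions made for illustration; the real matcher is not shown in this chapter.

```typescript
// Parsed form of a rule such as "Bash(git *)". Field names are hypothetical.
interface ParsedRuleSketch {
  toolName: string;
  commandPattern: string | null; // null: the rule covers every input of the tool
}

function parseRuleSketch(rule: string): ParsedRuleSketch {
  const text = rule.trim();
  const openAt = text.indexOf("(");
  if (openAt === -1 || !text.endsWith(")")) {
    return { toolName: text, commandPattern: null };
  }
  return {
    toolName: text.slice(0, openAt),
    commandPattern: text.slice(openAt + 1, text.length - 1),
  };
}

// Treat "*" as "any run of characters" and everything else literally.
function globMatchesSketch(pattern: string, command: string): boolean {
  const source = pattern
    .split("*")
    .map((piece) => piece.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + source + "$").test(command);
}

function isAllowedSketch(rule: string, toolName: string, command: string): boolean {
  const parsed = parseRuleSketch(rule);
  if (parsed.toolName !== toolName) return false;
  return parsed.commandPattern === null || globMatchesSketch(parsed.commandPattern, command);
}

const gitAllowed = isAllowedSketch("Bash(git *)", "Bash", "git status");
const rmAllowed = isAllowedSketch("Bash(git *)", "Bash", "rm -rf /");
```

A plain prefix glob like this would also accept `git status && rm -rf /`, so a production matcher has to deal with compound commands. This sketch deliberately does not.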

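### 2.3 The Agent family (the AI's "clones")

| Tool | Function |
|------|------|
| `Agent` | Create a sub-agent (a new AI instance) |
| `SendMessage` | Send a message to an existing agent |
| `TeamCreate` | Create a teammate (Swarm mode) |
| `TeamDelete` | Delete a teammate |
| `TaskCreate/Get/List/Update/Stop` | The five task-management tools |

The Agent tool is a "tool of tools": calling it creates a **complete new queryLoop instance** with its own message history, permission context, and tool set. It resembles the `fork()` system call in an operating system, which creates a child process.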

### 2.4 Extensions and helpers

| Tool | Function |
|------|------|
| `WebFetch` | Fetch web page content |
| `WebSearch` | Web search |
| `MCPTool` | Call a tool on an MCP server |
| `SkillTool` | Invoke a Skill |
| `ToolSearch` | Search available tools (meta-tool) |
| `AskUserQuestion` | Ask the user a question |
| `Sleep` | Wait for a given time |
| `Brief` | Brief output |
| `EnterPlanMode/ExitPlanMode` | Enter/exit plan mode |
| `EnterWorktree/ExitWorktree` | Enter/exit a Git worktree |
| `ReadMcpResource` | Read an MCP resource |
| `ListMcpResources` | List MCP resources |
| `RemoteTrigger` | Remote trigger |
| `ScheduleCron` | Scheduled jobs (see Part 3, "The Cron Scheduling System Explained") |

---

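## 3. The tool registry: from scattered files to a single list

`tools.ts` (note the plural) is the tool registry. The `getTools()` function **filters the built-in tools by context** (deny rules, REPL vs simple mode, platform restrictions, and so on) and returns a filtered list that **contains built-in tools only**. MCP tools are merged downstream in `assembleToolPool()` (see "The four-layer assembly chain" below). `getTools()` itself **does not merge MCP tools**.

Registration is not simply "listing an array". It involves several layers of conditional filtering: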

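```
Logic of getTools() (built-in tools only):
  1. Call getAllBaseTools() to collect all built-in tools (40 directories)
  2. Apply feature gate filtering (experimental tools are available only when enabled)
  3. Apply permission filtering (tools disabled by deny rules do not appear)
  4. Apply platform filtering (PowerShell only on Windows)
  5. Apply mode filtering (REPL and simple mode expose different tool sets)
  6. Return the subset of built-in tools

Logic of assembleToolPool() (final assembly):
  7. Merge the built-in tools from getTools() with the connected MCP tools
  8. Sort (built-ins form a contiguous prefix to keep the prompt cache stable)
  9. Deduplicate (uniqBy, built-ins win on alias conflicts)
```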

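**Key design point**: the tool list is **dynamic**. Within a single session, the list grows if you connect a new MCP server, and a tool disappears from it if an enterprise policy disables that tool. What the AI sees is a **capability list that can change at any time**.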
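
The filtering and assembly steps can be sketched in a few lines. This is an illustrative model only: the `ToolEntrySketch` type, the filter context, and the sort key are assumptions, and feature gates and mode filtering are left out. It shows the three properties named above: built-ins are filtered by context, they stay a contiguous sorted prefix, and a built-in wins when an MCP tool has the same name.

```typescript
interface ToolEntrySketch {
  toolName: string;
  builtIn: boolean;
  platforms?: string[]; // undefined: available on every platform
}

interface FilterContextSketch {
  platform: string;
  deniedTools: string[];
}

// Context filtering over built-in tools only (deny rules and platform limits).
function getToolsSketch(allBuiltIns: ToolEntrySketch[], ctx: FilterContextSketch): ToolEntrySketch[] {
  return allBuiltIns.filter(
    (tool) =>
      !ctx.deniedTools.includes(tool.toolName) &&
      (tool.platforms === undefined || tool.platforms.includes(ctx.platform)),
  );
}

function byToolName(a: ToolEntrySketch, b: ToolEntrySketch): number {
  return a.toolName < b.toolName ? -1 : a.toolName > b.toolName ? 1 : 0;
}

// Final assembly: built-ins first (sorted), then MCP tools (sorted), deduplicated by name.
function assembleToolPoolSketch(builtIns: ToolEntrySketch[], mcpTools: ToolEntrySketch[]): ToolEntrySketch[] {
  const ordered = [...builtIns].sort(byToolName).concat([...mcpTools].sort(byToolName));
  const seen = new Set<string>();
  const pool: ToolEntrySketch[] = [];
  for (const tool of ordered) {
    if (seen.has(tool.toolName)) continue; // first occurrence wins, so built-ins win
    seen.add(tool.toolName);
    pool.push(tool);
  }
  return pool;
}

const builtInCatalog: ToolEntrySketch[] = [
  { toolName: "Read", builtIn: true },
  { toolName: "Write", builtIn: true },
  { toolName: "PowerShell", builtIn: true, platforms: ["win32"] },
  { toolName: "Bash", builtIn: true },
];
const visibleBuiltIns = getToolsSketch(builtInCatalog, { platform: "linux", deniedTools: ["Write"] });
const finalPool = assembleToolPoolSketch(visibleBuiltIns, [
  { toolName: "mcp__github__search", builtIn: false },
  { toolName: "Read", builtIn: false }, // name collision with a built-in
]);
```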

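### The four-layer assembly chain

In practice, a tool passes through **four layers of assembly** on its way from "scattered files" to "the list the model sees". The layers live in `Tool.ts` and `tools.ts`; the bridge into the query loop is in `screens/REPL.tsx`: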

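```
buildTool() + TOOL_DEFAULTS          ← Layer 1: uniform contract baseline
   Completes a partial tool definition into a full Tool object (Tool.ts:783)
       ↓
getAllBaseTools()                     ← Layer 2: source of truth for built-in capabilities
   Full registration with conditional branches (feature gates, environment
   variables, user type, worktree/swarm/LSP switches) (tools.ts:193)
       ↓
getTools(permissionContext)           ← Layer 3: filtering for the current context
   Deny rules, REPL vs simple mode, platform limits (tools.ts:271)
       ↓
assembleToolPool()                   ← Layer 4: final assembly
   Merges built-in and MCP tools, sorts (built-ins as a contiguous prefix for
   prompt cache stability), deduplicates (uniqBy, built-ins win) (tools.ts:345)
```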
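
The first layer, filling in defaults so that every tool ends up with a complete contract, might look roughly like this. The field set is trimmed to three members, and the default values are assumptions chosen to be conservative; they are not copied from `TOOL_DEFAULTS`.

```typescript
// A trimmed-down "complete" tool; the real Tool interface has many more members.
interface FullToolSketch {
  toolName: string;
  isReadOnly: (input: unknown) => boolean;
  isConcurrencySafe: (input: unknown) => boolean;
  maxResultSizeChars: number;
}

// What a tool author has to write: only the name is mandatory here.
interface ToolDefinitionSketch {
  toolName: string;
  isReadOnly?: (input: unknown) => boolean;
  isConcurrencySafe?: (input: unknown) => boolean;
  maxResultSizeChars?: number;
}

// Assumed conservative defaults: not read-only, not concurrency-safe.
const TOOL_DEFAULTS_SKETCH = {
  isReadOnly: (_input: unknown): boolean => false,
  isConcurrencySafe: (_input: unknown): boolean => false,
  maxResultSizeChars: 100000,
};

// Complete a partial definition into a full tool object.
function buildToolSketch(def: ToolDefinitionSketch): FullToolSketch {
  return {
    toolName: def.toolName,
    isReadOnly: def.isReadOnly ?? TOOL_DEFAULTS_SKETCH.isReadOnly,
    isConcurrencySafe: def.isConcurrencySafe ?? TOOL_DEFAULTS_SKETCH.isConcurrencySafe,
    maxResultSizeChars: def.maxResultSizeChars ?? TOOL_DEFAULTS_SKETCH.maxResultSizeChars,
  };
}

const grepSketch = buildToolSketch({
  toolName: "Grep",
  isReadOnly: () => true,
  isConcurrencySafe: () => true,
});
const editSketch = buildToolSketch({ toolName: "Edit" });
```

With defaults of this kind, a tool that forgets to declare anything is treated as a serial, write-capable tool, which is the safer failure mode.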

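> 💡 **In plain terms**: this is like putting together a sports team. Layer 1 is "complete every player's ID photo, medical report, and skill rating" (buildTool). Layer 2 is "list all candidate players" (getAllBaseTools). Layer 3 is "drop those who will not play, given today's tactics and opponent" (getTools). Layer 4 is "seat the starters and substitutes and hand out the jerseys" (assembleToolPool).

`getToolUseContext()` in `REPL.tsx` is the bridge through which this assembly chain enters the main query loop. It packs the final tool pool, together with the permission context, MCP clients, and so on, into a `ToolUseContext` and hands it to `queryLoop()`.

> 📚 **Course link**: `getTools()` is a classic **Registry Pattern**. All tools register in one central list, and consumers look things up there without needing to know where each tool comes from. The operating system's **syscall table** follows the same pattern: the Linux kernel maintains a `sys_call_table[]`, and user-space programs call by number without knowing where each syscall is implemented in the kernel. The pattern itself is mature engineering practice. What is distinctive in Claude Code is that the registry **changes at runtime** (MCP tools are hot-pluggable) rather than being a static table fixed at compile time.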

---

## 4. The complete tool execution pipeline

The full path of one tool call, from the AI's "request" to the final "result returned":

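```
AI emits a tool_use block (JSON)
  │
  ├── 1. Input validation
  │   └── Zod schema validates the input parameters
  │       → on failure: return an error message to the AI
  │
  ├── 2. Permission check (canUseTool)
  │   ├── bypass-immune rule check
  │   ├── PreToolUse hooks run
  │   ├── auto-approve rule matching
  │   ├── sandbox rule check
  │   └── ... ten steps in total
  │   → Deny: return a refusal message to the AI
  │   → Ask: show a confirmation dialog in the UI
  │
  ├── 3. File history tracking (Edit/Write only)
  │   └── fileHistoryTrackEdit()
  │       → snapshot the original file content before modification
  │
  ├── 4. Execution
  │   └── tool.call(args, context, ..., onProgress)
  │       → returns Promise<ToolResult<Output>>
  │       → progress is reported through onProgress (updates the UI)
  │       → the surrounding runToolUse() executor exposes the AsyncGenerator
  │
  ├── 5. PostToolUse hooks
  │   └── custom logic after the tool has run
  │
  ├── 6. Result serialization and persistence
  │   └── mapToolResultToToolResultBlockParam(result)
  │       → turns the result into the block the AI will read
  │       → when it exceeds maxResultSizeChars:
  │         ① maybePersistLargeToolResult() writes the full result to disk
  │           (location: ~/.claude/tool-results/)
  │         ② the conversation body is replaced by a reference-style preview
  │           wrapped in a <persisted-output> tag
  │           (not plain truncation, but persistence plus reference substitution)
  │       → when the result is empty: replaced by "(toolName completed with no output)"
  │         (prevents the model from misjudging the turn boundary; source: toolResultStorage.ts:293)
  │
  └── 7. Result injection
      └── wrapped as UserMessage { tool_result: ... }
          → appended to the message history
          → sent to the API on the next heartbeat of the loop
```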
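
Step 2 behaves like a chain of checks in which each check either decides or passes the call on. The sketch below models that with two toy checks. The verdict names, the individual checks, and the fallback to "ask" when nothing decides are assumptions for illustration; the real chain has ten steps and is not reproduced here.

```typescript
type VerdictSketch = "allow" | "deny" | "ask" | "pass";

interface ToolCallSketch {
  toolName: string;
  command?: string;
}

type PermissionCheckSketch = (call: ToolCallSketch) => VerdictSketch;

// Walk the chain in order; the first check that does not "pass" decides.
function runPermissionChainSketch(
  checks: PermissionCheckSketch[],
  call: ToolCallSketch,
): Exclude<VerdictSketch, "pass"> {
  for (const check of checks) {
    const verdict = check(call);
    if (verdict !== "pass") return verdict;
  }
  return "ask"; // nothing decided: fall back to asking the user (assumption)
}

// Toy check 1: refuse an obviously destructive command.
const denyDestructiveSketch: PermissionCheckSketch = (call) =>
  call.command !== undefined && call.command.includes("rm -rf") ? "deny" : "pass";

// Toy check 2: auto-approve tools that only read.
const approveReadOnlySketch: PermissionCheckSketch = (call) =>
  ["Read", "Glob", "Grep"].includes(call.toolName) ? "allow" : "pass";

const chainSketch = [denyDestructiveSketch, approveReadOnlySketch];
const readVerdict = runPermissionChainSketch(chainSketch, { toolName: "Read" });
const rmVerdict = runPermissionChainSketch(chainSketch, { toolName: "Bash", command: "rm -rf /" });
const lsVerdict = runPermissionChainSketch(chainSketch, { toolName: "Bash", command: "ls" });
```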

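> 📚 **Course link**: the ten-step permission check in step 2 is a classic **Chain of Responsibility**. The request travels along a chain of checks, and each node can Approve, Deny, or hand it to the next node. Middleware pipelines in web frameworks (Express.js `app.use()`, Django's middleware stack) follow the same pattern. The pattern is not new, but Claude Code applies it to AI permission control and takes it ten layers deep. Among terminal-native AI coding tools, that is a comparatively fine level of permission granularity.

### Key steps explained

**Step 3 (file history tracking)** runs only for the Edit/Write tools and is the foundation of the `/rewind` feature. Before a file is modified, the system automatically saves a snapshot. If the AI breaks the code, the user can return in one step to the file state as of any message.

> ⚠️ **Note on precision**: the pipeline diagram above draws "file history snapshot" as a separate step 3 for clarity. In the source, the call to `fileHistoryTrackEdit()` actually happens **inside** `tool.call()` (step 4), not as an independent stage of the scheduler. It is listed separately to highlight that the mechanism exists; it should be understood as part of the tool's execution logic, not as an independent step of the scheduler pipeline.

**Step 6 (result serialization)** relies on `mapToolResultToToolResultBlockParam()`, a function that is often overlooked but matters a great deal (earlier drafts of this book called it `renderResultForAssistant()`, a name that does not exist in the source). It determines what the AI "sees". For example:
- The `Read` tool's result carries a line-number prefix (`1\tcontent`), so the AI can point at exact locations in a later Edit
- The `Bash` tool's result includes the exit code, so the AI can tell whether the command succeeded
- The `Glob` tool's result is sorted by modification time, with the most recently modified files first, which steers the AI's attention to them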
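
The size handling in step 6 can be sketched as a pure function with an injected "save to disk" callback. Only the directory `~/.claude/tool-results/`, the `<persisted-output>` tag, and the empty-output placeholder come from the text above. The file naming, the exact preview layout, and the function signature are assumptions.

```typescript
interface SerializedResultSketch {
  text: string;                 // what goes into the conversation
  persistedPath: string | null; // where the full output was stored, if anywhere
}

function serializeResultSketch(
  toolName: string,
  toolUseId: string,
  output: string,
  maxResultSizeChars: number,
  saveFull: (path: string, body: string) => void,
): SerializedResultSketch {
  if (output.length === 0) {
    // Empty results get an explicit placeholder instead of an empty block.
    return { text: `(${toolName} completed with no output)`, persistedPath: null };
  }
  if (output.length <= maxResultSizeChars) {
    return { text: output, persistedPath: null };
  }
  const path = `~/.claude/tool-results/${toolUseId}.txt`; // file naming is hypothetical
  saveFull(path, output);
  const preview = output.slice(0, maxResultSizeChars);
  return {
    text: `<persisted-output>\nFull output saved to ${path}\n${preview}\n</persisted-output>`,
    persistedPath: path,
  };
}

// An in-memory stand-in for the file system keeps the sketch side-effect free.
const fakeDisk = new Map<string, string>();
const saveToFakeDisk = (path: string, body: string): void => {
  fakeDisk.set(path, body);
};

const emptyResult = serializeResultSketch("Bash", "t1", "", 100, saveToFakeDisk);
const smallResult = serializeResultSketch("Bash", "t2", "ok", 100, saveToFakeDisk);
const largeResult = serializeResultSketch("Bash", "t3", "x".repeat(1000), 100, saveToFakeDisk);
```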

---

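## 5. ToolUseContext: the "world" a tool runs in

Every tool receives a `ToolUseContext` object when it executes. It is the whole world the tool can "see":

| Field | Content | Analogy |
|------|------|------|
| `messages` | Full message history of the current session | A process's memory space |
| `appState` | Global application state | Kernel data structures |
| `permissionContext` | Permission configuration | A process's UID/GID |
| `mcpClients` | List of MCP clients | Loaded device drivers |
| `model` | Name of the model in use | CPU model |
| `readFileState` | File state cache | File descriptor table |
| `abortController` | Cancellation signal | SIGTERM |

**Design decision**: ToolUseContext is passed **read-only**. A tool should not mutate state inside the context directly. State changes are expressed through the returned Result, through side effects (such as file modifications), and through the `contextModifier` mechanism described in section 6. This matches the operating-system convention that a syscall delivers its result through the return value.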

---

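## 6. Concurrency model: which tools can run at the same time

Claude Code's tool concurrency strategy is as follows.

The scheduler's unit of work is **one batch**, meaning all `tool_use` blocks contained in a single model reply. Within one batch:

- Tools whose `isConcurrencySafe` returns `true` (such as Read, Glob, Grep) **all run in parallel**;
- Tools whose `isConcurrencySafe` returns `false` (such as Edit, Write) **run one at a time, serially**.

| Scenario | Behavior | Reason |
|------|------|------|
| Several Read + Glob + Grep in one batch | All parallel | All declare `isConcurrencySafe=true` |
| Several Edit in one batch | Serial | All declare `isConcurrencySafe=false` |
| Read + Edit mixed in one batch | Reads run in parallel, Edits run serially, **with no dependency wait between the two** | The scheduler only splits on the static flag and does no dynamic dependency analysis |
| Agent creation | Parallel | Declares `isConcurrencySafe=true` |
| Bash commands | Depends on sandbox configuration | n/a |

**Important clarification**: `isConcurrencySafe` is a **static boolean flag that each tool declares for itself**. The scheduler does not compute it by analyzing data dependencies between tools. There is **no DAG dependency graph** in the system, and no dynamic dependency tracking of the "Edit waits for Read to finish" kind. The scheduler's logic is simple: within one batch, calls flagged true run in parallel and calls flagged false run serially, nothing more. The system **trusts** the tool's declaration. If a tool wrongly declares itself concurrency-safe, race conditions can follow.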
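
The flag-based split can be sketched directly. The types and function names below are hypothetical. Awaiting the concurrent group before starting the serial group is a simplification of this sketch; the text above does not specify how the two groups are ordered relative to each other.

```typescript
interface BatchItemSketch {
  id: string;
  concurrencySafe: boolean; // the tool's own static declaration
  run: () => Promise<string>;
}

// Split purely on the declared flag; no dependency analysis of any kind.
function partitionBatchSketch(batch: BatchItemSketch[]): {
  concurrent: BatchItemSketch[];
  serial: BatchItemSketch[];
} {
  return {
    concurrent: batch.filter((item) => item.concurrencySafe),
    serial: batch.filter((item) => !item.concurrencySafe),
  };
}

async function runBatchSketch(batch: BatchItemSketch[]): Promise<string[]> {
  const { concurrent, serial } = partitionBatchSketch(batch);
  // Concurrency-safe calls start together.
  const parallelResults = await Promise.all(concurrent.map((item) => item.run()));
  // Unsafe calls run strictly one after another.
  const serialResults: string[] = [];
  for (const item of serial) {
    serialResults.push(await item.run());
  }
  return parallelResults.concat(serialResults);
}

const demoBatch: BatchItemSketch[] = [
  { id: "read-1", concurrencySafe: true, run: async () => "read ok" },
  { id: "edit-1", concurrencySafe: false, run: async () => "edit 1 ok" },
  { id: "grep-1", concurrencySafe: true, run: async () => "grep ok" },
  { id: "edit-2", concurrencySafe: false, run: async () => "edit 2 ok" },
];
const demoGroups = partitionBatchSketch(demoBatch);
```

Calling `runBatchSketch(demoBatch)` would resolve once both groups have finished. Note that nothing in the code checks whether `edit-1` touches the file that `read-1` is reading; that is exactly the trust placed in the flag.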

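> 📚 **Course link**: the idea behind `isConcurrencySafe` resembles **transaction isolation levels** in databases: different operations have different concurrency-safety needs. Read (read-only) is comparable to `READ COMMITTED`, where multiple reads do not interfere with each other. Edit (read-write) is comparable to `SERIALIZABLE` and has to run serially to avoid races. The difference is that a DBMS enforces the isolation level, whereas Claude Code **delegates the concurrency-safety declaration to the tool author** and performs no second check. This is an explicit "trust the developer" trade-off.

### Dual-path scheduler: batch vs streaming

Tool scheduling actually has **two alternative paths**. They share the same underlying execution entry point, `runToolUse()` (in `toolExecution.ts`), but use different scheduling shells:

**Batch path**: `runTools()` (`toolOrchestration.ts`):
- Used for non-streaming API calls
- First splits the tool calls of a batch into a concurrent group and a serial group by `isConcurrencySafe`
- The concurrent group runs in one `Promise.all()`; the serial group runs one call at a time
- **contextModifier handling**: the contextModifiers of concurrent tools are collected in the `queuedContextModifiers` dictionary and applied in order once the whole concurrent batch has finished

**Streaming path**: `StreamingToolExecutor` (`StreamingToolExecutor.ts`):
- Used for streaming API calls (the vast majority of cases)
- Starts executing tools while the model is still talking, receiving and executing at the same time
- **Known limitation**: `StreamingToolExecutor` supports the `contextModifier` semantics of concurrent tools **only partially** (a comment at `StreamingToolExecutor.ts:388-395` admits: "we currently don't support context modifiers for concurrent tools"). Only the contextModifiers of non-concurrent (serial) tools are applied; those of concurrent tools are silently ignored. No built-in tool currently uses a contextModifier in concurrent mode, so the gap does not affect real behavior yet. If a future MCP tool needs to modify the context after concurrent execution, this is where it will break.
- **Sibling abort propagation**: when one Bash tool fails, `siblingAbortController` terminates the child processes of the other tools in the same batch (but not the parent query). This is a fine-grained error isolation mechanism (source: `StreamingToolExecutor.ts:48,301-318`)

> 💡 **In plain terms**: the batch path is like a restaurant where you order once and the dishes come out in sequence. The streaming path is like a place that serves while you are still ordering: you are still reading the menu (the model is still generating) and the kitchen has already started on the dishes you picked first (the tools have already started running).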

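### What hooks can really do: more than "custom logic"

The hooks at step 2 (PreToolUse) and step 5 (PostToolUse) of the pipeline are not just "custom logic". Each kind has four precisely defined capabilities.

**The four capabilities of PreToolUse hooks**:
1. **Modify the input**: adjust the parameters before the tool runs
2. **Provide extra context**: inject an attachment message
3. **Block execution**: return a block signal, and the tool does not run at all
4. **Influence the permission decision**: give the permission system an allow/block/ask suggestion through `resolveHookPermissionDecision()`. Note that a hook's allow **is not an unconditional pass**: the permission system still runs the subsequent `checkRuleBasedPermissions()` (source: `toolHooks.ts:373`)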
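
The fourth capability, "suggest but do not decide", can be sketched as a small resolution function. The precedence used here (a hook block always denies, a deny rule beats a hook allow, an explicit ask rule beats a hook allow) is an assumption of this sketch. Only the fact that rule-based checks still run after a hook allow comes from the text above.

```typescript
type HookSuggestionSketch = "allow" | "block" | "ask" | "none";
type RuleOutcomeSketch = "allow" | "deny" | "ask" | "no-match";
type FinalDecisionSketch = "allow" | "deny" | "ask";

function resolveDecisionSketch(
  hook: HookSuggestionSketch,
  checkRules: () => RuleOutcomeSketch,
): FinalDecisionSketch {
  if (hook === "block") return "deny"; // a block stops the call outright
  const rules = checkRules(); // always evaluated, even after a hook "allow"
  if (rules === "deny") return "deny";
  if (rules === "ask" || hook === "ask") return "ask";
  if (hook === "allow" || rules === "allow") return "allow";
  return "ask"; // nobody decided: ask the user
}

let ruleChecksRun = 0;
const denyRuleSketch = (): RuleOutcomeSketch => {
  ruleChecksRun += 1;
  return "deny";
};
const noRuleSketch = (): RuleOutcomeSketch => {
  ruleChecksRun += 1;
  return "no-match";
};

const hookAllowButRuleDenies = resolveDecisionSketch("allow", denyRuleSketch);
const hookAllowNoRule = resolveDecisionSketch("allow", noRuleSketch);
const nobodyDecided = resolveDecisionSketch("none", noRuleSketch);
```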

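**The four capabilities of PostToolUse hooks**:
1. **Append messages**: return an `AttachmentMessage` that injects extra information into the next round's context
2. **Modify the result**: rewrite the tool output (in particular `updatedMCPToolOutput` for MCP tools)
3. **Stop further progress**: return the `hook_stopped_continuation` control signal. When the heartbeat loop sees it, the loop stops, even if the model still wants to call more tools
4. **Govern the failure path**: `PostToolUseFailure` hooks let a failed tool execution be caught and handled by hooks instead of surfacing directly as an exception

> **🔑 Key design point**: hooks **do not short-circuit the permission system**. Even when a PreToolUse hook returns `allow`, the permission check continues. This is a deliberate safety design: hooks offer a "suggestion", not a "verdict".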

---

## 7. MCP tools: new capabilities brought by envoys

MCP (Model Context Protocol) servers can register new tools. These tools appear in the tool list in the form `mcp__<server name>__<tool name>`.
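
The naming scheme is easy to model. The helpers below are hypothetical and only illustrate the `mcp__server__tool` format described above. Splitting on the first `__` after the prefix is an assumption; it would be ambiguous for a server name that itself contains a double underscore.

```typescript
const MCP_PREFIX_SKETCH = "mcp__";

function buildMcpToolNameSketch(server: string, tool: string): string {
  return `${MCP_PREFIX_SKETCH}${server}__${tool}`;
}

// Returns null for names that do not follow the mcp__server__tool shape.
function parseMcpToolNameSketch(fullName: string): { server: string; tool: string } | null {
  if (!fullName.startsWith(MCP_PREFIX_SKETCH)) return null;
  const rest = fullName.slice(MCP_PREFIX_SKETCH.length);
  const separatorAt = rest.indexOf("__");
  if (separatorAt <= 0 || separatorAt + 2 >= rest.length) return null;
  return { server: rest.slice(0, separatorAt), tool: rest.slice(separatorAt + 2) };
}

const fullMcpName = buildMcpToolNameSketch("github", "create_issue");
const parsedMcpName = parseMcpToolNameSketch(fullMcpName);
const notMcp = parseMcpToolNameSketch("Read");
```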

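From the system's point of view, MCP tools and built-in tools go through **exactly the same pipeline**: input validation, permission check, execution, result serialization. The only difference is the execution stage. Built-in tools run inside the local process, while MCP tools are sent over JSON-RPC to an external server. "Local and remote tools share one interface" is standard practice in distributed systems (for example gRPC's location transparency), but it is still fairly new in the AI agent field. MCP is an open standard led by Anthropic, so it is no surprise that Claude Code, as a first-party product, integrates it deeply.

The more interesting aspect is what MCP means for an **open ecosystem**. It defines a vendor-neutral tool interface protocol, so third-party developers can write a tool server once and connect it to any AI client that supports MCP, instead of being tied to one product's proprietary extension API.

> **🔑 OS analogy:** an MCP tool is like a **food delivery service**. You place the order in the app (same interface), but the food is prepared in a restaurant far away and delivered to you afterwards. Same interface, different place of execution.

**Security implications**: permissions for MCP tools are stricter by default than for built-in tools, because an external server is outside local control. Enterprise administrators can use `allowedDomains` (a domain allowlist) and `strictPluginOnlyCustomization` (strict plugin-only restriction) to control which MCP servers may connect.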

---

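## 8. Deferred tools

Not every tool's schema has to be loaded at startup. For tools marked with `shouldDefer` (the source field name; some earlier text calls it `isDeferred`), the full schema is fetched only when the AI needs it.

**Why**: every tool schema has to go into the system prompt and costs tokens. If half of the 40 built-in tool directories are rarely used tools (such as `ScheduleCron` or `RemoteTrigger`), their schemas waste tokens for nothing. A deferred tool is loaded only after the AI has explicitly searched for it through `ToolSearch`.

**Analogy**: a library cannot put every book in the entrance hall. Rarely requested books stay in the stacks and are fetched on demand. `ToolSearch` is the library's catalog.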
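
The "stacks and catalog" idea can be sketched as two functions: one that decides which schemas go into the initial prompt, and one that plays the role of the search. Apart from the `shouldDefer` field name, everything here (types, function names, substring matching) is an assumption for illustration and not the real `ToolSearch` implementation.

```typescript
interface DeferrableToolSketch {
  toolName: string;
  summary: string;
  schemaJson: string;
  shouldDefer?: boolean;
}

// Only non-deferred tools contribute their schema to the initial prompt.
function schemasForInitialPromptSketch(tools: DeferrableToolSketch[]): string[] {
  return tools.filter((t) => t.shouldDefer !== true).map((t) => t.schemaJson);
}

// A toy search over deferred tools: case-insensitive substring match.
function toolSearchSketch(tools: DeferrableToolSketch[], query: string): DeferrableToolSketch[] {
  const needle = query.toLowerCase();
  return tools.filter(
    (t) =>
      t.shouldDefer === true &&
      (t.toolName.toLowerCase().includes(needle) || t.summary.toLowerCase().includes(needle)),
  );
}

const toolCatalogSketch: DeferrableToolSketch[] = [
  { toolName: "Read", summary: "Read a file", schemaJson: '{"file_path":"string"}' },
  { toolName: "ScheduleCron", summary: "Schedule a recurring job", schemaJson: '{"cron":"string"}', shouldDefer: true },
  { toolName: "RemoteTrigger", summary: "Trigger a remote run", schemaJson: '{"target":"string"}', shouldDefer: true },
];

const upfrontSchemas = schemasForInitialPromptSketch(toolCatalogSketch);
const cronMatches = toolSearchSketch(toolCatalogSketch, "cron");
```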

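The Skill system shows how far this frugality goes. The Skill list is strictly limited to 1% of the context window, each description is capped at 250 characters, and large tool outputs are written to disk instead of into the context. These numbers show how carefully Anthropic budgets tokens: every character is treated as a scarce resource.

**Token economics**: the benefit of this design is more than "sending a few schemas fewer". Putting the full schemas of all 40 built-in tool directories into the system prompt would cause two problems: (1) every API request would pay for several thousand extra input tokens; (2) a bloated system prompt **lowers the prompt cache hit rate**. The Claude API's prompt cache matches by prefix, so the more stable the system prompt, the higher the hit rate, and a dynamic tool list (for example hot-plugged MCP tools) changes the system prompt often and invalidates the cache. Deferred loading plus ToolSearch turns tool discovery from "static full broadcast" (all schemas stuffed into the prompt at once) into "pull on demand" (the AI searches and loads only when needed). It is one of Claude Code's key optimizations for token cost and cache efficiency.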

---

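## 9. Comparison with competing tool systems

Claude Code's tool design choices only show their real differentiation when placed in the coordinate system of its competitors.

### Claude Code vs Cursor

| Dimension | Claude Code | Cursor |
|------|-------------|--------|
| Built-in tools | 40 built-in tool directories in four categories | A dozen or so, mostly file operations (exact number not public) |
| Permission granularity | Ten-step check chain with pattern matching (such as `Bash(git *)`) | Binary model: allow/deny, no fine-grained pattern matching |
| Extension mechanism | MCP protocol (open standard, any tool server can be attached) | Mostly built-in extensions; third-party tool ecosystem is more closed |
| Tool concurrency | Tools declare their own concurrency safety and the system schedules accordingly | Mostly serial execution |
| Runtime environment | Terminal-native; the Bash tool calls the system shell directly | Embedded in the IDE; operates indirectly through editor APIs |

**Takeaway**: Cursor's strength is the depth of its IDE integration (code completion, inline diff). Claude Code's strength is the openness of its tools and the granularity of its permission control. The two follow different architectural philosophies: Cursor is "an IDE with AI added", Claude Code is "an AI equipped with an operating-system-grade tool set".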

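### Claude Code vs Aider

Aider is another mainstream terminal tool for AI coding, but the two take very different routes in the design of their file editing tools:

| Dimension | Claude Code | Aider |
|------|-------------|-------|
| Editing approach | `Edit` (diff-style replacement) + `Write` (full overwrite), two separate tools | Switches between three edit formats: `search/replace`, `whole file`, `diff` |
| Editing granularity | `Edit` specifies an exact old_string → new_string replacement | `search/replace` is similar, but in `whole file` mode the AI outputs the entire file |
| Strategy | The system prompt pushes the AI to prefer `Edit` (local changes); `Write` is for new files only | Picks the format automatically by model capability (whole file for small models, diff for large ones) |
| Safety net | `fileHistoryTrackEdit()` snapshots automatically before a change and supports `/rewind` | Relies on git auto-commit for rollback |

**Takeaway**: Aider's adaptive edit format strategy is more flexible (it adapts to different model capabilities). Claude Code's Edit/Write split is simpler and more explicit (it reduces the number of choices the model has to make). Both are mature engineering solutions with their own trade-offs.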

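### Claude Code vs GitHub Copilot Workspace

GitHub Copilot Workspace uses a **plan-driven** style of tool calling: the AI first produces a complete edit plan (which files need changes and what changes in each), and the plan is then executed in one batch. Compared with Claude Code's **loop-driven** style (the AI decides the next step in every loop iteration), the difference is clear:

- **Copilot Workspace**: the planning phase and the execution phase are separate, and the user can review the full plan before execution. The downside is that once the plan is wrong, the whole batch has to be redone.
- **Claude Code**: after each step the AI decides the next one based on the result (the queryLoop from chapter 4). This is more flexible, but the intermediate process is harder to predict. The result conversion done by `mapToolResultToToolResultBlockParam()` is critical here: the format of each step's result directly affects the quality of the AI's next decision.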

---

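## 10. Tool descriptions as implicit instructions: three representative cases

A tool description (`description()`) does more than "tell the AI what this tool does". It is a set of **implicit behavioral instructions** embedded in the tool's metadata. Several of Claude Code's core tools write large amounts of operating procedure directly into their descriptions, so that while the model learns what the tool can do, it also absorbs the rules of behavior that come with it. The three cases below are the most representative.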

---

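### Case 1: BashTool's Git Safety Protocol

**Source location**: `src/tools/BashTool/prompt.ts`, function `getCommitAndPRInstructions()` (lines 42–161)

The BashTool description embeds a complete, **120-line** Git operating procedure (`prompt.ts` is 369 lines in total; `getCommitAndPRInstructions()` covers lines 42–161, about one third of the file). It includes step-by-step flows for commits and PRs, seven safety rules (the "Git Safety Protocol"), a strategy for parallel tool calls, and HEREDOC formatting requirements. It is the longest and most complex single tool description in Claude Code.

Below is the full text shown to external users (the external-user branch of `getCommitAndPRInstructions()`, which makes up the bulk of the tool description):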

```
# Committing changes with git

Only create commits when requested by the user. If unclear, ask first. When the user asks you to create a new git commit, follow these steps carefully:

You can call multiple tools in a single response. When multiple independent pieces of information are requested and all commands are likely to succeed, run multiple tool calls in parallel for optimal performance. The numbered steps below indicate which commands should be batched in parallel.

Git Safety Protocol:
- NEVER update the git config
- NEVER run destructive git commands (push --force, reset --hard, checkout ., restore ., clean -f, branch -D) unless the user explicitly requests these actions. Taking unauthorized destructive actions is unhelpful and can result in lost work, so it's best to ONLY run these commands when given direct instructions 
- NEVER skip hooks (--no-verify, --no-gpg-sign, etc) unless the user explicitly requests it
- NEVER run force push to main/master, warn the user if they request it
- CRITICAL: Always create NEW commits rather than amending, unless the user explicitly requests a git amend. When a pre-commit hook fails, the commit did NOT happen — so --amend would modify the PREVIOUS commit, which may result in destroying work or losing previous changes. Instead, after hook failure, fix the issue, re-stage, and create a NEW commit
- When staging files, prefer adding specific files by name rather than using "git add -A" or "git add .", which can accidentally include sensitive files (.env, credentials) or large binaries
- NEVER commit changes unless the user explicitly asks you to. It is VERY IMPORTANT to only commit when explicitly asked, otherwise the user will feel that you are being too proactive

1. Run the following bash commands in parallel, each using the Bash tool:
  - Run a git status command to see all untracked files. IMPORTANT: Never use the -uall flag as it can cause memory issues on large repos.
  - Run a git diff command to see both staged and unstaged changes that will be committed.
  - Run a git log command to see recent commit messages, so that you can follow this repository's commit message style.
2. Analyze all staged changes (both previously staged and newly added) and draft a commit message:
  - Summarize the nature of the changes (eg. new feature, enhancement to an existing feature, bug fix, refactoring, test, docs, etc.). Ensure the message accurately reflects the changes and their purpose (i.e. "add" means a wholly new feature, "update" means an enhancement to an existing feature, "fix" means a bug fix, etc.).
  - Do not commit files that likely contain secrets (.env, credentials.json, etc). Warn the user if they specifically request to commit those files
  - Draft a concise (1-2 sentences) commit message that focuses on the "why" rather than the "what"
  - Ensure it accurately reflects the changes and their purpose
3. Run the following commands in parallel:
   - Add relevant untracked files to the staging area.
   - Create the commit with a message ending with:
   Co-Authored-By: Claude <noreply@anthropic.com>
   - Run git status after the commit completes to verify success.
   Note: git status depends on the commit completing, so run it sequentially after the commit.
4. If the commit fails due to pre-commit hook: fix the issue and create a NEW commit

Important notes:
- NEVER run additional commands to read or explore code, besides git bash commands
- NEVER use the TodoWrite or Agent tools
- DO NOT push to the remote repository unless the user explicitly asks you to do so
- IMPORTANT: Never use git commands with the -i flag (like git rebase -i or git add -i) since they require interactive input which is not supported.
- IMPORTANT: Do not use --no-edit with git rebase commands, as the --no-edit flag is not a valid option for git rebase.
- If there are no changes to commit (i.e., no untracked files and no modifications), do not create an empty commit
- In order to ensure good formatting, ALWAYS pass the commit message via a HEREDOC, a la this example:
<example>
git commit -m "$(cat <<'EOF'
   Commit message here.

   Co-Authored-By: Claude <noreply@anthropic.com>
   EOF
   )"
</example>

# Creating pull requests
Use the gh command via the Bash tool for ALL GitHub-related tasks including working with issues, pull requests, checks, and releases. If given a Github URL use the gh command to get the information needed.

IMPORTANT: When the user asks you to create a pull request, follow these steps carefully:

1. Run the following bash commands in parallel using the Bash tool, in order to understand the current state of the branch since it diverged from the main branch:
   - Run a git status command to see all untracked files (never use -uall flag)
   - Run a git diff command to see both staged and unstaged changes that will be committed
   - Check if the current branch tracks a remote branch and is up to date with the remote, so you know if you need to push to the remote
   - Run a git log command and `git diff [base-branch]...HEAD` to understand the full commit history for the current branch (from the time it diverged from the base branch)
2. Analyze all changes that will be included in the pull request, making sure to look at all relevant commits (NOT just the latest commit, but ALL commits that will be included in the pull request!!!), and draft a pull request title and summary:
   - Keep the PR title short (under 70 characters)
   - Use the description/body for details, not the title
3. Run the following commands in parallel:
   - Create new branch if needed
   - Push to remote with -u flag if needed
   - Create PR using gh pr create with the format below. Use a HEREDOC to pass the body to ensure correct formatting.
<example>
gh pr create --title "the pr title" --body "$(cat <<'EOF'
## Summary
<1-3 bullet points>

## Test plan
[Bulleted markdown checklist of TODOs for testing the pull request...]

🤖 Generated with Claude Code
EOF
)"
</example>

Important:
- DO NOT use the TodoWrite or Agent tools
- Return the PR URL when you're done, so the user can see it

# Other common operations
- View comments on a Github PR: gh api repos/foo/bar/pulls/123/comments
```

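**Analysis**: this text reveals one of Anthropic's core engineering philosophies: **write operating rules into the tool description rather than into the system prompt**. Doing so has several benefits:

1. **Bound to context**: the rules are attached to the tool instead of floating in the system prompt. When the AI decides whether to use Bash for a git operation, these constraints appear naturally as part of the description of what the tool can do.
2. **Modular**: each tool's rules are self-contained, which makes them easier to maintain and version. `BashTool/prompt.ts` is independent of `systemPrompt.ts` and can be updated on its own.
3. **Differentiated**: note the `process.env.USER_TYPE === 'ant'` branch in the code. Internal users (ant) see a short version that points to the `/commit` and `/commit-push-pr` Skills, while external users see the full embedded version above. **The same tool shows different descriptions to different user types**, which is a neat use of the dynamic `description()`.
4. **The `--amend` trap warning**: note the CRITICAL clause, which states that "When a pre-commit hook fails, the commit did NOT happen", so `--amend` would modify the previous commit. This is a precise warning that clearly comes from real-world experience, and it keeps the AI from misusing `--amend` after a hook failure.

**Design difference between Anthropic's internal and external versions** (`getCommitAndPRInstructions()`, lines 56–76): internal users are steered to the dedicated `/commit` and `/commit-push-pr` Skills, while the full flow is embedded for external users. This suggests that Anthropic has wrapped these operations into more refined Skills internally, while keeping an out-of-the-box experience externally that needs no extra configuration.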

---

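### Case 2: the full AgentTool description

**Source location**: `src/tools/AgentTool/prompt.ts`, function `getPrompt()` (lines 66–287)

The AgentTool description teaches the model **how to spawn a sub-agent, when to spawn one, and how to write the prompt for it**. The description is itself an agent orchestration manual.

Core structure (abridged):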

```
Launch a new agent to handle complex, multi-step tasks autonomously.

The Agent tool launches specialized agents (subprocesses) that autonomously handle complex tasks. Each agent type has specific capabilities and tools available to it.

[Available agent types listed here]

When using the Agent tool, specify a subagent_type parameter to select which agent type to use. If omitted, the general-purpose agent is used.

When NOT to use the Agent tool:
- If you want to read a specific file path, use the Read tool or Glob tool instead of the Agent tool, to find the match more quickly
- If you are searching for a specific class definition like "class Foo", use the Glob tool instead, to find the match more quickly
- If you are searching for code within a specific file or set of 2-3 files, use the Read tool instead of the Agent tool, to find the match more quickly
- Other tasks that are not related to the agent descriptions above

Usage notes:
- Always include a short description (3-5 words) summarizing what the agent will do
- Launch multiple agents concurrently whenever possible, to maximize performance; to do that, use a single message with multiple tool uses
- When the agent is done, it will return a single message back to you. The result returned by the agent is not visible to the user. To show the user the result, you should send a text message back to the user with a concise summary of the result.
- You can optionally run agents in the background using the run_in_background parameter. When an agent runs in the background, you will be automatically notified when it completes — do NOT sleep, poll, or proactively check on its progress. Continue with other work or respond to the user instead.
- **Foreground vs background**: Use foreground (default) when you need the agent's results before you can proceed — e.g., research agents whose findings inform your next steps. Use background when you have genuinely independent work to do in parallel.
- To continue a previously spawned agent, use SendMessage with the agent's ID or name as the `to` field. The agent resumes with its full context preserved. Each Agent invocation starts fresh — provide a complete task description.
- The agent's outputs should generally be trusted
- Clearly tell the agent whether you expect it to write code or just to do research (search, file reads, web fetches, etc.), since it is not aware of the user's intent
- If the agent description mentions that it should be used proactively, then you should try your best to use it without the user having to ask for it first. Use your judgement.
- If the user specifies that they want you to run agents "in parallel", you MUST send a single message with multiple Agent tool use content blocks.
- You can optionally set `isolation: "worktree"` to run the agent in a temporary git worktree, giving it an isolated copy of the repository.

## Writing the prompt

Brief the agent like a smart colleague who just walked into the room — it hasn't seen this conversation, doesn't know what you've tried, doesn't understand why this task matters.
- Explain what you're trying to accomplish and why.
- Describe what you've already learned or ruled out.
- Give enough context about the surrounding problem that the agent can make judgment calls rather than just following a narrow instruction.
- If you need a short response, say so ("report in under 200 words").
- Lookups: hand over the exact command. Investigations: hand over the question — prescribed steps become dead weight when the premise is wrong.

Terse command-style prompts produce shallow, generic work.

**Never delegate understanding.** Don't write "based on your findings, fix the bug" or "based on the research, implement it." Those phrases push synthesis onto the agent instead of doing it yourself. Write prompts that prove you understood: include file paths, line numbers, what specifically to change.
```

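**Analysis**: this description shows how Claude Code **teaches agent orchestration through a tool description**:

1. **An explicit list of "when not to use"**: this is counterintuitive but very important. The description says, in effect, "when you want to read a file, do not use Agent; use the Read tool". This keeps the model from reaching for a heavyweight tool (creating a sub-agent) to do a lightweight job (reading one file). It is the equivalent of printing "not suitable for small tasks" in a tool's manual.
2. **The "Writing the prompt" subsection**: the description embeds a guide to writing prompts for sub-agents, including the "brief the agent like a smart colleague" image and precise warnings such as "Never delegate understanding". This is a **meta-prompt**: a prompt that tells the AI how to write prompts.
3. **Special handling for fork mode** (lines 80–97): when `isForkSubagentEnabled()` is true, a "When to fork" section is added to the description, guiding the AI on when to fork itself (inheriting context) instead of creating a brand-new sub-agent. It is a typical case of injecting behavioral rules dynamically through a feature flag.
4. **Concurrency instruction**: the description explicitly requires "If the user specifies that they want you to run agents 'in parallel', you MUST send a single message with multiple Agent tool use content blocks". Here a protocol-level convention (how parallelism is expressed at the API level) has been written into the tool description.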

---

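### Case 3: EnterPlanMode's decision tree for when to plan

**Source location**: `src/tools/EnterPlanModeTool/prompt.ts`, function `getEnterPlanModeToolPromptExternal()` (lines 16–99)

The EnterPlanMode description is a complete **decision manual for entering plan mode**. It teaches the AI in which situations it should enter plan mode instead of starting to code right away.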

```
Use this tool proactively when you're about to start a non-trivial implementation task. Getting user sign-off on your approach before writing code prevents wasted effort and ensures alignment. This tool transitions you into plan mode where you can explore the codebase and design an implementation approach for user approval.

## When to Use This Tool

**Prefer using EnterPlanMode** for implementation tasks unless they're simple. Use it when ANY of these conditions apply:

1. **New Feature Implementation**: Adding meaningful new functionality
   - Example: "Add a logout button" - where should it go? What should happen on click?
   - Example: "Add form validation" - what rules? What error messages?

2. **Multiple Valid Approaches**: The task can be solved in several different ways
   - Example: "Add caching to the API" - could use Redis, in-memory, file-based, etc.
   - Example: "Improve performance" - many optimization strategies possible

3. **Code Modifications**: Changes that affect existing behavior or structure
   - Example: "Update the login flow" - what exactly should change?
   - Example: "Refactor this component" - what's the target architecture?

4. **Architectural Decisions**: The task requires choosing between patterns or technologies
   - Example: "Add real-time updates" - WebSockets vs SSE vs polling
   - Example: "Implement state management" - Redux vs Context vs custom solution

5. **Multi-File Changes**: The task will likely touch more than 2-3 files
   - Example: "Refactor the authentication system"
   - Example: "Add a new API endpoint with tests"

6. **Unclear Requirements**: You need to explore before understanding the full scope
   - Example: "Make the app faster" - need to profile and identify bottlenecks
   - Example: "Fix the bug in checkout" - need to investigate root cause

7. **User Preferences Matter**: The implementation could reasonably go multiple ways
   - If you would use AskUserQuestion to clarify the approach, use EnterPlanMode instead
   - Plan mode lets you explore first, then present options with context

## When NOT to Use This Tool

Only skip EnterPlanMode for simple tasks:
- Single-line or few-line fixes (typos, obvious bugs, small tweaks)
- Adding a single function with clear requirements
- Tasks where the user has given very specific, detailed instructions
- Pure research/exploration tasks (use the Agent tool with explore agent instead)

## What Happens in Plan Mode

In plan mode, you'll:
1. Thoroughly explore the codebase using Glob, Grep, and Read tools
2. Understand existing patterns and architecture
3. Design an implementation approach
4. Present your plan to the user for approval
5. Use AskUserQuestion if you need to clarify approaches
6. Exit plan mode with ExitPlanMode when ready to implement

## Examples

### GOOD - Use EnterPlanMode:
User: "Add user authentication to the app"
- Requires architectural decisions (session vs JWT, where to store tokens, middleware structure)

User: "Optimize the database queries"
- Multiple approaches possible, need to profile first, significant impact

User: "Implement dark mode"
- Architectural decision on theme system, affects many components

User: "Add a delete button to the user profile"
- Seems simple but involves: where to place it, confirmation dialog, API call, error handling, state updates

User: "Update the error handling in the API"
- Affects multiple files, user should approve the approach

### BAD - Don't use EnterPlanMode:
User: "Fix the typo in the README"
- Straightforward, no planning needed

User: "Add a console.log to debug this function"
- Simple, obvious implementation

User: "What files handle routing?"
- Research task, not implementation planning

## Important Notes

- This tool REQUIRES user approval - they must consent to entering plan mode
- If unsure whether to use it, err on the side of planning - it's better to get alignment upfront than to redo work
- Users appreciate being consulted before significant changes are made to their codebase
```

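**Analysis**: The EnterPlanMode description reveals an important design choice by Anthropic: **rather than relying on the AI's "natural judgment" to decide when to plan, it hard-codes the decision rules into the tool description**.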

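1. **The directive tone of "Prefer using EnterPlanMode"**: Note the first sentence of the description, "Use this tool **proactively**". This is an instruction about the AI's initiative, not a passive description of a capability. The tool description teaches the AI that the default is to plan first, not to start coding right away.
2. **Seven trigger rules + four exclusion rules**: Taken together, these rules are essentially a decision tree. They turn "should I enter plan mode?" from a matter of the model's free discretion into **matching** against rules. This reduces uncertainty in model behavior: different AI instances facing the same task should reach the same plan / don't-plan decision.
3. **Different prompts for internal and external users** (`getEnterPlanModeToolPromptAnt()`, lines 101–163): Internal users (ant) see a **leaner version that is stricter about when planning is warranted**: "When in doubt, prefer starting work and using AskUserQuestion for specific questions over entering a full planning phase". The internal version places more trust in the AI's judgment, while the external version leans toward mandatory planning. This reflects how Anthropic weighs planning overhead across user scenarios: external users bring more varied tasks, so planning is worth more; internal expert users tend to bring well-defined tasks, where planning may be wasted effort.
4. **The deeper meaning of "Use this tool proactively"**: Writing "proactively" into a tool description means the AI should trigger the tool on its own even when the user has not asked for it. This is the most direct expression of the "tool description = behavioral instruction" pattern in Claude Code's architecture: it is not the system prompt saying "you must plan", but the tool itself saying "use me in these situations".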
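
The rule matching described in point 2 can be sketched as a small function. This is an illustration only: the `TaskTraits` fields, the `decidePlanMode` name, the exclusions-first ordering, and the "more than 3 files" cutoff are all assumptions made for this example, not code from Claude Code. In the real system the rules live in prompt text and the model applies them itself.

```typescript
// Hypothetical sketch: every name here is an assumption, not Claude Code source.
interface TaskTraits {
  newFeature: boolean;
  multipleApproaches: boolean;
  changesExistingBehavior: boolean;
  architecturalDecision: boolean;
  estimatedFilesTouched: number;
  unclearRequirements: boolean;
  userPreferencesMatter: boolean;
  fewLineFix: boolean;
  singleClearFunction: boolean;
  detailedInstructions: boolean;
  pureResearch: boolean;
}

interface Decision {
  plan: boolean;
  reason: string;
}

type Rule = [boolean, string]; // [did the rule match?, human-readable reason]

function decidePlanMode(t: TaskTraits): Decision {
  // The four exclusion rules ("When NOT to Use This Tool").
  // Checking them first is an assumption; the prompt does not define precedence.
  const exclusions: Rule[] = [
    [t.fewLineFix, "single-line or few-line fix"],
    [t.singleClearFunction, "single function with clear requirements"],
    [t.detailedInstructions, "very specific, detailed instructions"],
    [t.pureResearch, "pure research/exploration"],
  ];
  // The seven trigger rules ("When to Use This Tool"); ANY match is enough.
  const triggers: Rule[] = [
    [t.newFeature, "new feature implementation"],
    [t.multipleApproaches, "multiple valid approaches"],
    [t.changesExistingBehavior, "code modification"],
    [t.architecturalDecision, "architectural decision"],
    [t.estimatedFilesTouched > 3, "multi-file change"],
    [t.unclearRequirements, "unclear requirements"],
    [t.userPreferencesMatter, "user preferences matter"],
  ];

  const excluded = exclusions.find(([matched]) => matched);
  if (excluded) return { plan: false, reason: excluded[1] };

  const triggered = triggers.find(([matched]) => matched);
  if (triggered) return { plan: true, reason: triggered[1] };

  // Mirrors the prompt's note: "If unsure ... err on the side of planning".
  return { plan: true, reason: "unsure, so err on the side of planning" };
}

// Usage, loosely modeled on two of the prompt's own examples.
const simple: TaskTraits = {
  newFeature: false,
  multipleApproaches: false,
  changesExistingBehavior: false,
  architecturalDecision: false,
  estimatedFilesTouched: 1,
  unclearRequirements: false,
  userPreferencesMatter: false,
  fewLineFix: false,
  singleClearFunction: false,
  detailedInstructions: false,
  pureResearch: false,
};
const typoFix = decidePlanMode({ ...simple, fewLineFix: true });
const addAuth = decidePlanMode({
  ...simple,
  newFeature: true,
  architecturalDecision: true,
  estimatedFilesTouched: 5,
});
console.log(typoFix, addAuth);
```

The sketch makes one gap in the prompt visible: it does not say what happens when an exclusion and a trigger both match, so the model still has to resolve such conflicts itself.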

---

**What the three cases have in common**:

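| Tool | Description type | Core design intent |
|------|------------------|--------------------|
| BashTool Git instructions | Operating procedure (steps + prohibitions) | Prevent high-risk, data-destroying operations from being run by mistake |
| AgentTool | Meta-prompt (how to write the prompt for a sub-agent) | Teach the AI how to delegate work to sub-agents effectively |
| EnterPlanMode | Decision tree (when to trigger planning) | Standardize the criteria for "plan vs. execute directly" |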

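These three cases show that a "tool" in Claude Code is not just a container for a capability but a **carrier of behavioral norms**. Anthropic takes many instructions that could have lived in the system prompt, splits them up, and binds each to the description of the most relevant tool. The cost is that tool descriptions become very long (BashTool's description has grown considerably). The benefits are that each norm is bound to the context of its tool, and that the instructions stay modular and easier to maintain.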

---

## 11. Design Trade-offs

### Strengths

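1. **A unified Tool interface** lets the 40 built-in tool directories and an unlimited number of MCP tools share one pipeline: registration, permissions, execution, and result handling are all reused
2. **AsyncGenerator return values** let a tool report progress continuously while it runs (the UI updates in real time) instead of returning only when it finishes
3. **The Edit/Write split** is precise engineering of AI behavior: most scenarios need only Edit (a local change), while Write (a full overwrite) is reserved for creating new files
4. **Deferred tool loading** pays off concretely in token economics: every schema left out saves a few hundred tokens of system prompt cost
5. **The existence of `renderResultForAssistant()`** shows that the team thought carefully about what result format the AI needs to see in order to make the best decisions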
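
Points 1, 2, and 5 can be seen working together in a minimal sketch. The shapes below (`SketchTool`, `ToolEvent`, `runTool`, and the toy `CountLines` tool) are hypothetical simplifications invented for this example. The real interface in `src/Tool.ts` is far larger, and only the method name `renderResultForAssistant` and the use of an AsyncGenerator are taken from the text above.

```typescript
// Hypothetical, simplified shapes; not the actual definitions in src/Tool.ts.
type ToolEvent<Output> =
  | { type: "progress"; message: string }
  | { type: "result"; data: Output };

interface SketchTool<Input, Output> {
  name: string;
  // Streams progress events while running, then one final result event.
  call(input: Input): AsyncGenerator<ToolEvent<Output>, void, unknown>;
  // Converts the raw output into the text the model will read.
  renderResultForAssistant(output: Output): string;
}

interface CountInput {
  chunks: string[];
}

// A toy tool body: counts lines chunk by chunk, reporting progress after each chunk.
async function* countLines(
  input: CountInput
): AsyncGenerator<ToolEvent<number>, void, unknown> {
  let total = 0;
  for (let i = 0; i < input.chunks.length; i++) {
    total += input.chunks[i].split("\n").length;
    yield { type: "progress", message: `chunk ${i + 1}/${input.chunks.length}` };
  }
  yield { type: "result", data: total };
}

const countLinesTool: SketchTool<CountInput, number> = {
  name: "CountLines",
  call: countLines,
  renderResultForAssistant: (n) => `Counted ${n} lines.`,
};

// One generic runner serves every tool that implements the interface.
async function runTool<I, O>(
  tool: SketchTool<I, O>,
  input: I,
  onProgress: (message: string) => void
): Promise<string> {
  for await (const ev of tool.call(input)) {
    if (ev.type === "progress") {
      onProgress(ev.message); // e.g. update the UI while the tool is still running
    } else {
      return tool.renderResultForAssistant(ev.data);
    }
  }
  throw new Error(`${tool.name} finished without a result`);
}

// Usage: progress arrives before the final, model-facing text.
runTool(countLinesTool, { chunks: ["a\nb", "c"] }, (m) => console.log("progress:", m))
  .then((text) => console.log(text));
```

Because the runner depends only on the interface, a built-in tool and an MCP-backed tool could flow through the same loop, which is the reuse that point 1 describes.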

### Costs

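1. **20+ interface methods** raise the bar for implementing a new tool, a classic trade-off between flexibility and ease of use
2. **Concurrency safety is self-declared**: if a tool author declares it incorrectly, the system will not detect it (and detection would be hard)
3. **MCP tools and built-in tools do not share exactly the same permission semantics**: built-in tools support finer-grained permission pattern matching (such as `Bash(git *)`), whereas MCP tools can only be controlled at the server level
4. **Large results are persisted as "write to disk + reference", not "smart summarization"**: results over the threshold are written in full to `~/.claude/tool-results/`, and the conversation itself keeps only a reference-style preview (the `<persisted-output>` tag). No information is lost, but in later turns the model sees the preview rather than the full text, which may reduce decision accuracy. GrowthBook can remotely adjust the persistence threshold for each tool (`toolResultStorage.ts`)
5. **A dynamic tool list** raises the probability of prompt cache misses: whenever the set of MCP servers changes, the cached tool schema portion is invalidated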
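
The "write to disk + reference" strategy in point 4 can be sketched as follows. Only the `<persisted-output>` tag name and the `~/.claude/tool-results/` directory come from the text above. The threshold, the preview length, the tag attributes, the file naming, and the function names are assumptions for illustration, and an in-memory `Map` stands in for the file system so the example stays self-contained.

```typescript
// Hypothetical values; the text says real thresholds are per-tool and remotely tunable.
const MAX_INLINE_CHARS = 200;
const PREVIEW_CHARS = 40;

// Storage is injected so the sketch needs no file-system access.
// It receives the full content and returns the path where it was stored.
type Persist = (id: string, content: string) => string;

function toModelVisibleResult(id: string, content: string, persist: Persist): string {
  // Small results go into the conversation unchanged.
  if (content.length <= MAX_INLINE_CHARS) return content;

  // Large results are stored in full; the conversation keeps a short preview plus
  // a reference. The attribute names on the tag are assumed, not the real format.
  const path = persist(id, content);
  const preview = content.slice(0, PREVIEW_CHARS);
  return (
    `<persisted-output path="${path}" chars="${content.length}">\n` +
    `${preview}\n` +
    `</persisted-output>`
  );
}

// Usage with an in-memory stand-in for the tool-results directory.
const disk = new Map<string, string>();
const persistToMap: Persist = (id, content) => {
  const path = `~/.claude/tool-results/${id}.txt`; // file naming is an assumption
  disk.set(path, content);
  return path;
};

const small = toModelVisibleResult("call-1", "ok", persistToMap);
const bigOutput = "x".repeat(1000);
const big = toModelVisibleResult("call-2", bigOutput, persistToMap);
console.log(small);
console.log(big);
```

The stored copy stays complete while the model-facing string is a truncated preview, which is the trade-off point 4 describes: nothing is lost on disk, but the model reasons over less than the full output unless it reads the file back.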

---

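> **[Figure placeholder 2.5-A]**: Panorama of the tool execution pipeline: the 7-step pipeline from AI request to returned result
> **[Figure placeholder 2.5-B]**: Distribution of the 40 built-in tool directories across four categories: file operations / execution engines / Agent family / extensions and utilities
> **[Figure placeholder 2.5-C]**: Concurrency safety matrix: which tool combinations can run in parallel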