# System Architecture in Five Minutes

This chapter is your map. It doesn't aim for every detail—just orientation.

> 📚 **Course Connection**: If you've studied operating systems, Claude Code's architecture is strikingly similar—it has its own "boot sequence" (startup flow), "system call table" (directory of 40 built-in tools), "process scheduler" (Agent orchestration), "permission system" (ten-step check chain), and "device driver interface" (MCP protocol). This chapter gives you the big picture in five minutes; Part 2 will dive into each subsystem one by one.

---

## The Core Execution Path (from pressing Enter to the AI responding)

```
① User sends a message
         ↓
② QueryEngine.submitMessage()
   Gathers system context (git status, CLAUDE.md, current date)
         ↓
③ queryLoop() (query.ts)
   Core while(true) loop:
   
   ┌───────────────────────────────────────────┐
   │ a. Call Anthropic API (stream mode)       │
   │ b. Read response stream                   │
   │    ├── text block → display to user       │
   │    └── tool_use block                     │
   │        └──→ StreamingToolExecutor.addTool │
   │             (starts immediately, without  │
   │              waiting for model to finish) │
   │ c. Model stops outputting                 │
   │ d. Collect tool execution results         │
   │ e. Build tool_result message              │
   │ f. Compress context if needed             │
   │ g. Go back to a for next round            │
   └───────────────────────────────────────────┘
         ↓
④ AI produces final answer (no tool calls)
         ↓
⑤ Background hooks run:
   SessionMemory extraction, Prompt Suggestion generation
         ↓
⑥ [Optional] Speculation starts to predict
   and execute the next round
```
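
In code, the loop's skeleton looks roughly like the sketch below. Every name here (`streamCompletion`, `ToolExecutor`, `compressIfNeeded`) is a hypothetical stand-in for orientation, not the actual Claude Code API:

```typescript
// A minimal sketch of the loop's shape; all helpers are hypothetical.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };
type Message = { role: "user" | "assistant"; content: string | Block[] };

declare function streamCompletion(msgs: Message[]): AsyncIterable<Block>;
declare function render(text: string): void;
declare function compressIfNeeded(msgs: Message[]): Promise<Message[]>;
declare class ToolExecutor {
  addTool(block: Block): void;          // starts executing right away
  collectResults(): Promise<Message[]>; // resolves once every tool finishes
}

async function queryLoopSketch(messages: Message[]): Promise<void> {
  while (true) {
    const executor = new ToolExecutor();
    // a/b. Stream the response, dispatching each block as it arrives.
    for await (const block of streamCompletion(messages)) {
      if (block.type === "text") render(block.text);
      else executor.addTool(block);     // tool_use → execute immediately
    }
    // c/d. The model stopped; wait for any tools still running.
    const toolResults = await executor.collectResults();
    if (toolResults.length === 0) return; // no tool calls → final answer (④)
    // e. Feed the tool_result messages back in for the next round.
    messages.push(...toolResults);
    // f. Compress the context if the conversation has grown too large.
    messages = await compressIfNeeded(messages);
    // g. Back to step a.
  }
}
```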

---

## Seven Core Abstractions

> 🌍 **Industry Context**: Claude Code's seven abstractions (Tool, ToolUseContext, AppState, QueryEngine, Task, Command, SystemPrompt) form a complete Agent runtime. **LangChain** uses fewer abstractions (Tool + Agent + Memory) but is less flexible. **Cursor** hasn't open-sourced its internal architecture, but its behavior suggests a VS Code extension model. **Aider** has the simplest architecture—its core consists of just three classes: Coder, Repo, and Model, with no AppState or Task concepts. Claude Code chose the "many abstractions, each with a single responsibility" path. This increases the learning curve, but makes every part of the system independently testable and replaceable.

> 💡 **Plain English**: From pressing Enter to getting an AI response, it's like a **package sorting conveyor belt**—receive (user input) → sorting center processing (`queryLoop` cycle: call API, execute tools, call API again) → final delivery (AI gives an answer). The belt keeps running until all packages for this batch are delivered.

### Tool — The Universal Interface for Every Tool

```typescript
interface Tool {
  name: string
  call(args, context, canUseTool): AsyncGenerator<ProgressMessage, ToolResult>
  checkPermissions(input, context): Promise<PermissionResult>
  isConcurrencySafe(input): boolean
  isReadOnly(input): boolean
  description(): Promise<string>
  // ... about 20 methods/properties
}
```

Every tool (Read, Edit, Bash, AgentTool, etc.) implements this interface. The interface defines all interaction points between a tool and the rest of the system: permissions, concurrency, description, UI rendering, lazy loading...
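
To make the shape concrete, here is what a minimal implementation might look like. The tool itself is invented, and the `ProgressMessage`/`ToolResult`/`PermissionResult` shapes are guesses; only the method names come from the interface above:

```typescript
// Hypothetical example: a tiny read-only tool against the interface above.
// The result and permission shapes are assumptions, not the real types.
type ProgressMessage = { text: string };
type ToolResult = { output: string };
type PermissionResult = { behavior: "allow" | "deny" | "ask" };

const CurrentTimeTool = {
  name: "CurrentTime",
  async description(): Promise<string> {
    return "Returns the current date and time.";
  },
  isReadOnly: (_input: unknown) => true,        // never writes anything
  isConcurrencySafe: (_input: unknown) => true, // safe to run in parallel
  async checkPermissions(): Promise<PermissionResult> {
    return { behavior: "allow" };               // read-only → always allowed
  },
  async *call(): AsyncGenerator<ProgressMessage, ToolResult> {
    // A long-running tool would `yield` progress messages here first.
    return { output: new Date().toISOString() };
  },
};
```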

### ToolUseContext — The "World" During Tool Execution

```typescript
type ToolUseContext = {
  getAppState(): AppState             // Read application state
  setAppState(updater)                // Modify application state
  messages: Message[]                 // Current conversation history
  abortController: AbortController    // Can be cancelled
  readFileState: FileStateCache       // File cache
  options: QueryOptions               // Model configuration, etc.
}
```

Every tool call carries this context. Child Agents have an independent `ToolUseContext`, where `setAppState` is a no-op (preventing child Agents from modifying the parent process state).
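
A sketch of how such a child context might be derived (the helper is hypothetical; the one detail taken from the text is the no-op `setAppState`):

```typescript
// Hypothetical helper: derive a child Agent's context from the parent's.
// setAppState becomes a no-op, so the child can read but never mutate
// the parent's AppState.
function forkContextForChildAgent(parent: ToolUseContext): ToolUseContext {
  return {
    ...parent,
    messages: [],                            // the child starts its own history
    abortController: new AbortController(),  // independently cancellable
    setAppState: () => {},                   // no-op: writes are swallowed
  };
}
```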

### AppState — Global State Snapshot

The single source of truth for system runtime state, including:
- Current permission context (`toolPermissionContext`)
- MCP connection list (`mcp.clients`)
- Running tasks (`tasks`)
- Speculation state (`speculation`)
- User settings (`settings`)

Managed via a minimal Redux-like `createStore`, consumed by the React component tree and maintained only on the main thread.
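
Written as a type, that list gives roughly this shape; the field names come from above, the value types are placeholders:

```typescript
// Rough shape of AppState; field names from the text, value types assumed.
type AppState = {
  toolPermissionContext: unknown;  // current permission rules and mode
  mcp: { clients: unknown[] };     // live MCP connections
  tasks: unknown[];                // running child-Agent tasks (TaskState)
  speculation: unknown;            // state of speculative execution
  settings: unknown;               // user settings
};
```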

### QueryEngine — The Session Container

Each conversation session has one `QueryEngine` instance, holding:
- Message history
- System prompt (including tool list, git status, CLAUDE.md)
- Model configuration
- Entry point for submitting messages (`submitMessage()`)
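
As a sketch (only `submitMessage()` is named in the text; the field names are invented for illustration):

```typescript
// Hypothetical sketch of the session container.
declare class QueryEngine {
  readonly messages: unknown[];   // message history
  readonly systemPrompt: unknown; // tool list, git status, CLAUDE.md
  readonly modelConfig: unknown;  // model choice, sampling options, ...
  submitMessage(text: string): Promise<void>; // entry point for one message
}
```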

### Task — The Proxy Object for Child Agents

When a child Agent is running, the main thread holds a `TaskState` object (stored in `AppState.tasks`), containing the Agent's state, output stream, abort controller, etc.
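
Roughly the following shape; the status values and stream element type are assumptions:

```typescript
// Rough shape of TaskState based on the description above.
type TaskState = {
  status: "running" | "done" | "aborted"; // the Agent's state
  outputStream: AsyncIterable<unknown>;   // the child Agent's output
  abortController: AbortController;       // lets the main thread cancel it
};
```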

### Command — Slash Commands

Commands are triggered when the user types a slash command such as `/commit` or `/help`. Unlike tools (which are called by the AI), commands are **invoked directly by the user** and follow an independent handling path. The system has about 40 built-in commands.
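
The dispatch split might look like this sketch; all names are hypothetical:

```typescript
// Sketch: commands are dispatched from user input, never by the model.
declare const commands: Map<string, { run(raw: string): Promise<void> }>;
declare const engine: { submitMessage(text: string): Promise<void> };

function handleInput(raw: string): void {
  if (raw.startsWith("/")) {
    const name = raw.slice(1).split(" ")[0]; // "/commit -m ..." → "commit"
    void commands.get(name)?.run(raw);       // user-invoked command path
  } else {
    void engine.submitMessage(raw);          // ordinary message → agent loop
  }
}
```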

### SystemPrompt — The Complete Context Package Sent to the AI

Not a single string, but a structured object, including:
- Tool description list
- User context (CLAUDE.md, current date)
- System context (git status)
- Coordinator prompt (if in coordinator mode)
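
As a type sketch, with placeholder value types; only the field list comes from the text:

```typescript
// Rough shape of the structured system-prompt object.
type SystemPrompt = {
  toolDescriptions: string[]; // one entry per available tool
  userContext: string;        // CLAUDE.md contents, current date, ...
  systemContext: string;      // git status and similar environment info
  coordinatorPrompt?: string; // present only in coordinator mode
};
```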

---

## Three Layers of Permission Control

```
Layer 1: Rules (static configuration)
  always-allow / always-deny / always-ask
  From settings.json, CLAUDE.md configuration

Layer 2: Tool self-check (dynamic)
  Each tool's checkPermissions() validates current input
  E.g., FileEditTool checks if the path is in a sensitive
  directory like .git/

Layer 3: User confirmation (interactive)
  If the above don't give a definitive answer,
  a confirmation dialog pops up
  In auto mode, an AI classifier replaces user confirmation
```
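
A sketch of how the three layers might compose, with every helper a hypothetical stand-in. Each layer can short-circuit; only an inconclusive result falls through to the next:

```typescript
// Sketch only; all helpers are hypothetical stand-ins.
declare function lookupRule(tool: string, input: unknown): "allow" | "deny" | undefined;
declare function askUser(tool: string, input: unknown): Promise<boolean>;
declare function aiClassifierApproves(tool: string, input: unknown): Promise<boolean>;
declare const autoMode: boolean;

type ToolLike = {
  name: string;
  checkPermissions(input: unknown): Promise<{ behavior: "allow" | "deny" | "ask" }>;
};

async function decidePermission(tool: ToolLike, input: unknown): Promise<boolean> {
  // Layer 1: static rules from settings.json / CLAUDE.md
  const rule = lookupRule(tool.name, input);
  if (rule) return rule === "allow";

  // Layer 2: the tool validates its own input (e.g. paths under .git/)
  const self = await tool.checkPermissions(input);
  if (self.behavior !== "ask") return self.behavior === "allow";

  // Layer 3: interactive confirmation, or an AI classifier in auto mode
  return autoMode ? aiClassifierApproves(tool.name, input) : askUser(tool.name, input);
}
```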

---

## Two Tool Execution Modes

**Traditional path (`runTools`)**:
Wait for the AI to finish its output → group tool calls into batches by concurrency safety → execute (read-only tools in parallel, write operations serially).

**Streaming path (`StreamingToolExecutor`)**:
Each time the AI emits a complete tool_use block → start executing it immediately.
By the time the model stops outputting, most tools have already finished.

> 📚 **Course Connection**: The traditional path is classic **batch processing**; the streaming path is **stream processing**—exactly analogous to the MapReduce (batch) vs. Apache Flink (stream) comparison in big data courses. The core value of the streaming path is **reducing latency**: instead of waiting for all data to arrive before processing, it processes each piece of data as soon as it arrives.
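
The core of the streaming idea fits in a few lines. This is a sketch with hypothetical names, not the real `StreamingToolExecutor`:

```typescript
// Sketch: each tool starts the moment its tool_use block is complete,
// rather than after the whole response has arrived.
declare function executeTool(block: { name: string; input: unknown }): Promise<unknown>;

class StreamingExecutorSketch {
  private running: Promise<unknown>[] = [];

  addTool(block: { name: string; input: unknown }): void {
    // Start immediately; don't wait for the model to stop outputting.
    this.running.push(executeTool(block));
  }

  async collectResults(): Promise<unknown[]> {
    // By the time the model finishes, most of these are already resolved.
    return Promise.all(this.running);
  }
}
```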

---

## Context Compression's Six Layers of Defense

> 💡 **Plain English**: The six layers of compression are like a **meeting note-taker's tiered processing strategy**—first fold extra-long attachments → then delete useless pages → then remove duplicates → then summarize paragraphs → then create a full-text executive summary → finally emergency compression. From light to heavy, the first three steps are enough most of the time.

(See Chapter Q02 for details)

```
Token consumption: ─────────────────────────────────────→
           toolResultBudget → snipCompact → microcompact
           → contextCollapse → autocompact → reactiveCompact
```

Each layer triggers at a different token consumption threshold, progressively moving from "discard tool results" to "full conversation summary".
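
A sketch of that escalation, with invented thresholds; only the layer names come from the text:

```typescript
// Layers ordered lightest to heaviest; the threshold numbers are invented.
const compressionLayers: Array<[threshold: number, name: string]> = [
  [0.50, "toolResultBudget"], // trim oversized tool results
  [0.60, "snipCompact"],      // drop low-value spans
  [0.70, "microcompact"],     // deduplicate and micro-summarize
  [0.80, "contextCollapse"],  // collapse older exchanges
  [0.90, "autocompact"],      // full-conversation summary
  [0.95, "reactiveCompact"],  // emergency compression
];

// The heaviest layer whose threshold has been crossed applies.
function activeLayer(usedFraction: number): string | undefined {
  return compressionLayers.filter(([t]) => usedFraction >= t).at(-1)?.[1];
}
```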

---

## The One Fact You Need to Know Most

**Every design decision in this system is trying to answer the same question:**

> How do we make AI work reliably, efficiently, and safely on real engineering tasks?

All the complexity—multi-layer permissions, streaming parallelism, context compression, speculation—is an answer to some dimension of that question.

When you see something in the code and wonder "why is it like this?", ask yourself first: which dimension of the problem is it solving?

---

## Key Source Code Entry Points

- `src/main.tsx`: Startup entry—Pre-import preloading + initialization sequence + React rendering
- `src/services/api/claude.ts`: Core of the `queryLoop`—the `query()` function drives the Agent Loop
- `src/tools/tools.ts`: Tool registry—the single place where all 40 built-in tools are registered
- `src/utils/permissions/permissions.ts`: Permission system—complete implementation of the 10-step state machine

## Industry Consensus Framework: Model + Runtime + Harness

> 📚 **Industry Consensus**: Harrison Chase, founder of LangChain, proposed a three-layer framework that is now widely accepted in the industry:
> - **Model Layer** (the brain)—Foundational large model capabilities (Claude, GPT, Gemini...), responsible for "thinking"
> - **Runtime Layer** (the nervous system)—Call orchestration, tool routing, context management, responsible for "transmitting signals"
> - **Harness Layer** (the body + senses + limbs)—The complete product experience facing the user, responsible for "perceiving the environment and executing actions"
>
> 💡 **Plain English**: Just as the same engine (Model) installed in different chassis (Harness) becomes a sports car, a truck, or an ambulance—Claude Code, Cursor, and Windsurf all use the same Claude brain, but their "chassis" designs are completely different, and that's what determines each product's user experience. This book disassembles exactly how Claude Code's "chassis" is built.

If you map the seven core abstractions introduced in this chapter against this framework, you'll find they almost all fall in the Harness layer: the Tool interface defines the boundary of user-perceivable tool capabilities, QueryEngine orchestrates the core loop of the user experience, AppState manages product-level global state, and the permission system shapes the user's sense of security. The Model layer (the Claude model itself) and the Runtime layer (API calls, streaming transport) are just "plumbing" in this chapter—the real engineering complexity and product differentiation are all in the Harness layer.

---

## The Limits of This Map

A five-minute overview inevitably has limits. This map omits a vast number of edge cases and trade-offs—such as fallback strategies for streaming tool execution under network instability, the risk that context compression drops critical information, and the possibility that the AI classifier makes a wrong permission call in auto mode. If you finish this chapter thinking "I get it," that is precisely a sign you need to keep reading. Complexity doesn't hide in the happy path—it lurks in exception paths, concurrency conflicts, and security boundaries.
