s12: Task System — Break Big Goals into Small Tasks


"Break big goals into small tasks, order them, persist" — File-persisted task graph, the foundation for multi-agent collaboration.

Harness Layer: Tasks — Persisted goals, recoverable progress.

The Problem

The agent receives a project: set up a database, write APIs, add tests. It uses s05's TodoWrite to create a checklist, then starts with the API. Halfway through, it realizes there are no database tables and goes back to create them. Later, while adding tests, it discovers that the API signatures have changed again...

You can't build the roof before laying the foundation. Tasks have ordering. Task dependencies should form a Directed Acyclic Graph (DAG); the teaching version only demonstrates blockedBy checking, without cycle detection.
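Since the teaching version skips cycle detection, here is a minimal sketch of what such a check could look like. It assumes the graph is available as a plain dict mapping each task ID to its blockedBy list; the `find_cycle` name and that dict shape are illustrative and not part of the teaching code.

```python
from typing import Optional


def find_cycle(blocked_by: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one dependency cycle as a list of task IDs, or None if the graph is a DAG.

    blocked_by maps a task ID to the IDs it depends on (hypothetical shape).
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on the current path / fully explored
    color: dict[str, int] = {}
    path: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in blocked_by.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:  # back edge: dep is already on the current path
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for task_id in blocked_by:
        if color.get(task_id, WHITE) == WHITE:
            cycle = visit(task_id)
            if cycle:
                return cycle
    return None
```

A check like this could run before a task is saved, rejecting any blockedBy list that would make the task (transitively) depend on itself.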

s05's TodoWrite is an execution checklist for the current task, kept in session memory. What you need here is a task system: each task is a JSON file, tasks have blockedBy dependencies, and they persist across sessions on disk.

The Solution

[Figure: Task System Overview]

The teaching code keeps a basic agent loop and omits s11's full error recovery (RecoveryState, backoff, escalation, reactive compact, fallback model) to stay focused on the task system. What it adds: 5 new task tools, a .tasks/ directory for persistence, and blockedBy dependency checking. The task system and error recovery are independent layers: in CC source, utils/tasks.ts only handles CRUD, while query.ts's with_retry/RecoveryState handles error recovery, with no coupling between them.

TodoWrite vs Task System:

| Aspect | TodoWrite (s05) | Task System (s12) |
| --- | --- | --- |
| Role | Execution checklist for the current task | Recoverable task system |
| Storage | In-process / session state | `.tasks/{id}.json` |
| Dependencies | None | `blockedBy` / `blocks` graph |
| Lifecycle | Current session / current task | Cross-session |
| Coordination | No task claiming | `owner` / claim |
| Status | pending / in_progress / completed | pending / in_progress / completed |
| Granularity | The agent's own steps | Tasks that can be claimed, tracked, and unblocked |

How It Works

[Figure: Task DAG]

Task: Data Structure

Each task is a JSON file, stored in the .tasks/ directory:

from dataclasses import dataclass

@dataclass
class Task:
    id: str
    subject: str              # Short title
    description: str          # Free-form details
    status: str               # pending | in_progress | completed
    owner: str | None         # Agent name (multi-agent scenarios)
    blockedBy: list[str]      # IDs of tasks that must complete first

IDs combine a Unix timestamp with a random hex suffix, which is simple but sufficient here. CC instead uses sequential IDs plus a highwatermark file to prevent ID reuse, which is a more rigorous design.
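The create_task code in the next section calls a random_hex helper that is not shown. One plausible version is sketched here, assuming its argument means "number of hex characters" (it could equally mean bytes; the source does not say). `new_task_id` is a hypothetical wrapper added only for illustration.

```python
import secrets
import time


def random_hex(n: int) -> str:
    """Return n random hex characters (assumed meaning of the argument)."""
    # token_hex(k) yields 2*k hex characters, so round up and trim.
    return secrets.token_hex((n + 1) // 2)[:n]


def new_task_id() -> str:
    """Build an ID in the task_{timestamp}_{hex} shape used by create_task."""
    return f"task_{int(time.time())}_{random_hex(4)}"
```

Two tasks created within the same second differ only in the random suffix, which is why this scheme is called sufficient rather than rigorous.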

create_task: Create Tasks

def create_task(subject: str, description: str = "",
                blockedBy: list[str] | None = None) -> Task:
    task = Task(
        id=f"task_{int(time.time())}_{random_hex(4)}",
        subject=subject, description=description,
        status="pending", owner=None,
        blockedBy=blockedBy or [],
    )
    save_task(task)
    return task

create_task calls save_task before returning, which writes .tasks/{id}.json. blockedBy declares dependencies: for example, the "write API" task has blockedBy: ["task_schema"].

can_start: Dependency Check

A task can only start after all its blockedBy dependencies are completed:

def can_start(task_id: str) -> bool:
    task = load_task(task_id)
    for dep_id in task.blockedBy:
        if not _task_path(dep_id).exists():
            return False  # missing dependency = blocked
        dep = load_task(dep_id)
        if dep.status != "completed":
            return False
    return True

can_start is a prerequisite check for claim_task: if any blockedBy dependency is not completed, the task cannot be claimed. Missing dependencies are treated as blocked, avoiding crashes from referencing wrong IDs.

claim_task: Claim a Task

When the agent starts working on a task, it calls claim_task: sets owner, changes status from pending → in_progress. The owner field records who is working on the task, preventing duplicate claims in multi-agent scenarios:

def claim_task(task_id: str, owner: str = "agent") -> str:
    task = load_task(task_id)
    if task.status != "pending":
        return f"Task {task_id} is {task.status}, cannot claim"
    if not can_start(task_id):
        # A missing dependency counts as a blocker; check the path first
        # so load_task is never called on a file that does not exist.
        deps = [d for d in task.blockedBy
                if not _task_path(d).exists()
                or load_task(d).status != "completed"]
        return f"Blocked by: {deps}"
    task.owner = owner
    task.status = "in_progress"
    save_task(task)
    return f"Claimed {task_id} ({task.subject})"

If the task is no longer pending (already claimed by someone else, or already completed), or its dependencies aren't met (can_start returns False), the claim is rejected.

complete_task: Complete and Unblock

When a task is done, set it to completed, then scan the other tasks to find downstream tasks that this completion just unblocked:

def complete_task(task_id: str) -> str:
    task = load_task(task_id)
    task.status = "completed"
    save_task(task)
    # Find downstream tasks that depended on this one and can now start.
    # Filtering on task_id keeps unrelated, already-startable tasks out of the list.
    unblocked = [t.subject for t in list_tasks()
                 if t.status == "pending" and task_id in t.blockedBy
                 and can_start(t.id)]
    msg = f"Completed {task_id} ({task.subject})"
    if unblocked:
        msg += f"\nUnblocked: {', '.join(unblocked)}"
    return msg

In the example under Putting It Together, once "setup database schema" is completed, can_start returns True for "create API endpoints" and "write docs", so both can begin.

get_task: View Full Details

list_tasks only shows a one-line summary. get_task returns the full task JSON, including description and dependency details. When recovering across sessions, the agent needs to read the full description to continue work:

def get_task(task_id: str) -> str:
    task = load_task(task_id)
    return json.dumps(asdict(task), indent=2)

State Machine: Two Actions, Three States

pending ──claim──→ in_progress ──complete──→ completed

Here claim / complete are actions, while pending / in_progress / completed are states:

claim_task: pending → in_progress. Sets owner, begins work.
complete_task: in_progress → completed. Marks the task done and unblocks downstream.

The teaching version has no in_progress → pending release path. CC does have one: if a teammate terminates or shuts down, CC unassigns its unfinished tasks (clears owner) and resets their status to pending, allowing other agents to reclaim them.
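The release-on-shutdown behavior described above could be sketched as follows. This is an illustration under stated assumptions, not CC's implementation: `release_tasks_of` is a hypothetical name, and tasks are held in a plain dict rather than read from .tasks/ so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    id: str
    subject: str
    status: str = "pending"  # pending | in_progress | completed
    owner: Optional[str] = None


def release_tasks_of(owner: str, tasks: dict[str, Task]) -> list[str]:
    """Return every unfinished task held by `owner` to the pending pool.

    Completed tasks keep their owner; everything else is unassigned.
    """
    released = []
    for task in tasks.values():
        if task.owner == owner and task.status != "completed":
            task.owner = None
            task.status = "pending"
            released.append(task.id)
    return released
```

In a file-backed version, each released task would also need to be saved so that other agents see it as claimable again.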

Putting It Together

# Create tasks with dependencies
schema = create_task("setup database schema")
endpoints = create_task("create API endpoints", blockedBy=[schema.id])
tests = create_task("write tests", blockedBy=[endpoints.id])
docs = create_task("write docs", blockedBy=[schema.id])

# Agent claims the first available task
claim_task(schema.id)       # ✓ Claimed (no dependencies)
complete_task(schema.id)    # ✓ Completed → unblocks endpoints, docs

claim_task(endpoints.id)    # ✓ Claimed (schema completed)
complete_task(endpoints.id) # ✓ Completed → unblocks tests

claim_task(docs.id)         # ✓ Claimed (schema completed)
complete_task(docs.id)      # ✓ Completed

claim_task(tests.id)        # ✓ Claimed (endpoints completed)
complete_task(tests.id)     # ✓ Completed

Each create_task writes a JSON file, each claim_task / complete_task updates the file. Across sessions, the .tasks/ directory persists — the agent reads the files to recover progress.
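Recovery in a fresh session then amounts to reading the directory back. A rough sketch follows, assuming the JSON layout of the Task dataclass shown earlier; `summarize_progress` is a hypothetical helper, not one of the five tools.

```python
import json
from pathlib import Path


def summarize_progress(tasks_dir: Path) -> dict[str, list[str]]:
    """Group task subjects into completed / in_progress / ready / blocked from files on disk."""
    tasks = [json.loads(p.read_text()) for p in sorted(tasks_dir.glob("*.json"))]
    completed = {t["id"] for t in tasks if t["status"] == "completed"}
    summary: dict[str, list[str]] = {
        "completed": [], "in_progress": [], "ready": [], "blocked": [],
    }
    for t in tasks:
        if t["status"] in ("completed", "in_progress"):
            summary[t["status"]].append(t["subject"])
        elif all(dep in completed for dep in t.get("blockedBy", [])):
            summary["ready"].append(t["subject"])    # pending, every dependency completed
        else:
            summary["blocked"].append(t["subject"])  # pending, dependency unfinished or missing
    return summary
```

The "in_progress" bucket is what a resuming agent would look at first, since those tasks were claimed but never finished.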

Changes from s11

| Component | Before (s11) | After (s12) |
| --- | --- | --- |
| Task management | None | Task dataclass + 5 tools |
| New types | None | Task (id, subject, description, status, owner, blockedBy) |
| Storage | No persistence | `.tasks/{id}.json` cross-session |
| Dependencies | None | `blockedBy` graph + `can_start` check |
| Tools | bash, read_file, write_file (3) | + create_task, list_tasks, get_task, claim_task, complete_task (8) |
| Lifecycle | None | pending → in_progress → completed (no release rollback) |

Try It

cd learn-claude-code
python s12_task_system/code.py

Try these prompts:

Create tasks: setup database schema, create API endpoints (depends on schema), write tests (depends on endpoints), write docs (depends on schema)
List all tasks and their statuses
Claim the first unblocked task and complete it
List tasks again — which ones are now unblocked?

What to observe: Are JSON files generated in the .tasks/ directory? After completing a task, are the blocked tasks unblocked?

What's Next

The task graph is in place. But some tasks take a long time, such as running a full test suite or deploying to a server. Every LLM call is billed by token, so the agent can't afford to sit waiting on a slow operation.

s13 Background Tasks → Slow operations go to the background. The agent continues processing other tasks, and gets notified when the background work is done.

Deep Dive into CC Source

The following is a complete analysis based on CC source code utils/tasks.ts (862 lines), tools/TaskCreateTool/TaskCreateTool.ts (138 lines), tools/TaskUpdateTool/TaskUpdateTool.ts (406 lines), tools/TaskGetTool/TaskGetTool.ts (128 lines), tools/TaskListTool/TaskListTool.ts (116 lines), hooks/useTaskListWatcher.ts (221 lines).

1. TaskRecord's Full Fields

The tutorial's Task covers only id, subject, description, status, owner, and blockedBy. CC actually has 9 fields (utils/tasks.ts:76-89):

| Field | Type | Purpose |
| --- | --- | --- |
| `id` | string | Incrementing integer ID |
| `subject` | string | Short title |
| `description` | string | Free-form description |
| `activeForm` | string? | Present tense form, shown in spinner when in_progress |
| `owner` | string? | Assigned agent ID |
| `status` | pending/in_progress/completed | Lifecycle |
| `blocks` | string[] | Task IDs blocked by this task (downstream) |
| `blockedBy` | string[] | Task IDs blocking this task (upstream) |
| `metadata` | Record? | Arbitrary extension key-value pairs |

Storage location: ~/.claude/tasks/{taskListId}/{id}.json. One file per task.

2. Not a TodoWrite Upgrade — Two Independent Systems

In CC, Task System and TodoWrite coexist, toggled by isTodoV2Enabled() (utils/tasks.ts:133) — interactive sessions default to Task (V2), non-interactive/SDK sessions default to TodoWrite. The CLAUDE_CODE_ENABLE_TASKS env var can force-enable Task. Task has what TodoWrite lacks: file-lock concurrency protection, dependency enforcement, ownership, fs.watch reactive monitoring, lifecycle hooks.

3. Concurrent Claim Locking

claimTask() (utils/tasks.ts:541-612) uses dual locking to prevent races:

Task file lock: proper-lockfile locks {taskId}.json (up to 30 retries, exponential backoff 5-100ms). Inside the lock:

Re-read task (prevent TOCTOU)
Check already claimed by another → already_claimed
Check already completed → already_resolved
Check upstream not completed → blocked
Set owner

List-level lock (agent busy check): .lock file, atomic scan of all tasks to check if the agent already has other open tasks.

Note: The teaching version combines claiming and starting work into one step (claim = set owner + in_progress); real CC's claimTask primarily resolves owner competition — it only sets owner without changing status. Status updates are handled by TaskUpdate.
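The in-lock sequence above can be approximated in Python. CC uses proper-lockfile, a Node library; this sketch swaps in an exclusive-create lock file instead, so it illustrates the idea rather than porting CC's code. The retry numbers mirror those quoted above, the "ok" return value is an assumption, and the list-level lock is omitted.

```python
import json
import os
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def file_lock(target: Path, retries: int = 30,
              min_wait: float = 0.005, max_wait: float = 0.1):
    """Hold an exclusive lock file next to `target`, retrying with exponential backoff."""
    lock_path = Path(str(target) + ".lock")
    wait = min_wait
    for attempt in range(retries + 1):
        try:
            # O_EXCL makes creation fail if another process already holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if attempt == retries:
                raise TimeoutError(f"could not lock {target}")
            time.sleep(wait)
            wait = min(wait * 2, max_wait)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)


def claim(tasks_dir: Path, task_id: str, owner: str) -> str:
    path = tasks_dir / f"{task_id}.json"
    with file_lock(path):
        task = json.loads(path.read_text())  # re-read inside the lock (avoids TOCTOU)
        if task.get("owner") not in (None, owner):
            return "already_claimed"
        if task["status"] == "completed":
            return "already_resolved"
        for dep_id in task.get("blockedBy", []):
            dep_path = tasks_dir / f"{dep_id}.json"
            if (not dep_path.exists()
                    or json.loads(dep_path.read_text())["status"] != "completed"):
                return "blocked"
        task["owner"] = owner  # owner only; status is left unchanged
        path.write_text(json.dumps(task, indent=2))
        return "ok"
```

One limitation of this simplified lock: if a process dies while holding it, the lock file stays behind and later claims time out until it is removed by hand.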

4. High-Water Mark to Prevent ID Reuse

The .highwatermark file records the highest task ID ever assigned. Even if a task is deleted, its ID won't be reused.
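A minimal sketch of such an allocator follows. The function name `next_task_id`, the choice to update the mark at allocation time, and the fallback scan of existing files are all assumptions for illustration; the source only states that the file records the highest ID ever assigned.

```python
from pathlib import Path


def next_task_id(tasks_dir: Path) -> str:
    """Allocate the next sequential ID, never reusing one even after deletions."""
    tasks_dir.mkdir(parents=True, exist_ok=True)
    mark_file = tasks_dir / ".highwatermark"
    mark = int(mark_file.read_text()) if mark_file.exists() else 0
    # Also consider files on disk, in case the mark file is missing or stale.
    on_disk = [int(p.stem) for p in tasks_dir.glob("*.json") if p.stem.isdigit()]
    new_id = max([mark, *on_disk]) + 1
    mark_file.write_text(str(new_id))
    return str(new_id)
```

The sketch is not safe under concurrent callers; a real allocator would need to run inside a lock such as the list-level lock described in the previous section.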

5. Four Task Tools

CC's task system has four tools (versus the five in the teaching version): TaskCreate, TaskGet, TaskUpdate, TaskList. All set isConcurrencySafe: true and shouldDefer: true (tool schemas aren't in the initial prompt; they only become visible after ToolSearch).

The teaching version's create_task(blockedBy=...) declares dependencies at creation time, which is a reasonable simplification. Real CC's TaskCreate only accepts subject/description/activeForm/metadata — dependencies are maintained via TaskUpdate's addBlocks/addBlockedBy.
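Because a CC task record stores both `blocks` and `blockedBy`, adding a dependency after creation means updating two records. A sketch of that bookkeeping follows, using in-memory dicts; the function name `add_blocked_by` is illustrative and does not reproduce TaskUpdate's actual code.

```python
def add_blocked_by(tasks: dict[str, dict], task_id: str, upstream_id: str) -> None:
    """Record that `task_id` is blocked by `upstream_id`, keeping both edge lists in sync."""
    if task_id == upstream_id:
        raise ValueError("a task cannot block itself")
    # A KeyError here means one of the IDs is unknown.
    task, upstream = tasks[task_id], tasks[upstream_id]
    if upstream_id not in task["blockedBy"]:
        task["blockedBy"].append(upstream_id)  # upstream edge on the dependent task
    if task_id not in upstream["blocks"]:
        upstream["blocks"].append(task_id)     # downstream edge on the dependency
```

Storing the edge in both directions lets a completed task find its downstream tasks directly from `blocks`, instead of scanning every task as the teaching version's complete_task does.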
