Tier 2 — Shaping & Slicing: Plan and slice before you build, to keep every change small and safe

Reach for this tier when big asks go sideways — the agent edits the wrong things or tries to do everything in one pass. Plan first, then cut the work into small slices, each one small enough to check and cheap to undo.

15. Investigate before you edit — a wrong mental model wrecks the diff.

Instead of: "Add OAuth." (straight to code)

Prefer: "Explore @src/auth, explain how it works, change nothing." → then "now implement OAuth."

16. Plan the uncertain; skip the plan for the trivial.

Instead of: a 6-step plan to rename a variable.

Prefer: plan multi-file or unfamiliar work; do one-sentence diffs directly.

17. Force an approval checkpoint and see what it will touch — nothing irreversible runs unseen.

Instead of: letting it touch 30 files unsupervised.

Prefer: "Numbered plan, risk per step, exact files to be created/modified/deleted. STOP for approval."
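One way to picture what that prompt asks for is as a small data structure plus a gate. The sketch below is illustrative only: `PlanStep`, `touched_files`, `render_plan`, `run_if_approved`, and the file paths are hypothetical names, not part of any tool.

```python
from dataclasses import dataclass, field


@dataclass
class PlanStep:
    """One numbered step of a proposed plan (hypothetical structure)."""
    description: str
    risk: str  # for example "low", "medium", "high"
    created: list[str] = field(default_factory=list)
    modified: list[str] = field(default_factory=list)
    deleted: list[str] = field(default_factory=list)


def touched_files(plan: list[PlanStep]) -> dict[str, list[str]]:
    """Collect every file the plan would create, modify, or delete."""
    summary: dict[str, list[str]] = {"created": [], "modified": [], "deleted": []}
    for step in plan:
        summary["created"].extend(step.created)
        summary["modified"].extend(step.modified)
        summary["deleted"].extend(step.deleted)
    return summary


def render_plan(plan: list[PlanStep]) -> str:
    """Numbered steps with risk, followed by the exact files touched."""
    lines = [
        f"{number}. [{step.risk} risk] {step.description}"
        for number, step in enumerate(plan, start=1)
    ]
    for kind, paths in touched_files(plan).items():
        lines.append(f"{kind}: {', '.join(paths) or 'none'}")
    return "\n".join(lines)


def run_if_approved(plan: list[PlanStep], approved: bool) -> str:
    """The checkpoint: nothing runs until a human has approved the plan."""
    if not approved:
        return "STOPPED: waiting for approval"
    return f"running {len(plan)} step(s)"


# Hypothetical plan for the OAuth example; the paths are made up.
example_plan = [
    PlanStep("Add OAuth callback route", "medium",
             created=["src/auth/oauth.py"], modified=["src/auth/routes.py"]),
    PlanStep("Remove legacy token helper", "high",
             deleted=["src/auth/legacy_token.py"]),
]
print(render_plan(example_plan))
print(run_if_approved(example_plan, approved=False))
```

The point of the shape is that deletions and high-risk steps are visible in one place before anything executes.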

18. Ask for options, not the first idea.

Instead of: accepting plan #1.

Prefer: "Give me 2–3 approaches with a one-line trade-off each; I'll choose."

19. Slice vertically — thin end-to-end increments, not horizontal layers.

Instead of: "Build the whole course-catalog feature."

Prefer: "Slice 1: a user searches courses by department and sees results — UI → API → DB → test. Ship it green before slice 2."

A vertical slice is narrow, end-to-end, and independently testable. It keeps each agent task inside the context window, makes dependencies explicit, and gives a working checkpoint after every slice instead of one giant unverifiable diff.

flowchart LR
  S1[Slice 1<br/>UI→API→DB→test ✓] --> S2[Slice 2<br/>UI→API→DB→test ✓] --> S3[Slice 3<br/>...]
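As a rough illustration of how thin Slice 1 can be, here is a minimal sketch that touches every layer once. Everything in it is an assumption made for the example: an in-memory SQLite table stands in for the DB, a plain function for the API, a text renderer for the UI, and the table, column, and function names are invented.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """DB layer: an in-memory table with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE courses (code TEXT, title TEXT, department TEXT)")
    conn.executemany(
        "INSERT INTO courses VALUES (?, ?, ?)",
        [
            ("CS101", "Intro to Programming", "CS"),
            ("CS201", "Data Structures", "CS"),
            ("MA101", "Calculus I", "MATH"),
        ],
    )
    return conn


def search_courses(conn: sqlite3.Connection, department: str) -> list[dict]:
    """API layer: return the courses of one department, ordered by code."""
    rows = conn.execute(
        "SELECT code, title FROM courses WHERE department = ? ORDER BY code",
        (department,),
    ).fetchall()
    return [{"code": code, "title": title} for code, title in rows]


def render_results(courses: list[dict]) -> str:
    """UI layer: plain-text rendering stands in for a real page."""
    if not courses:
        return "No courses found."
    return "\n".join(f"{c['code']}: {c['title']}" for c in courses)


def test_search_by_department() -> None:
    """Test layer: the slice is done when this passes end to end."""
    conn = make_db()
    assert render_results(search_courses(conn, "CS")) == (
        "CS101: Intro to Programming\nCS201: Data Structures"
    )
    assert render_results(search_courses(conn, "BIO")) == "No courses found."


test_search_by_department()
```

Slice 2 would add one more behavior through the same four layers rather than finishing any single layer first.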

20. Turn big features into a spec, then a fresh session.

Instead of: a 200-message thread that drifts.

Prefer: "Interview me on the hard parts, write SPEC.md" → new session: "Implement SPEC.md."

How: shape the behavior into the spec itself — a Story, Given/When/Then scenarios (.feature), and Non-goals — so "done" is concrete before any code. Keep scenarios crisp (one behavior each, concrete Then steps):

# Spec — refund eligibility
## Story
As a customer, I want to request a refund within 30 days so I'm not stuck with a bad purchase.
## Scenarios (.feature)
Scenario: Refund allowed inside the window
  Given an order placed 10 days ago
  When the customer requests a refund
  Then the refund is approved and the amount is credited
Scenario: Refund refused outside the window
  Given an order placed 45 days ago
  When the customer requests a refund
  Then the request is rejected with reason "window_expired"
## Non-goals
- Partial refunds (separate spec).

This is the shape; Tier 4 turns these scenarios into the loop's executable check (Tip 33).
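To show how concrete Then steps translate into checks, here is a minimal Python sketch of the two scenarios above. It is not the Tier 4 loop. `request_refund` and its return shape are hypothetical, and treating day 30 as still inside the window is an assumption the spec does not settle.

```python
from datetime import date, timedelta

REFUND_WINDOW_DAYS = 30  # from the Story: refunds are allowed within 30 days


def request_refund(order_date: date, amount: float, today: date) -> dict:
    """Hypothetical refund rule derived from the spec's two scenarios."""
    age_days = (today - order_date).days
    # Assumption: an order exactly 30 days old is still eligible.
    if age_days <= REFUND_WINDOW_DAYS:
        return {"approved": True, "credited": amount, "reason": None}
    return {"approved": False, "credited": 0.0, "reason": "window_expired"}


def test_refund_allowed_inside_the_window() -> None:
    today = date(2024, 6, 30)
    # Given an order placed 10 days ago
    order_date = today - timedelta(days=10)
    # When the customer requests a refund
    result = request_refund(order_date, 50.0, today)
    # Then the refund is approved and the amount is credited
    assert result["approved"] is True
    assert result["credited"] == 50.0


def test_refund_refused_outside_the_window() -> None:
    today = date(2024, 6, 30)
    # Given an order placed 45 days ago
    order_date = today - timedelta(days=45)
    # When the customer requests a refund
    result = request_refund(order_date, 50.0, today)
    # Then the request is rejected with reason "window_expired"
    assert result["approved"] is False
    assert result["reason"] == "window_expired"


test_refund_allowed_inside_the_window()
test_refund_refused_outside_the_window()
```

Each scenario maps to exactly one test, which is why one behavior per scenario keeps the check readable.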

Your first multi-model setup (the basics)

Two moves, no infrastructure. This is all the multi-model setup you need until you're orchestrating real work; the rest is covered in the full playbook before Tier 6.

21. Plan with the smart model, build with the cheap one.

Instead of: running planning and coding on the same single model.

Prefer: /model opusplan — Opus reasons through the plan, Sonnet writes the code. One command, top-tier planning, without paying Opus rates on every edit.

How: type /model opusplan (or set "model": "opusplan" in .claude/settings.json). Switch by hand any time: /model opus for hard thinking, /model sonnet to implement, /model haiku for cheap bulk work, /model best for the current top of the ladder (Fable 5 where available, else latest Opus). Aliases auto-update to the recommended version; pin a full name like claude-opus-4-8 when you need it fixed.

That's the whole basic setup: two Anthropic models, one command, zero infra.
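If you would rather script the settings change than edit the file by hand, a minimal sketch could look like the following. The `set_default_model` helper is hypothetical. It assumes only what the tip states, that a "model" key in .claude/settings.json selects the model, and it leaves any other keys in the file untouched.

```python
import json
import tempfile
from pathlib import Path


def set_default_model(project_dir: Path, model: str = "opusplan") -> Path:
    """Set the "model" key in <project_dir>/.claude/settings.json, keeping other keys."""
    settings_path = project_dir / ".claude" / "settings.json"
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    if settings_path.exists():
        settings = json.loads(settings_path.read_text())
    else:
        settings = {}
    settings["model"] = model
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings_path


# Demo against a throwaway directory so no real project is modified.
with tempfile.TemporaryDirectory() as tmp:
    written = set_default_model(Path(tmp))
    print(written.read_text())
```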

22. Have one agent draft the spec and a different one critique it — before any code.

Instead of: one model writing SPEC.md and you trusting it.

Prefer: let one agent draft the spec, then hand it to a different model to attack: "You're the reviewer. Find gaps, wrong assumptions, missing edge cases, and unstated requirements in this spec. List them; don't rewrite." Feed the findings back to the author to revise. After one or two rounds the spec is far more robust than what a single model produces.

How: quickest is two chat tabs (Claude drafts, ChatGPT/Gemini critiques, repeat).

Inside Claude Code, make it a loop: a spec-author subagent writes SPEC.md and a spec-critic subagent (configured with a different model: value) reviews it. The critic's different blind spots are the point.

The spec is the contract the whole build runs against (Tier 4), so this is the highest-leverage place to spend a second model.
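The draft, critique, revise cycle can be sketched as a small loop. This is a sketch under stated assumptions: `refine_spec` is a hypothetical helper, and the author and critic are stand-in functions where real model calls would go. No actual API is shown.

```python
from typing import Callable

Author = Callable[[str, list[str]], str]  # (current spec, findings) -> revised spec
Critic = Callable[[str], list[str]]       # spec -> list of findings


def refine_spec(
    draft: str, author: Author, critic: Critic, max_rounds: int = 2
) -> tuple[str, list[list[str]]]:
    """Run critique/revise rounds; stop early once the critic finds nothing."""
    spec = draft
    history: list[list[str]] = []
    for _ in range(max_rounds):
        findings = critic(spec)
        if not findings:
            break
        history.append(findings)
        spec = author(spec, findings)
    return spec, history


def stub_critic(spec: str) -> list[str]:
    """Stand-in critic: lists gaps, never rewrites."""
    findings = []
    if "Non-goals" not in spec:
        findings.append("No Non-goals section.")
    if "outside the window" not in spec:
        findings.append("No scenario for refunds outside the window.")
    return findings


def stub_author(spec: str, findings: list[str]) -> str:
    """Stand-in author: patches the spec for each finding it recognizes."""
    for finding in findings:
        if "Non-goals" in finding:
            spec += "\n## Non-goals\n- Partial refunds (separate spec)."
        if "outside the window" in finding:
            spec += "\nScenario: Refund refused outside the window"
    return spec


final_spec, rounds = refine_spec(
    "# Spec\nScenario: Refund allowed inside the window", stub_author, stub_critic
)
print(f"{len(rounds)} revision round(s)")
print(final_spec)
```

Capping the rounds matters: the tip says one or two are enough, and an uncapped loop can keep generating nitpicks.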

That's the basics: one model plans while a cheaper one builds, and agents draft and critique the spec against each other before code. Per-subagent models, an advisor, cross-lab code review, and open-weight/self-hosted routing come later in the full playbook.