I hit Claude's usage limit three times in one day last week.

Not a huge research project. Not a coding sprint. Normal work. A few Cowork sessions. Some writing. Some debugging in Claude Code. Then at 3pm, the limit screen.

I sat down and check what was actually eating my tokens. The math was uglier than I expected.

Here's the part most people miss: every message you send to Claude re-reads the full conversation before it responds. Message 1 costs a few thousand tokens. Message 30 costs roughly 100,000 tokens, because Claude re-reads the 29 messages before it even starts thinking about the new one.

Your conversation is a bank account that gets more expensive every time you dip in.

Most habits in this guide come back to that single idea: stop paying for context you aren't using.

I split this into two parts. Cowork is the chat side. Claude Code is the terminal side. Both burn tokens. Both have specific fixes.

First, the framework.

Two numbers run your life with Claude:

  1. Context window. Roughly 200K tokens per conversation. Everything counts: your messages, Claude's replies, file uploads, tool outputs, system instructions.

  2. Usage limit. How much you can send across all conversations in a 5-hour window (plan-dependent).

The mental model: context is like RAM on your computer. You wouldn't load every file on your drive into memory to edit one document. You load what's needed, release what isn't.

Most Claude users work the opposite way. One chat for a week. Same PDF dropped into five sessions. Extended Thinking on by default. Every connector enabled at all times.

All of that compounds. And here's how to reverse it.

Part 1: Cowork

1. Pick the right model before you start.

You don't drive a semi-truck to the grocery store.

Claude has three tiers. Haiku is fast and cheap. Sonnet is the daily workhorse. Opus is the deep reasoning model, roughly 5x the cost of Sonnet.

Default rule: start on Sonnet. Escalate to Opus only when you hit something that actually needs it.

When Opus earns its cost:

→ Multi-step architectural decisions.

→ Complex financial, legal, or scientific reasoning.

→ Writing that needs real nuance (scripts, speeches, long-form essays).

When Sonnet handles it fine:

→ Drafting emails, posts, summaries.

→ Research recaps.

→ Code that isn't architecturally tricky.

→ Any single-step task.

When Haiku is the right pick:

→ File searches.

→ Simple lookups.

→ Formatting and cleanup.

→ Anything where the answer takes Claude under 30 seconds.

Model switching is one click. Make it reflex.

2. Kill PDFs before you upload them.

This is the single biggest token trap in Cowork.

Keep Reading