Claude Fable 5 Is Here — What Changes for Developers

Anthropic just released Claude Fable 5, the first model in the Claude 5 family and the first in a new “Mythos-class” tier that sits above Opus. It’s the most intelligent Anthropic model you can actually get your hands on.

If you build with the Claude API, this is not just a model ID swap. Fable 5 changes how thinking works, how tokens are counted, and how refusals come back. Some of your existing code will straight up return 400 errors.

Let’s go through everything that matters, so you don’t find out the hard way.

What Exactly Is Fable 5?

For years, the Claude lineup was simple: Haiku for cheap and fast, Sonnet for balance, Opus for the heavy stuff. Fable 5 adds a new tier above Opus.

There are actually two names you’ll hear:

Claude Fable 5 (claude-fable-5) — the generally available model. This is the one you and I can use. It ships with additional safety measures for dual-use capabilities.
Claude Mythos 5 (claude-mythos-5) — the same underlying model, without those measures, available only to approved organizations.

Same brain, different access levels. For almost everyone reading this, Fable 5 is the model. See Anthropic’s announcement if you want the official word.

The Numbers

Here’s how Fable 5 stacks up against the current lineup:

Model	Model ID	Context window	Max output	Input ($/1M tokens)	Output ($/1M tokens)
Claude Fable 5	`claude-fable-5`	1M	128K	$10.00	$50.00
Claude Opus 4.8	`claude-opus-4-8`	1M	128K	$5.00	$25.00
Claude Sonnet 4.6	`claude-sonnet-4-6`	1M	64K	$3.00	$15.00
Claude Haiku 4.5	`claude-haiku-4-5`	200K	64K	$1.00	$5.00

Two things jump out.

First, it’s 2x the price of Opus 4.8. This is not your default model for every request. It’s for the work Opus can’t handle.

Second, the 1M context window is the default, not an opt-in. You don’t need a beta header or a long-context tier. Send a million tokens, it just works.
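To make the price gap concrete, here is a minimal sketch that turns the table above into a per-request cost estimate. The `estimateCostUsd` helper and the `PRICES` map are illustrative names, not part of any SDK, and the sketch assumes flat per-token pricing with no caching or batch discounts.

```typescript
// Prices in USD per 1M tokens, copied from the table above.
type ModelId =
  | "claude-fable-5"
  | "claude-opus-4-8"
  | "claude-sonnet-4-6"
  | "claude-haiku-4-5";

interface Price {
  inputPerM: number;
  outputPerM: number;
}

const PRICES: Record<ModelId, Price> = {
  "claude-fable-5": { inputPerM: 10, outputPerM: 50 },
  "claude-opus-4-8": { inputPerM: 5, outputPerM: 25 },
  "claude-sonnet-4-6": { inputPerM: 3, outputPerM: 15 },
  "claude-haiku-4-5": { inputPerM: 1, outputPerM: 5 },
};

// Hypothetical helper: estimated USD cost of a single request.
function estimateCostUsd(
  model: ModelId,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = PRICES[model];
  return (
    (inputTokens * price.inputPerM + outputTokens * price.outputPerM) /
    1_000_000
  );
}

// Same token counts on both models: 200K in, 10K out.
console.log(estimateCostUsd("claude-fable-5", 200_000, 10_000)); // 2.5
console.log(estimateCostUsd("claude-opus-4-8", 200_000, 10_000)); // 1.25
```

This compares equal token counts. Because Fable 5 also counts tokens differently (see the tokenizer section), the real gap on identical text will be wider than 2x.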

Thinking Is Always On — You Can’t Turn It Off

This is the biggest API change. On every previous model, extended thinking was something you enabled. On Fable 5, thinking is always on. There is nothing to switch on, so for a normal request you just leave the thinking parameter out entirely.

And here’s the gotcha: explicitly disabling it is an error.

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// ❌ This returns a 400 on Fable 5 (the SDK throws on the error response)
await client.messages.create({
  model: "claude-fable-5",
  max_tokens: 16000,
  thinking: { type: "disabled" },
  messages: [{ role: "user", content: "Plan this migration..." }],
});

// ✅ Just omit thinking entirely
const response = await client.messages.create({
  model: "claude-fable-5",
  max_tokens: 16000,
  output_config: { effort: "high" },
  messages: [{ role: "user", content: "Plan this migration..." }],
});

The old budget_tokens style is also gone — that already returned a 400 on Opus 4.7 and 4.8, and Fable 5 continues that.

So how do you control how much it thinks? The effort parameter. It goes from low to medium, high, xhigh, and max. Think of Fable 5 like a senior engineer who refuses to answer without thinking first. You can’t tell them “don’t think” — but you can tell them how deep to go.
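If the effort level comes from a config file or an environment variable, it helps to validate it before it reaches the API. The sketch below is one way to do that. It assumes the five level names listed above and the `output_config.effort` shape from the earlier example; `buildFableRequest` and `isEffort` are made-up helper names.

```typescript
// The five effort levels named in this article, lowest to highest.
const EFFORT_LEVELS = ["low", "medium", "high", "xhigh", "max"] as const;
type Effort = (typeof EFFORT_LEVELS)[number];

interface FableRequest {
  model: "claude-fable-5";
  max_tokens: number;
  output_config: { effort: Effort };
  messages: { role: "user"; content: string }[];
}

// Runtime guard for effort values that arrive as plain strings.
function isEffort(value: string): value is Effort {
  return (EFFORT_LEVELS as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical builder: always sets effort, never sets `thinking`.
function buildFableRequest(
  prompt: string,
  effort: string,
  maxTokens = 16000,
): FableRequest {
  if (!isEffort(effort)) {
    throw new Error(
      `Unknown effort "${effort}"; expected one of ${EFFORT_LEVELS.join(", ")}`,
    );
  }
  return {
    model: "claude-fable-5",
    max_tokens: maxTokens,
    output_config: { effort },
    messages: [{ role: "user", content: prompt }],
  };
}
```

The returned object can be passed straight to `client.messages.create`. Keeping the builder as the only place that constructs requests makes it hard to reintroduce a `thinking` key by accident.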

One more thing — Fable 5 has protected thinking. The raw chain of thought is never returned to you, on any setting. By default the thinking blocks come back empty. If you want a readable summary of the reasoning, ask for it:

thinking: { type: "adaptive", display: "summarized" }

The New Tokenizer Will Change Your Bill

Fable 5 uses a new tokenizer. The same content tokenizes to roughly 30% more tokens than on Opus-tier models.

Think of it like getting a new electricity meter that counts units differently. Your appliances didn’t change, your usage didn’t change — but the bill is calculated on a new scale. Every token budget you tuned on Opus is now wrong: max_tokens values, context-window math, cost estimates, all of it.

Don’t guess the new numbers. Measure them. The token counting endpoint returns counts under both tokenizers when you pass Fable 5 as the model:

const count = await client.messages.countTokens({
  model: "claude-fable-5",
  messages: [{ role: "user", content: bigPrompt }],
});

console.log(count.input_tokens); // new tokenizer — what you'll be billed

The response also includes the same request counted under the prior-generation tokenizer, so you can see the exact delta on your own prompts before committing.
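Until you have measured counts for your own prompts, a rough first pass is to scale every Opus-era budget by the 30% figure. The sketch below does that with integer arithmetic to avoid floating-point surprises. Both helper names are hypothetical, and the flat 30% is only the approximate number quoted above, so treat the output as a starting point and replace it with measured values.

```typescript
// Scale a token budget tuned on the old tokenizer up to the new one.
// Assumption: a flat inflation percentage (default 30, from this article).
function scaleTokenBudget(oldTokens: number, inflationPercent = 30): number {
  return Math.ceil((oldTokens * (100 + inflationPercent)) / 100);
}

// Given two measured counts for the same prompt, report the real inflation
// in percent, so the default above can be replaced with your own number.
function measuredInflationPercent(oldCount: number, newCount: number): number {
  return ((newCount - oldCount) * 100) / oldCount;
}

console.log(scaleTokenBudget(16000)); // 20800
console.log(measuredInflationPercent(1000, 1300)); // 30
```

So a `max_tokens: 16000` that was comfortable on Opus corresponds to roughly 20800 on Fable 5 for the same amount of text.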

New `refusal` Stop Reason — Check It Before Reading Content

Fable 5 runs safety classifiers on requests, mainly targeting biology research and most cybersecurity content. When a classifier declines a request, you don’t get an error. You get a successful HTTP 200 with stop_reason: "refusal", and possibly an empty content array.

That means this innocent-looking line is now a bug:

console.log(response.content[0].text); // 💥 crashes on a refusal

Always branch on stop_reason first:

const response = await client.messages.create({
  model: "claude-fable-5",
  max_tokens: 16000,
  messages: [{ role: "user", content: userQuery }],
});

if (response.stop_reason === "refusal") {
  // Pre-output refusal: content is empty and you're not billed.
  // Mid-stream refusal: discard the partial output.
  handleRefusal(response.stop_details?.category); // "cyber" | "bio" | null
} else {
  // safe to read content
}
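One way to make that branch hard to forget is to route every response through a single reader that returns a tagged result. The sketch below assumes only the fields used in the snippet above (`stop_reason`, `stop_details.category`, and `content` blocks with a `type`); the `MessageLike` type and `readMessage` helper are illustrative, not SDK types. It also joins text blocks by type instead of indexing `content[0]`, on the assumption that a thinking block can come first.

```typescript
// Minimal structural view of a response; illustrative, not the SDK type.
interface ContentBlock {
  type: string;
  text?: string;
}

interface MessageLike {
  stop_reason: string | null;
  stop_details?: { category?: string | null } | null;
  content: ContentBlock[];
}

type Outcome =
  | { kind: "refused"; category: string | null }
  | { kind: "ok"; text: string };

// Check stop_reason first, then collect text blocks by type.
function readMessage(response: MessageLike): Outcome {
  if (response.stop_reason === "refusal") {
    // Any partial content is deliberately ignored here.
    return {
      kind: "refused",
      category: response.stop_details?.category ?? null,
    };
  }
  let text = "";
  for (const block of response.content) {
    if (block.type === "text" && typeof block.text === "string") {
      text += block.text;
    }
  }
  return { kind: "ok", text };
}
```

Callers then switch on `kind`, and the compiler will not let them read `text` from a refused result.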

Benign work near those domains — security tooling, life-sciences code — can occasionally trigger false positives. For that, the API has a beta fallbacks parameter: name claude-opus-4-8 as a fallback, and on a refusal the API retries on Opus server-side in the same round trip. One request, automatic recovery.
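The exact shape of the beta fallbacks parameter is not shown here, so the sketch below is not that feature. It is a plain client-side equivalent: send to Fable 5, and if the reply is a refusal, send the same request to Opus 4.8. That costs a second round trip, which the server-side option avoids. The `Send` callback and `sendWithFallback` are hypothetical names; in real code `send` would wrap your `client.messages.create` call.

```typescript
interface Reply {
  stop_reason: string | null;
  content: unknown[];
}

// Caller supplies how to send the request for a given model ID.
type Send = (model: string) => Promise<Reply>;

// Client-side retry on refusal. Not the server-side `fallbacks` parameter.
async function sendWithFallback(
  send: Send,
  primary = "claude-fable-5",
  fallback = "claude-opus-4-8",
): Promise<{ model: string; reply: Reply }> {
  const first = await send(primary);
  if (first.stop_reason !== "refusal") {
    return { model: primary, reply: first };
  }
  // Discard the refused reply and try once on the fallback model.
  const second = await send(fallback);
  return { model: fallback, reply: second };
}
```

The result records which model actually answered, which is worth logging so you can see how often the fallback path is taken.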

The Smaller Gotchas

A quick list of things that will 400 or surprise you:

No assistant prefill. Ending your messages with an assistant turn to force output shape returns a 400. Use structured outputs (output_config.format) instead — same as the 4.6+ family.
No temperature, top_p, or top_k. Sampling parameters are gone, a restriction carried over from Opus 4.7/4.8. Steer with prompting.
30-day data retention is required. Fable 5 is not available under zero data retention. If your org is on ZDR, every single request returns a 400 invalid_request_error — even with a perfect payload. If you’re debugging a mysterious 400, check your org’s retention setting before tearing apart your request body.
Longer turns. A single request on a hard task can run for many minutes — that’s normal, not a hang. Plan your timeouts and use streaming.
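Most of these can be caught before the request leaves your machine. The sketch below is a small pre-flight check over a request payload, based only on the rules described in this article. `findFableIssues` is a hypothetical helper, and it cannot see your org’s data retention setting, so a clean result does not rule out the ZDR case.

```typescript
interface RequestParams {
  thinking?: { type?: string; budget_tokens?: number };
  temperature?: number;
  top_p?: number;
  top_k?: number;
  messages: { role: string; content: unknown }[];
  [key: string]: unknown;
}

// Returns human-readable problems; an empty array means none were found.
function findFableIssues(params: RequestParams): string[] {
  const issues: string[] = [];

  if (params.thinking?.type === "disabled") {
    issues.push('thinking.type "disabled" is rejected; omit thinking');
  }
  if (params.thinking?.budget_tokens !== undefined) {
    issues.push("thinking.budget_tokens is rejected; use output_config.effort");
  }
  for (const key of ["temperature", "top_p", "top_k"] as const) {
    if (params[key] !== undefined) {
      issues.push(`${key} is not supported; steer with prompting`);
    }
  }
  const last = params.messages[params.messages.length - 1];
  if (last !== undefined && last.role === "assistant") {
    issues.push("assistant prefill is rejected; use output_config.format");
  }
  return issues;
}
```

Running this in a test suite over your existing request builders is a cheap way to find every call site that needs changes before you flip the model ID.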

What Is It Actually Good At?

Fable 5’s gains are mostly on work beyond what older models could do: long-horizon agentic runs that go for hours without correction, first-shot implementations of well-specified systems, and reliable coordination of parallel sub-agents.

If your workload is “summarize this email” or “classify this ticket”, Fable 5 is overkill. Sonnet or Haiku will do it for a fraction of the price. Fable 5 earns its cost when the task is long, ambiguous, and multi-step — the kind of thing where a cheaper model needs three retries and a human babysitter.

So, Should You Switch?

Here’s the simple decision:

Daily coding, agents, general API work — stay on Opus 4.8. It’s half the price and still excellent.
Speed and cost-sensitive pipelines — Sonnet 4.6 or Haiku 4.5, as always.
Your hardest unsolved problems — overnight autonomous runs, massive refactors, deep multi-step research — that’s Fable 5’s territory.
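If you route requests in code, the decision above fits in a lookup table. This is one reading of the list, not an official recommendation: the workload labels are invented for the example, and splitting the speed and cost bucket into Sonnet for balance and Haiku for cheapest follows the lineup description near the top of this article.

```typescript
// Illustrative workload labels; rename to match your own pipeline.
type Workload = "hardest" | "general" | "balanced_pipeline" | "cheap_and_fast";

const MODEL_FOR_WORKLOAD: Record<Workload, string> = {
  hardest: "claude-fable-5", // overnight runs, massive refactors
  general: "claude-opus-4-8", // daily coding, agents, general API work
  balanced_pipeline: "claude-sonnet-4-6",
  cheap_and_fast: "claude-haiku-4-5",
};

// Hypothetical router: map a workload label to a model ID.
function pickModel(workload: Workload): string {
  return MODEL_FOR_WORKLOAD[workload];
}

console.log(pickModel("general")); // claude-opus-4-8
```

Keeping the mapping in one place also means a future price or model change is a one-line edit.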

And if you do switch, remember the checklist: remove thinking config (apart from display: "summarized" if you want reasoning summaries), drop sampling params, kill assistant prefills, handle stop_reason: "refusal", re-measure your token budgets, and confirm your org meets the data retention requirement.

Do that, and the migration is smooth. Skip it, and you’ll be staring at 400 errors wondering what happened. Now you know.