Claude 4.7 vs GPT-5: an honest 2026 comparison

Side-by-side answers on twelve real prompts — coding, writing, research, and reasoning. Where each model wins, where they tie, and how to choose.

Marcie Ellis
Content Marketer

The "which AI is better, Claude or GPT?" question gets asked a lot, and most answers either (a) cite a vendor benchmark or (b) say "it depends" without committing. We tested Claude 4.7 and GPT-5 head-to-head on twelve real prompts in May 2026 — four coding, four writing, two research, two reasoning — and scored both the answer and the workflow. The honest version is that each model wins about half the categories, and the right answer is a task-by-task split, not "pick one for life".

The verdict, in one table

Category                       | Winner     | Margin
Coding (function-level)        | GPT-5      | small
Coding (whole-file refactor)   | Claude 4.7 | medium
Writing (long-form essay)      | Claude 4.7 | medium
Writing (marketing copy)       | GPT-5      | small
Research (citations)           | tie        | -
Reasoning (math)               | GPT-5      | small
Reasoning (legal/structured)   | Claude 4.7 | medium
Multimodal (vision)            | GPT-5      | small
Long-context (200K+ tokens)    | Claude 4.7 | large
Tool use / agentic             | GPT-5      | small
Following instructions exactly | Claude 4.7 | medium
Conversational warmth          | Claude 4.7 | small

Claude 4.7 wins six categories (including long context, its only large-margin win), GPT-5 wins five, and one is a tie. All five GPT-5 wins are by a small margin.
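
The tally can be checked mechanically from the table. The sketch below copies the Category and Winner columns into a Python list and counts each outcome; the names VERDICTS and tally are illustrative assumptions, not part of any published test harness.

```python
from collections import Counter

# (category, winner) pairs copied from the verdict table, in row order.
VERDICTS = [
    ("Coding (function-level)", "GPT-5"),
    ("Coding (whole-file refactor)", "Claude 4.7"),
    ("Writing (long-form essay)", "Claude 4.7"),
    ("Writing (marketing copy)", "GPT-5"),
    ("Research (citations)", "tie"),
    ("Reasoning (math)", "GPT-5"),
    ("Reasoning (legal/structured)", "Claude 4.7"),
    ("Multimodal (vision)", "GPT-5"),
    ("Long-context (200K+ tokens)", "Claude 4.7"),
    ("Tool use / agentic", "GPT-5"),
    ("Following instructions exactly", "Claude 4.7"),
    ("Conversational warmth", "Claude 4.7"),
]


def tally(verdicts):
    """Count how many categories each outcome (a model name or 'tie') takes."""
    return Counter(winner for _category, winner in verdicts)


counts = tally(VERDICTS)
for outcome, wins in counts.most_common():
    print(f"{outcome}: {wins}")
```

Keeping the verdicts as data rather than prose makes it easy to re-run the count whenever a row changes.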

Where Claude 4.7 actually beats GPT-5

Long documents. Anthropic's 200K-token context is still the differentiator. We fed a 70-page PDF to both. Claude 4.7 referenced details from page 60 accurately in its summary; GPT-5 retrieved correctly from page 30 but started hallucinating quotes from page 60. The gap is biggest on documents larger than 50 pages.

Whole-file refactor. When the task is "here's a 400-line module, restructure it into three smaller files following the existing patterns", Claude 4.7 keeps the existing patterns more faithfully. GPT-5 will sometimes invent a new pattern that's marginally cleaner but not what you asked for.

Following exact instructions. "Respond in three short paragraphs, no bullets, no headers" — Claude obeys, GPT-5 sometimes adds a header anyway. This matters more than it sounds; for any structured output, Claude needs less correcting.

Where GPT-5 actually beats Claude 4.7

Tool use and agentic loops. GPT-5 plans multi-step tool sequences with less hand-holding. Claude 4.7 is competent at tool use but tends to verbalize its plan more, which slows agent workflows.

Marketing copy. A consistent pattern: GPT-5's first draft of an ad headline or product description is closer to "ready to ship" than Claude 4.7's. Claude tends toward measured prose; GPT-5 tends toward punchy.

Multimodal output. Vision and image generation are both first-class in GPT-5. Claude 4.7 has strong vision input but no native image generation.

Where they tie

Research with citations: both models cite real sources at similar rates, and both still hallucinate one source in ten on hard queries. Verify every citation from either model before relying on it.

The right way to choose

For most readers, the answer is to use both — without paying for both subscriptions. That's exactly what multi-model platforms exist for. With oran.chat, one $20/mo subscription gets you Claude 4.7, GPT-5, Gemini 2.5 Pro, and three more. You pick the model per question — long doc to Claude, marketing headline to GPT, code refactor to Claude, vision task to GPT — without juggling two subscriptions or two browser tabs.

If you only want one: pick based on what you do most. Writers and analysts → Claude 4.7. Builders and marketers → GPT-5.
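
If you script your prompts, the per-question split described above can be written down as a small routing table. This is a minimal sketch under stated assumptions: the task labels, the ROUTES name, and the pick_model helper are all hypothetical, and the code does not call any real platform or vendor API. It only returns a model name as a string.

```python
# Hypothetical task labels mapped to the model this comparison favours.
# The labels are illustrative; adjust them to your own workload.
ROUTES = {
    "long_document": "Claude 4.7",
    "code_refactor": "Claude 4.7",
    "marketing_headline": "GPT-5",
    "vision": "GPT-5",
}


def pick_model(task_type, default=None):
    """Return the preferred model name for a task type.

    Falls back to `default` (None unless the caller supplies one)
    when the task type has no entry in ROUTES.
    """
    return ROUTES.get(task_type, default)


# Example: route two tasks, and supply a fallback for an unlisted one.
print(pick_model("long_document"))
print(pick_model("marketing_headline"))
print(pick_model("podcast_script", default="Claude 4.7"))
```

Leaving the default to the caller keeps the "if you only want one" choice explicit: writers and analysts might pass Claude 4.7, builders and marketers GPT-5.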

What's next

We're updating this post in November 2026 with the same methodology against whatever ships next. For more head-to-heads — Claude vs Gemini, ChatGPT Plus vs Claude Pro spend journals, multi-model platform shootouts — see the rest of Comparisons.