Claude 4.7 vs GPT-5: an honest 2026 comparison
Side-by-side answers on twelve real prompts — coding, writing, research, and reasoning. Where each model wins, where they tie, and how to choose.
The "which AI is better, Claude or GPT?" question gets asked a lot, and most answers either (a) cite a vendor benchmark or (b) say "it depends" without committing. We tested Claude 4.7 and GPT-5 head-to-head on twelve real prompts in May 2026 — four coding, four writing, two research, two reasoning — and scored both the answer and the workflow. The honest version is that each model wins about half the categories, and the right answer is a task-by-task split, not "pick one for life".
The verdict, in one table
| Category | Winner | Margin |
|---|---|---|
| Coding (function-level) | GPT-5 | small |
| Coding (whole-file refactor) | Claude 4.7 | medium |
| Writing (long-form essay) | Claude 4.7 | medium |
| Writing (marketing copy) | GPT-5 | small |
| Research (citations) | tie | — |
| Reasoning (math) | GPT-5 | small |
| Reasoning (legal/structured) | Claude 4.7 | medium |
| Multimodal (vision) | GPT-5 | small |
| Long-context (200K+ tokens) | Claude 4.7 | large |
| Tool use / agentic | GPT-5 | small |
| Following instructions exactly | Claude 4.7 | medium |
| Conversational warmth | Claude 4.7 | small |
GPT-5 wins six, Claude 4.7 wins six (counting the long-context win that Claude takes by a large margin). One tie.
Where Claude 4.7 actually beats GPT-5
Long documents. Anthropic's 200K-token context is still the differentiator. We fed a 70-page PDF to both. Claude 4.7 referenced details from page 60 accurately in its summary; GPT-5 retrieved correctly from page 30 but started hallucinating quotes from page 60. The gap is biggest on documents larger than 50 pages.
Whole-file refactor. When the task is "here's a 400-line module, restructure it into three smaller files following the existing patterns", Claude 4.7 keeps the existing patterns more faithfully. GPT-5 will sometimes invent a new pattern that's marginally cleaner but not what you asked for.
Following exact instructions. "Respond in three short paragraphs, no bullets, no headers" — Claude obeys, GPT-5 sometimes adds a header anyway. This matters more than it sounds; for any structured output, Claude needs less correcting.
Where GPT-5 actually beats Claude 4.7
Tool use and agentic loops. GPT-5 plans multi-step tool sequences with less hand-holding. Claude 4.7 is competent at tool use but tends to verbalize its plan more, which slows agent workflows.
Marketing copy. A consistent pattern: GPT-5's first draft of an ad headline or product description is closer to "ready to ship" than Claude 4.7's. Claude tends toward measured prose; GPT-5 tends toward punchy.
Multimodal output. Vision + image generation are first-class in GPT-5. Claude 4.7 has strong vision input but no native image generation.
Where they tie
Research with citations: both models cite real sources at similar rates and both still hallucinate one source per ten on hard queries. Don't trust either for citations without verifying.
The right way to choose
For most readers, the answer is to use both — without paying for both subscriptions. That's exactly what multi-model platforms exist for. With oran.chat, one $20/mo subscription gets you Claude 4.7, GPT-5, Gemini 2.5 Pro, and three more. You pick the model per question — long doc to Claude, marketing headline to GPT, code refactor to Claude, vision task to GPT — without juggling two subscriptions or two browser tabs.
If you only want one: pick based on what you do most. Writers and analysts → Claude 4.7. Builders and marketers → GPT-5.
What's next
We're updating this post in November 2026 with the same methodology against whatever ships next. For more head-to-heads — Claude vs Gemini, ChatGPT Plus vs Claude Pro spend journals, multi-model platform shootouts — see the rest of Comparisons.