Chain-of-Thought & Self-Consistency: Advanced AI Reasoning

By Dorian Laurenceau

📅 Last reviewed: April 24, 2026. Updated with April 2026 findings and community feedback.

LLMs are powerful, but they think in shortcuts. When asked a complex question, they often jump to an answer without showing (or even performing) the intermediate reasoning steps. Chain-of-Thought (CoT) prompting forces the model to think step by step, and Self-Consistency takes this further by running multiple reasoning paths and picking the answer they agree on.

CoT in 2026: what still works, what got absorbed into reasoning models

Chain-of-Thought was introduced in Wei et al. 2022 ("Chain-of-Thought Prompting Elicits Reasoning in Large Language Models") and became the defining prompting technique of the first LLM wave. Four years later, the practitioner view on r/MachineLearning, r/LocalLLaMA, and r/ChatGPTPro is that CoT has bifurcated: partially obsolete for frontier reasoning models, still essential everywhere else.

What changed with reasoning models:

  • The o-series, GPT-5 Thinking, Claude's extended thinking, and Gemini 2.5 Pro's reasoning mode all do CoT internally. You don't need to prompt "think step by step"; the model already does it, at scale, and hides the chain from the output. OpenAI's o1 announcement was explicit: the technique moved from prompt to architecture.
  • On reasoning models, explicit CoT prompts can hurt. Adding "think step by step" to a reasoning model sometimes anchors the internal chain to a worse path. Let the model do its thing.
  • On non-reasoning models, CoT is still the single biggest accuracy lever. For Claude Haiku, GPT-4o mini, Gemini Flash, Llama 3, and fine-tuned open models, explicit CoT still produces the 30-60% gains the 2022 paper reported.

What still works, as practitioners have learned:

  • "Let's think step by step" vs. structured CoT. The naive zero-shot trigger (from Kojima et al. 2022) helps, but structured CoT ("First identify what's asked; then list known facts; then compute; then verify") outperforms it consistently.
  • Self-consistency amplifies CoT on small models. Sample 5-10 chains at T=0.7, take the majority. The cost/quality tradeoff is best for cheap models on hard tasks.
  • CoT fails on tasks without verifiable answers. Majority voting and chain-checking assume a correct answer exists. For open-ended generation, CoT mostly produces longer outputs without quality gains.
  • Verifier-based selection beats voting. When possible, run code, check types, query a database. External verification is always better than an LLM majority vote; see the sketch after this list.
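
The last bullet is easy to operationalize. A minimal Python sketch of that selection logic, assuming the final answers have already been extracted from the sampled chains and that verifier is any callable you supply (one that executes generated code, checks types, or queries a database):

    from collections import Counter

    def select_answer(candidates, verifier=None):
        """Pick a final answer from the candidates extracted from sampled chains.

        Prefers external verification when a verifier is supplied; otherwise
        falls back to a majority vote across the candidates.
        """
        if verifier is not None:
            for answer in candidates:
                if verifier(answer):  # e.g. execute generated code, query a database
                    return answer
        votes = Counter(candidates)
        return votes.most_common(1)[0][0] if votes else None

With verifier=None this reduces to plain self-consistency voting; passing a verifier implements the external check the bullet describes.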

Common 2026 mistakes:

  • Using CoT on reasoning models without testing. It can help, hurt, or do nothing. Benchmark on your actual task.
  • Hiding the chain from users who need auditability. For regulated domains, the chain is the audit trail. Don't discard it.
  • Mistaking longer output for better reasoning. More tokens aren't more thought. Chains that restate the prompt add latency without gain.

The honest framing: Chain-of-Thought is the most successful prompting technique in LLM history, and in 2026 its role has moved. For frontier reasoning models, it's architectural; for everything else, it's still the default way to extract correct multi-step reasoning. Don't assume one-size-fits-all; benchmark both approaches on your task before committing.

Why Models Need Help Reasoning

LLMs predict the next token; they do not "reason" in the human sense. For simple questions, direct prediction works fine. But for multi-step problems (math, logic, analysis), the model needs to lay out intermediate steps to arrive at the correct answer.

Think of it this way: if someone asks you "What is 47 times 83?", you do not instantly produce "3,901." You decompose: 47 times 80 = 3,760, plus 47 times 3 = 141, total = 3,901. Chain-of-Thought forces the model to decompose in the same way.

The Three CoT Techniques

Zero-Shot CoT: The Magic Words
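
The magic words are the zero-shot trigger from Kojima et al. 2022, appended after the question. A minimal sketch of the prompt construction (the variable names are illustrative):

    # Zero-shot CoT: append the trigger phrase to any question.
    question = "What is 47 times 83?"
    zero_shot_prompt = f"{question}\n\nLet's think step by step."
    # Sent to a non-reasoning model, this elicits the intermediate steps
    # (47 x 80 = 3,760; 47 x 3 = 141; total 3,901) before the final answer.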

Few-Shot CoT: Teaching Reasoning by Example
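
Instead of a bare trigger, you show the model a worked chain and let it imitate the format. A minimal sketch with an illustrative arithmetic demonstration:

    # Few-shot CoT: demonstrate one worked chain, then ask the real question.
    worked_example = (
        "Q: What is 23 times 14?\n"
        "A: 23 x 10 = 230. 23 x 4 = 92. 230 + 92 = 322. The answer is 322.\n\n"
    )
    few_shot_prompt = worked_example + "Q: What is 47 times 83?\nA:"
    # The demonstrated decomposition steers the model toward the same
    # multiply-then-add reasoning on the new question.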

Self-Consistency: Voting for the Best Answer
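
Sample several chains at moderate temperature and keep the answer most of them agree on. A minimal Python sketch, assuming a generate(prompt, temperature=...) callable that wraps whatever model API you use, plus an "Answer: <result>" convention for each chain's last line:

    import re
    from collections import Counter

    def self_consistency(question, generate, n=7, temperature=0.7):
        """Sample n reasoning chains and majority-vote the final answers."""
        prompt = f"{question}\nThink step by step, then finish with 'Answer: <result>'."
        answers = []
        for _ in range(n):
            chain = generate(prompt, temperature=temperature)  # one sampled chain
            match = re.search(r"Answer:\s*(.+)", chain)
            if match:
                answers.append(match.group(1).strip())
        return Counter(answers).most_common(1)[0][0] if answers else None

Seven chains is a reasonable default within the 5-10 range mentioned earlier; for tasks with a checkable answer, swap the vote for the verifier-based selection sketched above.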

When CoT Fails

Test Your Understanding

Further Exploration

You have mastered Chain-of-Thought and Self-Consistency. In the next article, you will explore Tree-of-Thought, a technique that lets the model explore and backtrack through branching reasoning paths, solving problems that linear reasoning cannot.


Continue to Tree-of-Thought Reasoning Arena to go beyond linear thinking.

Dorian Laurenceau

Full-Stack Developer & Learning Designer

I spent 4 years as a freelance full-stack developer and 4 years teaching React, JavaScript, HTML/CSS, and WordPress to adult learners. Today I design learning paths in web development and AI, grounded in learning science. I founded learn-prompting.fr to make AI practical and accessible, and built the Bluff app to gamify political transparency.

Prompt Engineering · LLMs · Full-Stack Development · Learning Design · React
Published: March 9, 2026 · Updated: April 24, 2026

FAQ

What will I learn in this Advanced Reasoning guide?

Master Chain-of-Thought (CoT) prompting and Self-Consistency techniques to dramatically improve AI reasoning. Includes zero-shot CoT, few-shot examples, and voting strategies.