Prompts(2): Prompt Engineering Cookbook: Frameworks, Patterns, and Mistakes You’re Still Making

by JeariCk 7 min read

If you’ve ever stumbled across those “killer ChatGPT prompt secrets” posts, you probably noticed the same thing — other people’s prompts work like magic, but when you copy them verbatim, the output is still garbage.

Here’s the thing: you’re not bad at this. By 2026, prompt engineering has matured. The trapdoors have been mapped, the frameworks have been battle-tested, and there’s really no excuse for winging it anymore. This post cuts through the noise: techniques that actually work, frameworks you can grab and use right now, and the mistakes almost everyone is still making.


1. Prompt Techniques That Still Hold Up in 2026

Let’s run through the core techniques that keep delivering in production. No theory — just how to use them.

1.1 Chain-of-Thought (CoT)

If you can only learn one technique, make it this one.

CoT isn’t new (Google Research published the paper back in 2022), but it hasn’t lost any of its edge. The idea is dead simple: tell the model to write out its reasoning before giving the answer.

How to use it:

Zero-shot: Just tack on “Please reason step by step” at the end of your prompt. In the 2022 zero-shot CoT paper (Kojima et al., with Google co-authors), the trigger phrase “Let’s think step by step” lifted GSM8K accuracy from 10.4% to 40.7% on InstructGPT.

Few-shot: Include a couple of examples with full reasoning steps in your prompt. The model follows the pattern.

When to use: Math problems, logical reasoning, causal analysis, anything that needs multi-step derivation.

Caveat: CoT is overkill for simple tasks and burns extra tokens. Don’t use a flamethrower to light a candle.
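Here’s a minimal sketch of both variants as plain string builders. The function names and the worked example are illustrative assumptions, not part of any library; pass the resulting string to whatever model client you use.

```python
# Hypothetical helpers: these only build prompt strings, they call no model.

def zero_shot_cot(question: str) -> str:
    """Append a step-by-step trigger phrase to the question."""
    return f"{question}\n\nPlease reason step by step, then give the final answer."


def few_shot_cot(question: str, examples: list[tuple[str, str, str]]) -> str:
    """Prefix the question with (question, reasoning, answer) demonstrations."""
    parts = []
    for q, reasoning, answer in examples:
        parts.append(f"Q: {q}\nReasoning: {reasoning}\nA: {answer}")
    # End with an open "Reasoning:" slot so the model continues the pattern.
    parts.append(f"Q: {question}\nReasoning:")
    return "\n\n".join(parts)


demo = [(
    "A shop sells pens at 3 for $2. How much do 12 pens cost?",
    "12 pens is 4 groups of 3. Each group costs $2, so 4 * 2 = 8.",
    "$8",
)]
prompt = few_shot_cot("Eggs come 6 to a box. How many boxes hold 30 eggs?", demo)
print(prompt)
```

The few-shot version ends on an open `Reasoning:` line on purpose: the model’s most natural continuation is the reasoning itself, followed by the answer.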

1.2 Few-Shot Prompting

Show the model a handful of input-output pairs so it picks up the pattern you want. Google’s whitepaper is blunt about this: always include few-shot examples — zero-shot is not the recommended default.

There’s a classic result on the GSM8K math benchmark: with just 8 chain-of-thought examples in the prompt, the 540B-parameter PaLM beat a fine-tuned GPT-3 that used a verifier.

Key insight: Example quality beats quantity. 2-5 good examples outperform 10 filler ones.
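As a rough sketch, a few-shot prompt is just an instruction, a short run of input-output pairs, and the new input left open. The helper name, the `Input:`/`Output:` labels, and the sample reviews below are assumptions made up for illustration.

```python
def few_shot_prompt(
    instruction: str,
    examples: list[tuple[str, str]],
    new_input: str,
    max_examples: int = 5,
) -> str:
    """Build a prompt from an instruction, input/output pairs, and a new input."""
    lines = [instruction, ""]
    # Keep the example set small: a few good pairs, not a pile of filler.
    for text, label in examples[:max_examples]:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    # Leave the last "Output:" empty for the model to fill in.
    lines += [f"Input: {new_input}", "Output:"]
    return "\n".join(lines)


examples = [
    ("The battery died after two days.", "negative"),
    ("Setup took thirty seconds. Love it.", "positive"),
    ("It arrived on Tuesday.", "neutral"),
]
prompt = few_shot_prompt(
    "Classify the sentiment of each review as positive, negative, or neutral.",
    examples,
    "Screen is gorgeous but the speakers are tinny.",
)
print(prompt)
```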

1.3 Self-Consistency

CoT’s upgrade. The idea: sample the same prompt several times (say, 5) at a nonzero temperature so each run takes a slightly different reasoning path, then pick the most frequent final answer.

If you’re building high-stakes decision support systems, this is your friend. The downside? Cost multiplies by your sampling count.
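The voting step is easy to sketch. In the snippet below, `fake_sample` is a stand-in for a real model call and the canned answers are invented; in practice you would sample with temperature above zero and extract the final answer from each reasoning trace before counting votes.

```python
import itertools
from collections import Counter
from typing import Callable


def self_consistent_answer(
    sample: Callable[[str], str], prompt: str, n: int = 5
) -> tuple[str, float]:
    """Call `sample` n times; return the majority answer and its vote share."""
    answers = [sample(prompt) for _ in range(n)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n


# Stand-in for a model call: cycles through canned final answers.
_canned = itertools.cycle(["42", "42", "41", "42", "40"])


def fake_sample(prompt: str) -> str:
    return next(_canned)


answer, agreement = self_consistent_answer(
    fake_sample, "What is 6 * 7? Reason step by step."
)
print(answer, agreement)  # 42 0.6
```

The vote share doubles as a cheap confidence signal: a low share suggests the question deserves a human look.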

1.4 Tree-of-Thought (ToT)

CoT walks a single line. Tree-of-Thought explores multiple reasoning paths simultaneously, evaluates them, and picks the best one.

Higher complexity, higher cost — but it shines on deep reasoning tasks. By 2026, most frontier models have this kind of capability baked in. You don’t even need to explicitly say “tree” in your prompt anymore; the model does something similar on its own.
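If you do want to drive the search yourself, the skeleton is a small beam search. This is a toy sketch, not the published algorithm in full: `propose` and `score` would normally be model calls (generate next thoughts, rate them), and here they are replaced by a made-up puzzle of picking three numbers from {1, 3, 4} that sum to 10.

```python
from typing import Callable

State = tuple[int, ...]  # a partial "thought": the numbers picked so far


def tree_of_thought(
    propose: Callable[[State], list[State]],
    score: Callable[[State], float],
    root: State,
    depth: int = 3,
    beam_width: int = 2,
) -> State:
    """Expand every kept state, score the children, keep the best few, repeat."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for state in frontier for child in propose(state)]
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    return frontier[0]


# Toy stand-ins for model calls.
def propose(state: State) -> list[State]:
    return [state + (step,) for step in (1, 3, 4)]


def score(state: State) -> float:
    return -abs(10 - sum(state))  # closer to 10 is better


best = tree_of_thought(propose, score, root=())
print(best, sum(best))  # (4, 3, 3) 10
```

With `beam_width=1` the same search walks a single line, ends at `(4, 4, 1)`, and misses the target, which is the CoT-versus-ToT difference in miniature.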

1.5 ReAct (Reason + Act)

CoT only produces text — it doesn’t call tools. ReAct lets the model “think and do” — reason a step, call a tool, then reason again based on the result.

It’s the standard in agent systems. But if you’re just writing conversational prompts, you probably don’t need it.
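A bare-bones version of the loop looks something like this. The `Action: tool[input]` and `Final:` text conventions, the `multiply` tool, and the scripted model are all assumptions for the sake of a runnable sketch; real agent frameworks define their own formats.

```python
import re
from typing import Callable


def react_loop(
    model: Callable[[str], str],
    tools: dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate model reasoning and tool calls until the model gives a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += step + "\n"
        if step.startswith("Final:"):
            return step.removeprefix("Final:").strip()
        match = re.search(r"Action: (\w+)\[(.*?)\]", step)
        if match:
            name, arg = match.groups()
            # Feed the tool result back so the next reasoning step can use it.
            transcript += f"Observation: {tools[name](arg)}\n"
    raise RuntimeError("no final answer within max_steps")


def multiply(arg: str) -> str:
    a, b = (int(x) for x in arg.split(","))
    return str(a * b)


# Stand-in for a model: first asks for a tool, then answers from the observation.
def scripted_model(transcript: str) -> str:
    if "Observation:" not in transcript:
        return "Thought: I need the product.\nAction: multiply[17,23]"
    return "Final: " + transcript.rsplit("Observation: ", 1)[1].strip()


answer = react_loop(scripted_model, {"multiply": multiply}, "What is 17 times 23?")
print(answer)  # 391
```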

1.6 Least-to-Most Prompting

Break a complex problem into sub-problems from easy to hard, solve them one by one, and use each answer to build toward the next.

Good for things you can’t handle in one shot but can crack step by step.
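A sketch of the solving half, assuming the sub-questions are already listed from easy to hard (in the full technique the model also does the decomposition). `fake_ask` and its canned answers are placeholders for a real model call.

```python
from typing import Callable


def least_to_most(
    ask: Callable[[str], str], subquestions: list[str]
) -> list[tuple[str, str]]:
    """Solve sub-questions in order, feeding earlier Q/A pairs into later prompts."""
    solved: list[tuple[str, str]] = []
    for sub in subquestions:
        context = "".join(f"Q: {q}\nA: {a}\n" for q, a in solved)
        answer = ask(f"{context}Q: {sub}\nA:")
        solved.append((sub, answer))
    return solved


# Stand-in for a model call: records each prompt and returns canned answers.
prompts_seen: list[str] = []
_canned = iter(["5", "10", "15"])


def fake_ask(prompt: str) -> str:
    prompts_seen.append(prompt)
    return next(_canned)


steps = least_to_most(
    fake_ask,
    [
        "Ana has 5 apples. How many apples does Ana have?",
        "Ben has twice as many apples as Ana. How many does Ben have?",
        "How many apples do Ana and Ben have together?",
    ],
)
print(steps[-1][1])  # 15
```

The last prompt carries both earlier answers, so the hardest question arrives with its groundwork already done.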


2. Prompt Frameworks You Can Use Right Now

The previous section covered *techniques*: how to make the model think. This one covers *frameworks*: how to structure your prompt.

RTF (Role-Task-Format)

The simplest one, great for getting started fast:

```

Role: You are a XXXX

Task: Please complete XXXX

Format: Output format is XXXX

```

No frills. When you already know what role the model should play and just need to say what to do and how, RTF is enough.

RACE (Role-Action-Context-Expectation)

RTF plus context and expectations:

```

Role: You are a XXXX

Action: Please complete XXXX

Context: Background info is XXXX

Expectation: I expect the output to be XXXX

```

Good for content generation tasks.

COSTAR (Context-Objective-Style-Tone-Audience-Response)

Popularized by the Singapore government tech team. Most complete structure:

Context: Give enough background

Objective: Define the goal

Style: Writing style

Tone: Formal / casual

Audience: Target reader

Response: Output format

Best for external-facing content — marketing copy, client emails, formal reports.
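If you reuse COSTAR a lot, a small template object keeps the six fields from drifting. This is only one possible sketch: the class name, the `Label: value` rendering, and the sample values are assumptions, not an official format.

```python
from dataclasses import dataclass, fields


@dataclass
class CostarPrompt:
    context: str
    objective: str
    style: str
    tone: str
    audience: str
    response: str

    def render(self) -> str:
        """Emit the six sections in C-O-S-T-A-R order, one labeled block each."""
        return "\n\n".join(
            f"{f.name.capitalize()}: {getattr(self, f.name)}" for f in fields(self)
        )


prompt = CostarPrompt(
    context="We are launching a budgeting app for freelancers next month.",
    objective="Write a launch announcement email.",
    style="Short paragraphs, concrete benefits.",
    tone="Friendly but professional.",
    audience="Existing newsletter subscribers who freelance.",
    response="Subject line plus a body under 150 words.",
).render()
print(prompt)
```

Because every field is required, a missing section fails loudly at construction time instead of silently producing a thinner prompt.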

RISE (Role-Input-Steps-Expectation)

Good for task-driven prompts:

```

Role: Define the role

Input: Input data

Steps: Execution steps

Expectation: Expected output

```

The 5-S Framework

Originally designed for education (Set the Scene, Specify task, Simplify language, Structure response, Share feedback), but works well in enterprise too.

How to choose: RTF for daily quick tasks, RACE or COSTAR for content output, RISE for task-driven work. There’s no “best framework” — only the one that fits.


3. Mistakes You’re Probably Still Making

Ranked by damage. The top ones hurt the most.

3.1 Stuffing Everything Into One Prompt

One prompt that asks for review, translation, formatting, summarization, and analysis all at once — guess what? None of it comes out well.

Models have limited attention. Cram ten tasks into one question, and you’ll get ten half-baked results. Split it up. One prompt, one job.
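In code, splitting up usually means a small pipeline where each step is its own prompt and each output feeds the next. A sketch, with `fake_ask` standing in for a real model call and the three instructions invented for illustration:

```python
from typing import Callable


def run_pipeline(ask: Callable[[str], str], text: str, steps: list[str]) -> str:
    """Run one single-purpose prompt per step, feeding each output into the next."""
    for instruction in steps:
        text = ask(f"{instruction}\n\n{text}")
    return text


# Stand-in for a model call: records each prompt, returns a numbered placeholder.
prompts_sent: list[str] = []


def fake_ask(prompt: str) -> str:
    prompts_sent.append(prompt)
    return f"<output {len(prompts_sent)}>"


steps = [
    "Summarize the following text in three sentences.",
    "Translate the following text into French.",
    "Format the following text as a bulleted list.",
]
result = run_pipeline(fake_ask, "Long source document goes here.", steps)
print(len(prompts_sent), result)  # 3 <output 3>
```

Each step can now be tested, retried, or swapped out without touching the others.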

3.2 Padding With Fluff

Irrelevant information dilutes the model’s focus on what actually matters. A long prompt that is mostly noise will hurt accuracy.

Audit your prompts: does every sentence earn its keep? No? Cut it.

3.3 Expecting the Model to Read Your Mind (No Examples)

Google’s whitepaper says it straight: zero-shot is not the preferred choice. Models keep getting better at zero-shot, but a few good examples still tend to beat none.

Especially when you need precise output formatting — without an example, the model will almost certainly invent its own format.

3.4 Making the Model Wear a Dozen Hats

“You are a 30-year veteran top-tier SEO expert, data scientist, copywriter, and psychological counselor.” Which one is it supposed to be?

Be precise with your role definition. One prompt, one role. That’s enough.

3.5 Putting the Question Before the Data

Put the background data first, then the actual question. The order matters more than you think. Google’s whitepaper explicitly says the question should come *after* the context data.
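A tiny helper makes the ordering hard to get wrong. The `<context>` tags and the function name are just one convention assumed here; any clear delimiter works.

```python
def context_then_question(context: str, question: str) -> str:
    """Place the background data first and the question last."""
    return (
        f"<context>\n{context.strip()}\n</context>\n\n"
        f"Question: {question.strip()}"
    )


prompt = context_then_question(
    "Q3 revenue: $1.2M. Q4 revenue: $1.5M.",
    "By what percentage did revenue grow from Q3 to Q4?",
)
print(prompt)
```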

3.6 Assuming “Politeness” Works Across All Models

How much a model responds to polite phrasing depends on how it was trained, and that differs from model to model. A magic phrase that works on GPT might do absolutely nothing on Claude. Don’t worship magic words. Test instead.

3.7 Shipping Without Testing

Works great in the demo. Crashes and burns in production. Why? The demo ran once — production runs hundreds of times, and variance catches up.

Right approach: prepare a test dataset, run the same prompt repeatedly, and measure consistency and accuracy.
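A minimal harness for that could look like the sketch below. It assumes exact-match grading and uses a scripted `fake_ask` in place of a real model; the dataset, template, and metric definitions are illustrative choices, not a standard.

```python
from collections import Counter
from typing import Callable


def evaluate_prompt(
    ask: Callable[[str], str],
    template: str,
    dataset: list[tuple[str, str]],
    runs: int = 5,
) -> tuple[float, float]:
    """Return (accuracy, consistency) over `runs` repeats of every test item.

    accuracy: share of all outputs that exactly match the expected answer.
    consistency: share of outputs that agree with their item's most common output.
    """
    correct = agreeing = total = 0
    for item, expected in dataset:
        outputs = [ask(template.format(input=item)).strip() for _ in range(runs)]
        correct += sum(out == expected for out in outputs)
        agreeing += Counter(outputs).most_common(1)[0][1]
        total += runs
    return correct / total, agreeing / total


# Stand-in for a model call: flaky on the first item, consistently wrong on the second.
_canned = iter(["positive", "positive", "negative", "positive",
                "neutral", "neutral", "neutral", "neutral"])


def fake_ask(prompt: str) -> str:
    return next(_canned)


dataset = [("Love it.", "positive"), ("Broke in a week.", "negative")]
accuracy, consistency = evaluate_prompt(
    fake_ask, "Classify the sentiment: {input}\nAnswer:", dataset, runs=4
)
print(accuracy, consistency)  # 0.375 0.875
```

Tracking both numbers matters: the second item here is perfectly consistent and completely wrong, which a consistency check alone would never catch.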

3.8 Not Enough Context

The other extreme: your prompt is so short the model has no idea what you’re talking about. “Write a proposal” — what proposal? For who? In what style?

Give enough context. The model can’t read your mind.


4. Quick Self-Checklist

Run through this before you ship a prompt:

– [ ] Does this prompt do just one thing?

– [ ] Enough context but no fluff?

– [ ] If format matters, did you include an example?

– [ ] Single, precise role?

– [ ] Question placed after the data?

– [ ] Acceptance criteria defined?

– [ ] Tested across multiple runs before shipping to production?

– [ ] All filler words cut?

If you change only one thing, make it this: split your tasks. One prompt, one job. The quality lift will be bigger than any framework you learn.


Wrapping Up

Prompt engineering in 2026 is not a black art anymore. The techniques are a handful (CoT, Few-shot, Self-Consistency, ToT, ReAct, Least-to-Most). The frameworks are pick-one-and-go (RTF, RACE, COSTAR, RISE, 5-S). And most people’s mistakes come down to the same root cause: not being precise enough.

Write your prompts like a spec document, not a chat message. Precision, not flashiness — that’s the real core of prompt engineering.

