Guide

From Prompt to Context: Getting More Out of AI

March 18, 2026 · 15 min read

Most people use AI like a search engine. Type a question, hope for a decent answer. Sometimes it works. Usually not well enough.

Meanwhile, teams that know what they're doing build entire systems on top of the same models. The difference isn't the model. It's what you feed the model. The instruction, the structure, the context around it.

Over the past two years, the field has shifted from "prompt engineering" -- carefully wording a question -- to what's now called context engineering: designing everything a model sees before it generates a response. System prompt, tools, documents, memory, conversation history. The prompt is just one piece of a bigger picture.

This article walks through the key techniques, from basics to production. Each with examples you can apply tomorrow.

The basics nobody gets right

The most common mistake is vagueness. "Write something about customer satisfaction" gets you a generic piece. Not because the model is bad, but because it has too little information to produce anything better.

Anthropic calls this the "new employee" test: if you handed your prompt to a smart colleague who just started and has zero context about your company, would they know exactly what to do? If the answer is no, the model has the same problem.

Vague

Write something about customer satisfaction.

Specific

Write a professional 150-word email to a customer who filed a complaint last week about a late delivery. Thank them for the feedback, apologize, and offer a one-time 10% discount on their next order. Tone: warm but businesslike.

The difference isn't subtle. The first prompt gives the model free rein. The second defines exactly what's needed: format, length, audience, content, tone. The more specific you are, the less the model has to guess.

Specific prompts give the model enough context to produce a usable result.

Rules of thumb: always specify the desired format (email, list, JSON, summary). Name the audience. Include constraints (length, tone, what to avoid). And if the task has multiple steps, number them.

Structure: the prompt as a contract

Once your prompts get complex, stuffing everything into a paragraph isn't enough. You need structure. Think of a prompt as a contract: it has clear sections for the role, the instruction, the data, and the expected output.

The most effective way to do this is with clear separation between parts. Anthropic recommends XML tags to separate instructions, context, and data. But the principle works with any model: make it explicit where the instruction ends and the data begins.

Role: You are an experienced financial analyst.

Instruction: Analyze the quarterly figures below
and provide a summary of no more than 200 words.
Focus on revenue growth, margins, and risks.

Context:
- Company: TechCorp B.V.
- Period: Q4 2025
- Sector: SaaS

Data:
[Paste quarterly figures here]

Desired format:
- Summary (max 200 words)
- Top 3 risks (numbered)
- Recommendation (1 sentence)

Four clear blocks: role, instruction, context, output format. The model knows exactly what to do, with what data, and what the result should look like. No room for interpretation.

Role assignment deserves extra attention. "You are an experienced financial analyst" does more than you'd think. It activates a specific register of knowledge and style in the model. A legal advisor writes differently than a marketer. Always name the expertise you need.

If you work with APIs: use the system prompt for the role and fixed instructions, and the user prompt for variable data. That separates configuration from input and makes your prompts reusable.
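A minimal sketch of that separation, using the message format shared by the OpenAI and Anthropic chat APIs. The actual model call is omitted; this only builds the payload, and the sample figures are invented:

```python
# Fixed configuration lives in the system prompt; per-request data goes in
# the user message. The system prompt is written once and reused.

SYSTEM_PROMPT = (
    "You are an experienced financial analyst. "
    "Analyze the quarterly figures provided and give a summary of at most "
    "200 words. Focus on revenue growth, margins, and risks."
)

def build_messages(quarterly_data: str) -> list[dict]:
    """Combine the reusable system prompt with variable input data."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Data:\n{quarterly_data}"},
    ]

# Example payload (figures are illustrative):
messages = build_messages("Q4 2025 revenue: EUR 2.1M (+18% YoY), gross margin 72%")
```

Because the system prompt never changes between requests, you can version it, test it, and swap out only the data.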

Few-shot and chain-of-thought

Sometimes explaining isn't enough. You have to show. That's the principle behind few-shot prompting: you give two or three examples of the desired input and output, and the model learns the pattern.

Classify the customer reviews below as
Positive, Neutral, or Negative.

Review: "Great product, does exactly what I needed."
Classification: Positive

Review: "Delivery took long, but the product is fine."
Classification: Neutral

Review: "Doesn't work. Asked for a refund."
Classification: Negative

Review: "The color is different from the photo, but the
quality is good."
Classification:

Two to three examples are usually enough. More important than quantity is diversity: cover the edge cases. If all your examples are clearly positive or negative, the model won't know how to handle ambiguity.
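When you classify at scale, it pays to build the few-shot prompt programmatically so examples stay in one place. A sketch, using the review examples above (the model call itself is omitted):

```python
# Few-shot examples as data: edit the list, and every prompt updates.
EXAMPLES = [
    ("Great product, does exactly what I needed.", "Positive"),
    ("Delivery took long, but the product is fine.", "Neutral"),
    ("Doesn't work. Asked for a refund.", "Negative"),
]

def few_shot_prompt(review: str) -> str:
    """Build a classification prompt ending where the model should answer."""
    lines = ["Classify the customer reviews below as Positive, Neutral, or Negative.", ""]
    for text, label in EXAMPLES:
        lines += [f'Review: "{text}"', f"Classification: {label}", ""]
    lines += [f'Review: "{review}"', "Classification:"]
    return "\n".join(lines)

prompt = few_shot_prompt("The color is different from the photo, but the quality is good.")
```

Ending the prompt with "Classification:" nudges the model to complete the pattern rather than explain it.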

Then chain-of-thought: having the model reason step by step before giving an answer. For complex tasks -- math, logic puzzles, data analysis -- this improves accuracy dramatically.

Note for 2026: reasoning models work differently

Models like OpenAI o3 and Claude Opus with adaptive thinking do chain-of-thought internally. They already reason step by step behind the scenes. Explicitly asking to "think step by step" can actually hurt performance with these models. OpenAI recommends using simpler prompts with reasoning models and letting the model determine its own thinking sequence. Use manual chain-of-thought only with standard models (GPT-4o, Claude Sonnet) that don't reason on their own.

Prompt chaining: from prompt to pipeline

A single prompt works fine for a single task. But when you want something more complex -- analyze a document, classify it, and trigger an action based on the result -- you hit limits. The solution: prompt chaining. You break a complex task into multiple steps, where the output of step 1 becomes the input for step 2.

The most common pattern is self-correction: generate, review, refine. Step 1 produces a first draft. Step 2 evaluates it against criteria. Step 3 improves based on the evaluation. Each step is a separate prompt (or API call), so you can inspect and adjust intermediate results.
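The generate-review-refine pattern can be sketched in a few lines. Here `call_model` is a stub standing in for a real API call; the point is the shape of the chain, where each step is a separate prompt whose output you can inspect:

```python
def call_model(prompt: str) -> str:
    # Placeholder: in production this would be an API call.
    return f"<model output for: {prompt[:40]}...>"

def self_correcting_draft(task: str, criteria: str) -> dict:
    """Generate, review, refine -- three prompts, three inspectable results."""
    draft = call_model(f"Write a first draft.\nTask: {task}")
    review = call_model(
        f"Evaluate this draft against the criteria.\nCriteria: {criteria}\nDraft: {draft}"
    )
    final = call_model(
        f"Improve the draft using this review.\nReview: {review}\nDraft: {draft}"
    )
    return {"draft": draft, "review": review, "final": final}

result = self_correcting_draft(
    "150-word apology email for a late delivery",
    "warm but businesslike, offers a one-time 10% discount",
)
```

Because every intermediate result is returned, you can log the review step and see exactly why a final draft turned out the way it did.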

A prompt chain splits complex tasks into manageable steps, each independently testable.

A real example from our projects: invoice processing. The chain looks like this:

  1. PDF arrives via email (trigger)
  2. AI model extracts supplier, amount, date, VAT (structured output)
  3. Validation against supplier list and accounting rules
  4. On match: create draft booking in the accounting system
  5. On mismatch: notification to the finance team

Each step has its own prompt, its own validation, and its own error handling. If step 2 can't extract a field, the invoice doesn't proceed to step 3 but goes to a manual queue instead. That's the difference between a demo and a production system.
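The routing logic between steps 3, 4, and 5 can be sketched like this. Supplier names and field names are invented for illustration; in a real system the supplier list would come from your accounting data:

```python
# Step 3 of the chain: validate extracted fields before anything is booked.
KNOWN_SUPPLIERS = {"Acme B.V.", "Mokum Office Supplies"}
manual_queue: list[dict] = []

def route_invoice(extracted: dict) -> str:
    """Continue the chain on a match; otherwise divert to the manual queue."""
    required = {"supplier", "amount", "date", "vat"}
    if not required <= extracted.keys():
        manual_queue.append(extracted)  # extraction incomplete: human review
        return "manual_review"
    if extracted["supplier"] not in KNOWN_SUPPLIERS:
        manual_queue.append(extracted)  # unknown supplier: human review
        return "manual_review"
    return "draft_booking"

ok = route_invoice({"supplier": "Acme B.V.", "amount": 120.0, "date": "2026-03-01", "vat": 21.0})
incomplete = route_invoice({"supplier": "Acme B.V."})
```

The explicit fallback branch is the whole point: the model never gets the final say on whether money moves.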

For reliable data extraction in these pipelines, structured output is essential. Instead of free text, you ask the model to return JSON according to a fixed schema. Both OpenAI and Anthropic offer structured output APIs that enforce the schema, guaranteeing valid JSON in return.

Context engineering: the bigger picture

Everything above is about the prompt itself. But in 2026, the prompt is only a fraction of what determines how well an AI system performs. The rest is context.

Context engineering means designing everything the model sees: the system prompt, available tools, retrieved documents, memory from previous conversations, and conversation history. Not just the instruction, but the full picture.

Context engineering covers the entire system, not just the prompt.

Why this matters: the LangChain State of Agent Engineering report shows that 57% of organizations now run AI agents in production. But 32% cite quality as the biggest obstacle. And those quality issues almost never come from the model itself. They come from poor context: the model is missing information, gets irrelevant documents, or lacks access to the right tools.

In practice, context engineering means:

  • RAG (Retrieval-Augmented Generation) -- retrieving relevant documents and feeding them to the model, instead of cramming everything into the prompt
  • Tool definitions -- telling the model which tools it can use (search, calculate, call APIs) and when
  • Memory -- retaining information from previous conversations so the model doesn't start from scratch every time
  • System prompts -- fixed instructions, behavioral rules, and domain knowledge that accompany every interaction

The difference between a chatbot that says "sorry, I don't know" and one that actually answers your question rarely comes down to the model. It comes down to whether someone took the time to make the right context available.

A concrete example: a customer service assistant. With just a prompt, you get generic answers. Add RAG with your knowledge base, tool access to your CRM, and memory of the ongoing conversation, and you get an assistant that knows the customer by name, can look up order history, and schedule a return. Same model. Completely different experience.

Start tomorrow

Three things you can apply right away:

  1. Test your prompts with the "new employee" test. Hand your prompt to someone without context. If they don't know what to do, rewrite the prompt.
  2. Structure complex prompts. Separate role, instruction, data, and desired format. Use the template from this article as a starting point.
  3. Think beyond the prompt. If you're using AI for a business process, ask yourself: what information is the model missing? What tools should it have? That's where the real difference lives.

Want to apply these techniques in your organization? Get in touch -- we help teams use AI structurally and effectively, not as a toy but as a tool.