Plain English for Every Term.
If you ever hit a term in our docs or workshops you do not recognize, it is here. Operator language without the jargon tax.
Agent
Also: AI agent, autonomous agent
A program that takes a goal in plain English, decides what steps to perform, calls tools to do them, observes the result, and decides what to do next. Different from a chatbot in that an agent actually performs work (writes files, hits APIs, deploys) rather than only producing text.
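The decide, act, observe loop described above can be sketched in a few lines. This is a minimal illustration, not any real product's implementation: `fake_model`, `read_file`, `write_file`, and the file names are hypothetical stand-ins, and a real agent would call an LLM where `fake_model` sits.

```python
from typing import Callable

# Hypothetical tools: plain functions the agent is allowed to call.
def read_file(path: str) -> str:
    return f"(contents of {path})"

def write_file(path: str) -> str:
    return f"wrote {path}"

TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": read_file,
    "write_file": write_file,
}

def fake_model(goal: str, history: list[str]) -> tuple[str, str]:
    """Stand-in for an LLM call: returns (tool_name, argument) or ("done", summary)."""
    if len(history) == 0:
        return ("read_file", "notes.txt")
    if len(history) == 1:
        return ("write_file", "summary.txt")
    return ("done", f"finished: {goal}")

def run_agent(goal: str, max_steps: int = 10) -> str:
    history: list[str] = []
    for _ in range(max_steps):
        action, arg = fake_model(goal, history)          # decide
        if action == "done":
            return arg
        result = TOOLS[action](arg)                      # act
        history.append(f"{action}({arg}) -> {result}")   # observe
    return "stopped: step limit reached"
```

The step limit is there because a loop that decides its own next move needs a hard stop if it never decides it is done.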
Agent orchestration
Coordinating multiple agents to work together. One agent might handle research while another writes code and a third runs tests, with a coordinator agent splitting the task and reading back results. The pattern matters because a solo agent hits its context limit on big tasks; splitting the work gives each agent its own, smaller context to manage.
AGENTS.md
See also: CLAUDE.md
A configuration file Codex and other agents read at the start of a session to learn rules, conventions, and project-specific context. Conceptually similar to CLAUDE.md but used by different tools. The two files often coexist in the same project root.
API
Application Programming Interface
A way for software to talk to other software. When we say "your agent calls the Stripe API," we mean it makes a network request to Stripe and gets data back. APIs are the verbs your agent uses to do real work.
Claude Code
Anthropic's coding agent CLI
A command-line tool from Anthropic that runs Claude as a coding agent in your terminal. Reads your files, writes new ones, runs commands, asks for permission before risky operations. We teach it as one of two recommended primary stacks.
CLAUDE.md
Project config for Claude Code
A markdown file you place at the root of any project. Claude Code reads it at session start and treats the contents as rules, conventions, and context. Think of it as onboarding for the agent. The same pattern works in Cursor and other tools that respect project-level instructions.
Codex
OpenAI's coding agent CLI
OpenAI's coding-focused agent that runs in the terminal. Similar role to Claude Code: reads files, writes code, runs commands, asks permission for destructive operations. We teach it as the second of two co-primary stacks.
Context window
Context length
The maximum amount of text a model can hold in active memory at once, measured in tokens (roughly 0.75 words per token). When you exceed it, the model starts forgetting earlier parts of the conversation. Most modern agents have 100k-1M token windows. Knowing this limit matters because long-running agent sessions need strategies to stay inside it.
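One common strategy for staying inside the window is to drop the oldest messages first. A minimal sketch, assuming the rough 0.75 words-per-token rule; `estimate_tokens` and `trim_to_budget` are illustrative names, and a real tool would count tokens with the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Rough rule of thumb: about 0.75 words per token.
    return round(len(text.split()) / 0.75)

def trim_to_budget(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose estimated tokens fit in the budget."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):   # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))          # restore chronological order
```

Summarizing old messages instead of dropping them is another option; this sketch shows only the simplest one.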
EPEV
Explore, Plan, Execute, Verify
A 4-step ritual for any change that touches more than one file. Explore the codebase, plan the change, execute it step by step, verify it works before claiming done. The opposite of "vibe coding" where you start typing and hope.
Hook
PreToolUse hook, PostToolUse hook
A script that runs automatically before or after an agent uses a tool. Common use: block dangerous commands like rm -rf before they execute. The hook can approve, reject, or modify what the agent was about to do. Configured in .claude/settings.json for Claude Code.
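Here is a sketch of the decision logic such a hook might contain. It assumes the hook receives the pending tool call as JSON with `tool_name` and `tool_input` fields and blocks the call by exiting with code 2; confirm both against the hook documentation for your Claude Code version. The pattern list is illustrative, not a complete safety policy.

```python
import re

# Illustrative pattern only: catches "rm -rf" and "rm -fr", nothing more.
BLOCKED = [re.compile(r"\brm\s+-(rf|fr)\b")]

def decide(event: dict) -> int:
    """Return an exit code: 0 to allow the tool call, 2 to block it (assumed convention)."""
    if event.get("tool_name") != "Bash":
        return 0
    command = event.get("tool_input", {}).get("command", "")
    if any(pattern.search(command) for pattern in BLOCKED):
        return 2
    return 0

# A real hook script would wire this to stdin, roughly:
#   import json, sys
#   sys.exit(decide(json.load(sys.stdin)))
```

Keeping the decision in a plain function makes the hook easy to test before you trust it with real commands.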
LLM
Large Language Model
The underlying model that powers an agent. Claude, GPT, Gemini, Llama, and Qwen are all LLMs. The agent wraps the LLM with tools, memory, and a decision loop, but the LLM is the engine that actually thinks.
Local LLM
On-device model
A model that runs on your own hardware instead of a cloud API. Tools like Ollama let you run DeepSeek, Qwen, Llama, or others locally. Tradeoffs: cheaper and private but slower and capped by your GPU VRAM.
MCP
Model Context Protocol
An open protocol from Anthropic that defines how AI agents talk to external tools and data sources. An MCP server exposes a set of tools (functions the agent can call) and resources (data the agent can read). Lets you give your agent access to your Gmail, Notion, Stripe, custom databases, anything you have an API for.
MCP server
A small program that implements MCP. Talks to one specific external system (Gmail, Stripe, a custom database) and exposes its functionality as tools the agent can call. You can write your own MCP server in TypeScript or Python in an hour.
Operator
AI agent operator, CAIAO
Our preferred word for the human directing the agents. Not a "user," not a "prompt engineer," not a developer. An operator decides what gets built, sets the standards, reviews the output, and keeps the agents on task. The whole site is designed around making you a better operator.
Prompt
The instructions you give an agent. Bad prompts get bad results; good prompts get production-ready output. Our prompt library has 215 examples covering everything from inbox triage to multi-agent coordination.
Prompt engineering
The discipline of writing prompts that consistently get the result you want. Less of a skill than it used to be (modern models are forgiving), more of a habit of being specific about constraints, format, and success criteria.
RAG
Retrieval-Augmented Generation
A pattern where you embed a body of documents into a vector database, then at query time the agent pulls the most relevant chunks and includes them in its context before answering. Lets agents answer questions from your private knowledge base without having to fit it all in the context window. Tools: Qdrant, Chroma, pgvector.
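The retrieve-then-answer flow can be sketched without any database. This toy version uses word counts as the "embedding" and a plain list as the store, both assumptions made so the example is self-contained; a real setup would use a learned embedding model and one of the tools named above.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: word counts. Real systems use a learned embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank every chunk by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    # Only the retrieved chunks go into the context, not the whole knowledge base.
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The prompt that reaches the model contains only the top-ranked chunks, which is the whole point: the knowledge base can be far bigger than the context window.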
Skill
A reusable bundle of agent instructions plus optional code that the agent can invoke on demand. Like a function in normal programming but written in natural language plus tool calls. Our skills library has plug-and-play examples.
Token
The unit LLMs count text in. Roughly 0.75 words per token in English. When a pricing page says "$3 per million input tokens," that is what you pay for every million tokens of prompt you send to the model. Knowing token counts matters because they determine cost and context limits.
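The arithmetic is simple enough to sketch. This assumes the rough 0.75 words-per-token rule and the $3 per million example price; both function names are illustrative, and real token counts depend on the model's tokenizer.

```python
def tokens_from_words(word_count: int) -> int:
    # Rough English estimate: about 0.75 words per token.
    return round(word_count / 0.75)

def input_cost_usd(tokens: int, price_per_million: float) -> float:
    # Pricing pages quote dollars per one million tokens.
    return tokens / 1_000_000 * price_per_million
```

So a 1,500-word prompt is about 2,000 tokens, which at $3 per million input tokens costs roughly $0.006.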
Tool calling
Function calling
When an agent decides to use one of the tools available to it (web search, file write, API call) instead of answering with plain text. Modern models do this autonomously based on the prompt and available tools.
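On the receiving side, something has to notice the model asked for a tool and run it. A minimal sketch, assuming the model emits a JSON object with `tool` and `arguments` keys; that shape and the two tool functions are hypothetical, since each vendor defines its own format.

```python
import json

# Hypothetical tools the runtime exposes to the model.
def web_search(query: str) -> str:
    return f"results for {query!r}"

def file_write(path: str, text: str) -> str:
    return f"wrote {len(text)} characters to {path}"

TOOLS = {"web_search": web_search, "file_write": file_write}

def handle_model_output(output: str) -> str:
    """Run a tool if the model emitted a tool call; otherwise pass the text through."""
    try:
        call = json.loads(output)
    except json.JSONDecodeError:
        return output                      # plain text answer
    if not isinstance(call, dict) or call.get("tool") not in TOOLS:
        return output                      # JSON, but not a known tool call
    return TOOLS[call["tool"]](**call.get("arguments", {}))
```

The model only ever produces the request; your runtime is what actually executes it, which is why permissions and hooks live on this side.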
Vector database
See also: RAG
A database optimized for storing embeddings (vectors that represent the meaning of text) and finding the most similar ones to a query. The retrieval half of RAG. Examples: Qdrant, Pinecone, Chroma, pgvector.
VRAM
Video RAM, GPU memory
The memory on your graphics card. Relevant for local LLMs because the model has to fit in VRAM to run fast. An M3 Pro with 18GB unified memory can hold a 14B parameter model comfortably at 4-bit quantization; a 70B model needs about 40GB at the same precision, or you stream it slowly.
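A back-of-the-envelope way to check whether a model fits: multiply parameter count by bytes per parameter. This sketch assumes weights dominate memory use and ignores the extra room needed for the KV cache and activations, so treat the result as a floor, not a guarantee. `weights_gb` is an illustrative helper.

```python
def weights_gb(params_billions: float, bits_per_param: int) -> float:
    # 1 billion parameters at 8 bits each is 1 GB of weights.
    return params_billions * bits_per_param / 8
```

By this estimate a 14B model at 4-bit is about 7GB of weights and a 70B model at 4-bit is about 35GB, before overhead. The same 70B model at 16-bit would be about 140GB.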
Did we miss one?
If you hit a term in our materials that is not on this page, email hello@express-intent.com and we will add it. Plain English wins.