Glossary

RAG (Retrieval-Augmented Generation)

Retrieval-augmented generation (RAG) is an architecture pattern that connects a large language model to external knowledge sources—documents, databases, APIs—so its responses draw on real, current information rather than relying solely on what it memorized during training. The model retrieves relevant context at query time, then generates an answer grounded in that evidence. RAG is one way enterprises make general-purpose AI useful for their specific business, though the approach is not without its limitations and has been surpassed by simpler read/write/grep tools in many instances.

Related terms:

AI Agent

An AI agent is a system that autonomously breaks a goal into steps—calling tools, reading results, and adjusting course—without waiting for a human prompt. While powerful for tasks with clear success criteria, agents can be dangerous when goals are vague or environments unfamiliar and typically need tight guardrails in production.

System Prompt

A system prompt is an invisible set of instructions given to a language model—defining its persona, constraints, output format, and behavioral rules—and occupies the “system” role in APIs like OpenAI and Anthropic. It shapes every response by encoding business logic and is the most efficient way to control model behavior.

Structured Output

Structured output occurs when a language model returns data in predictable, machine-readable formats—such as JSON, XML, or typed objects—rather than free-form prose, enabling software systems to reliably parse fields like names, dates, and dollar amounts. By using constrained generation to enforce a JSON schema, structured output transforms AI from a conversational interface into a dependable system component.