Glossary

Hallucination

Hallucination is when a language model generates text that sounds confident and plausible but is factually wrong—invented citations, fabricated statistics, nonexistent API endpoints. It happens because LLMs are not databases. They are pattern-completion engines that predict likely next tokens, and sometimes the likeliest continuation is a fluent lie. Hallucination rates vary by model, task, and domain: open-ended creative writing has different tolerances than legal research. Mitigation strategies include retrieval-augmented generation (grounding responses in source documents), chain-of-thought prompting (forcing the model to show its reasoning), and structured output validation. None of these eliminate hallucination entirely. Any system where an LLM's output reaches a customer, a contract, or a database without human review or automated verification is a system waiting to embarrass you.
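One of the mitigations above, structured output validation, can be sketched in a few lines. This is an illustrative example, not a specific library's API: the field names and schema are hypothetical, and the idea is simply that a model's output is parsed and checked against an explicit contract before it can touch a database.

```python
import json

# Hypothetical schema for illustration: required fields and their types.
SCHEMA = {"customer_id": int, "amount": float, "currency": str}

def validate_llm_output(raw: str) -> dict:
    """Parse model output as JSON and reject anything off-schema,
    so a hallucinated or malformed response never reaches the database."""
    data = json.loads(raw)  # raises ValueError on non-JSON output
    for field, expected_type in SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing required field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"{field} is not a {expected_type.__name__}")
    extra = set(data) - set(SCHEMA)
    if extra:
        # A field the schema never defined is a red flag: possibly invented.
        raise ValueError(f"unexpected fields: {extra}")
    return data
```

A well-formed response passes; a response with an invented extra field, a missing field, or non-JSON text raises before anything downstream sees it.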

Related terms:

WWGPTD

WWGPTD began as internal Slack shorthand reminding teams that using AI isn't cheating; it's the essential first step. The accompanying bracelets serve to normalize AI as a fundamental tool for creating better work.

Token

In large language models, a token is the basic unit of text—usually chunks of three to four characters—that the model reads and generates. Since API costs, context windows, and rate limits are all measured in tokens, understanding tokenization is essential for controlling prompt length, cost, and model behavior.
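Because real tokenizers are model-specific (most use byte-pair encoding vocabularies), exact counts require the model's own tokenizer. For quick budgeting, though, the rough average above (about four characters per English token) is enough. A minimal sketch, assuming that heuristic and a placeholder per-1K-token price:

```python
# Rough rule of thumb: English text averages about four characters per
# token. Real tokenizers differ by model, so treat this as a budgeting
# estimate, not an exact count.
CHARS_PER_TOKEN = 4

def estimate_tokens(text: str) -> int:
    """Estimate the token count of a prompt from its character length."""
    return max(1, round(len(text) / CHARS_PER_TOKEN))

def estimate_cost_usd(text: str, price_per_1k_tokens: float) -> float:
    """Estimate API cost; price_per_1k_tokens is a placeholder rate --
    check your provider's current pricing."""
    return estimate_tokens(text) / 1000 * price_per_1k_tokens
```

A 400-character prompt estimates to about 100 tokens; at a hypothetical $0.01 per 1K tokens, that is a tenth of a cent.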

Generative Engine Optimization

Generative engine optimization (GEO) is the practice of structuring content so AI systems—such as ChatGPT, Perplexity, Google AI Overviews, and Bing Copilot—cite and surface it when answering queries. Unlike traditional SEO, GEO prioritizes clear, BLUF-style definitions, structured data markup, and authoritative sourcing, so that LLMs can reliably retrieve and cite the content without wading through filler.
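One concrete GEO tactic named above is structured data markup. A minimal sketch of emitting schema.org `DefinedTerm` JSON-LD for a glossary entry; the example term and definition are placeholders, and whether a given engine consumes this markup is up to that engine:

```python
import json

def glossary_term_jsonld(term: str, definition: str) -> str:
    """Emit schema.org DefinedTerm markup that crawlers and generative
    engines can parse to associate a term with its definition."""
    markup = {
        "@context": "https://schema.org",
        "@type": "DefinedTerm",
        "name": term,
        "description": definition,
    }
    return json.dumps(markup, indent=2)

# Example with placeholder content, ready to embed in a
# <script type="application/ld+json"> tag:
print(glossary_term_jsonld(
    "Hallucination",
    "When a language model generates confident but factually wrong text.",
))
```

Embedding the resulting JSON-LD in the page gives retrieval systems an unambiguous, machine-readable version of the same definition the prose carries.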