Temperature
Temperature is a sampling parameter that controls how random a language model's output is. At temperature 0, the model always picks the most probable next token, producing deterministic, repetitive, but reliable output. At temperature 1, the model samples from its unmodified probability distribution, producing varied, creative, but less predictable output. Higher temperatures (above 1) flatten the distribution further, making output increasingly random, often incoherently so.

The right temperature depends entirely on the task. For extracting structured data from documents, you want temperature 0: you need the same answer every time. For brainstorming marketing copy variations, you want 0.7 to 0.9: you need diversity. Most API defaults sit between 0.7 and 1.0, a reasonable middle ground that few teams bother to adjust. Tuning temperature is one of those small decisions that meaningfully affects output quality and is trivially easy to get right once you understand what it does.
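Mechanically, temperature divides the model's logits before the softmax: low values sharpen the distribution toward the top token, high values flatten it. A minimal sketch of this (with toy logits rather than a real model API; the function name and the convention of treating temperature 0 as greedy argmax are illustrative assumptions):

```python
import math
import random


def sample_with_temperature(logits, temperature, rng=random):
    """Sample a token index from logits after temperature scaling.

    Temperature 0 is treated as greedy decoding (pure argmax).
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Divide logits by temperature, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the resulting distribution.
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]                     # toy next-token scores
print(sample_with_temperature(logits, 0))    # greedy: always index 0
```

At temperature 0.1 the top token is chosen almost every time; at temperature 2.0 the three tokens are sampled with much more nearly equal frequency, which is exactly the determinism-versus-diversity trade-off described above.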
Referenced in these posts:
The Alephic AI Thesis: 2025
The AI revolution will be dictated by three physical constraints—compute packaging capacity, energy availability, and organizational agility—that concentrate...
Related terms:
Inference
Inference is the process of running a trained model on new input to generate a prediction or output—such as sending a prompt to GPT-4 and receiving a...
Conway's Law
Conway’s Law states that organizations designing systems are constrained to produce designs mirroring their own communication structures.
Accretive Software
Accretive software refers to AI platforms that automatically absorb model improvements as margin expansion by treating models as interchangeable components...