Temperature
Temperature is a sampling parameter that controls how random a language model's output is. At temperature 0, the model always picks the most probable next token, producing deterministic, repetitive, but reliable output. At temperature 1, the model samples from its unmodified probability distribution, producing varied, creative, but less predictable output. Higher temperatures (above 1) flatten the distribution further, making output increasingly random, often incoherently so.

The right temperature depends entirely on the task. For extracting structured data from documents, you want temperature 0: you need the same answer every time. For brainstorming marketing copy variations, you want 0.7 to 0.9: you need diversity. Most API defaults sit between 0.7 and 1.0, a reasonable middle ground that few teams bother to adjust. Tuning temperature is one of those small decisions that meaningfully affects output quality and is trivially easy to get right once you understand what it does.
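Mechanically, temperature divides the model's logits before the softmax: low values sharpen the distribution toward the top token, high values flatten it. A minimal sketch of this (with toy logits rather than a real model API; the function name and the convention of treating temperature 0 as greedy argmax are illustrative assumptions):

```python
import math
import random


def sample_with_temperature(logits, temperature, rng=random):
    """Sample a token index from logits after temperature scaling.

    Temperature 0 is treated as greedy decoding (pure argmax).
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Divide logits by temperature, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the resulting distribution.
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]                     # toy next-token scores
print(sample_with_temperature(logits, 0))    # greedy: always index 0
```

At temperature 0.1 the top token is chosen almost every time; at temperature 2.0 the three tokens are sampled with much more nearly equal frequency, which is exactly the determinism-versus-diversity trade-off described above.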
Referenced in these posts:
The Alephic AI Thesis: 2025
The AI revolution will be dictated by three physical constraints—compute packaging capacity, energy availability, and organizational agility—that concentrate...
Related terms:
Inference
Inference is the process of running a trained model on new input to generate a prediction or output—such as sending a prompt to GPT-4 and receiving a...
Conway's Law
Conway’s Law states that organizations designing systems are constrained to produce designs mirroring their own communication structures.
Accretive Software
Accretive software refers to AI platforms that automatically absorb model improvements as margin expansion by treating models as interchangeable components...