Chapter 7: Maximizing model performance with input data
Data to models
Core ideas
Most “bad model” incidents are input problems; models mirror what you feed them.
Veterans optimize consistency alongside peak quality; wild misses destroy trust even when average metrics look fine.
Two levers: context engineering (transient, per-call shaping) vs fine-tuning (persistent weight change). Neither is a universal graduation step.
Context engineering weaves four threads: structured data, retrieved unstructured knowledge, memory over time, and prompt engineering that synthesizes them.
Validate structured inputs before they hit the model; many “hallucinations” are faithful rendering of garbage-in.
Prototype richer prompts with caching in mind before defaulting to fine-tuning; watch obsolescence and versioning when you do tune.
Training framing: unsupervised builds familiarity, supervised builds correctness, preference tuning builds fit (think “train the model like a dog”).
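A minimal sketch of the "validate structured inputs before they hit the model" idea, using the chapter's fitness-app setting. The record fields and thresholds here are illustrative assumptions, not the chapter's schema; the point is that implausible values are caught before the model can faithfully render garbage as a "hallucination".

```python
from dataclasses import dataclass

@dataclass
class WorkoutRecord:
    # Hypothetical fields for a fitness-app record (assumed for illustration).
    user_id: str
    heart_rate_bpm: int
    distance_km: float

def validate(record: WorkoutRecord) -> list[str]:
    """Return a list of problems; an empty list means the record is safe to prompt with."""
    problems = []
    if not record.user_id:
        problems.append("missing user_id")
    if not 30 <= record.heart_rate_bpm <= 250:
        problems.append(f"implausible heart rate: {record.heart_rate_bpm}")
    if record.distance_km < 0:
        problems.append(f"negative distance: {record.distance_km}")
    return problems

good = WorkoutRecord("u1", 142, 5.2)
bad = WorkoutRecord("u1", 999, -3.0)
assert validate(good) == []          # clean record passes through
assert len(validate(bad)) == 2      # bad record is rejected before the model sees it
```

Only records with an empty problem list should be serialized into the prompt; everything else goes to a repair or logging path.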
Principles from the chapter
Models mirror their inputs; if you want better outputs, improve what you feed them.
Consistency is itself a performance metric, one that veterans track and novices rarely do.
When humans unfamiliar with a task can reach the best solution using only the context data provided, research and experience show that good AI models can usually match or exceed human performance given the same context.
Best practice prompt templates create a shared foundation for disciplined variation and adaptation.
A memory feedback mechanism, keyed per task, per user, and per user/task combination, can steadily improve performance over time as it incorporates lessons from previous successes and failures.
Encourage teams to prototype longer, richer prompts with caching in mind to achieve desired business performance metrics before resorting to fine-tuning.
Train the model like a dog: unsupervised builds familiarity, supervised builds correctness, preference tuning builds fit.
When fine-tuning, rigorous versioning and clearly defined checkpoints create a safety net to minimize disruptions when the inevitable snags occur.
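The "prototype richer prompts with caching in mind" principle can be sketched concretely: provider-side prompt caches generally match on a stable prefix, so static material (instructions, schema, examples) belongs first and per-request data last. The template text and function below are illustrative assumptions, not an API from the chapter.

```python
# Static material stays byte-identical across calls, making it cache-eligible.
STATIC_PREFIX = """You are a fitness coach assistant.
Follow the output schema exactly: {"summary": str, "advice": str}.
"""

def build_prompt(static_prefix: str, user_context: str, question: str) -> str:
    # Volatile, per-request content goes last so the shared prefix is unchanged.
    return f"{static_prefix}\n\nUser data:\n{user_context}\n\nQuestion: {question}"

p1 = build_prompt(STATIC_PREFIX, "ran 5 km", "How did I do?")
p2 = build_prompt(STATIC_PREFIX, "ran 10 km", "Pace advice?")
# Both prompts share the identical prefix, so a prefix-matching cache can reuse it.
assert p1.startswith(STATIC_PREFIX) and p2.startswith(STATIC_PREFIX)
```

Structuring templates this way keeps the cost of longer, richer prompts manageable while the team measures whether they meet business metrics before any fine-tuning.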
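The per-task / per-user memory feedback idea can be sketched as a small keyed store; this is a minimal assumption-laden illustration, not the chapter's memory architecture. Notes from past successes and failures are recorded under a (user, task) key, with a wildcard user for task-wide lessons, and recalled at prompt time.

```python
from collections import defaultdict

class MemoryStore:
    """Toy memory keyed by (user, task); '*' as the user marks task-wide notes."""
    def __init__(self) -> None:
        self._notes: dict[tuple[str, str], list[str]] = defaultdict(list)

    def record(self, user: str, task: str, note: str) -> None:
        self._notes[(user, task)].append(note)

    def recall(self, user: str, task: str) -> list[str]:
        # Exact user/task notes first, then task-wide lessons shared by all users.
        return self._notes[(user, task)] + self._notes[("*", task)]

mem = MemoryStore()
mem.record("u1", "plan_run", "u1 prefers morning runs")        # a past success
mem.record("*", "plan_run", "always confirm injury status")    # a past failure's lesson
assert mem.recall("u1", "plan_run") == [
    "u1 prefers morning runs",
    "always confirm injury status",
]
```

Recalled notes are then injected into the prompt context, which is how the feedback loop steadily improves performance over time.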
Read the chapter for…
Tay as cautionary tale, the running fitness-app context example, memory architectures, fine-tuning economics including the “fine-tuning trap,” and worked tuning workflows.