TL;DR
A recent Google whitepaper argues that in AI-assisted software development, the model itself accounts for only about 10% of system behavior. The remaining 90% or so comes from the harness and the context engineering around it. The practical consequence is a shift in focus from model upgrades to configuration, verification, and system design.
A new whitepaper from Google, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that the AI model is only about 10% of the system in modern AI-driven development. The core message is that the harness and context engineering—not the model itself—drive most of the system’s behavior and performance. This insight challenges common assumptions and has significant implications for how organizations invest in AI tools.
The whitepaper highlights that, contrary to popular belief, the model’s capabilities are only a small part of what determines an AI system’s success. Instead, configuration, tooling, and context management—collectively called the harness—are responsible for approximately 90% of the system’s behavior. This includes prompts, rule sets, tools, and observability mechanisms that shape how the model functions within a larger framework.
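To make the idea of a harness concrete, here is a minimal sketch of the four ingredients named above (prompts, rule sets, tools, and observability) wrapped around a model. It is not taken from the whitepaper: the `Harness` class, the `stub_model` function, and the "CALL tool arg" reply convention are all hypothetical names chosen for illustration, and the model is a stand-in function rather than a real API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: a "model" is anything that maps a context string to a reply.
Model = Callable[[str], str]


@dataclass
class Harness:
    system_prompt: str
    rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)  # observability log

    def build_context(self, task: str) -> str:
        # Everything the model sees is assembled here, not inside the model.
        lines = [self.system_prompt]
        lines += [f"RULE: {rule}" for rule in self.rules]
        lines += [f"TOOL: {name}" for name in sorted(self.tools)]
        lines.append(f"TASK: {task}")
        return "\n".join(lines)

    def run(self, model: Model, task: str) -> str:
        context = self.build_context(task)
        self.trace.append(f"context_chars={len(context)}")
        reply = model(context)
        # Toy convention (assumed): a reply "CALL <tool> <arg>" requests a tool.
        if reply.startswith("CALL "):
            _, name, arg = reply.split(" ", 2)
            self.trace.append(f"tool_call={name}")
            if name in self.tools:
                reply = self.tools[name](arg)
            else:
                reply = f"error: unknown tool {name}"
        self.trace.append(f"reply={reply}")
        return reply


def stub_model(context: str) -> str:
    # Stand-in for a real model: it uses a tool only if the harness offers it.
    if "TOOL: upper" in context:
        return "CALL upper hello"
    return "hello"


bare = Harness("You are a coding agent.")
equipped = Harness(
    "You are a coding agent.",
    rules=["no network access"],
    tools={"upper": str.upper},
)
print(bare.run(stub_model, "greet"))
print(equipped.run(stub_model, "greet"))
print(equipped.trace)
```

The same `stub_model` produces different results depending only on what the harness puts in front of it, which is the pattern the paper describes at a much larger scale.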
Evidence cited in the paper shows that modifications to the harness, such as changing prompts or adding tools, can dramatically improve performance. For instance, a team improved a coding agent’s ranking from outside the top 30 to within the top 5 by adjusting only the harness, with no change to the underlying model. Similarly, tweaking middleware increased an agent’s benchmark score by nearly 14 points.
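The comparisons above share one experimental shape: hold the model fixed, change only the harness, and score both variants on the same tasks. The sketch below shows that shape with a toy model and a three-item task list. The names `fixed_model`, `TASKS`, and `score` are hypothetical, the tasks are not from any real benchmark, and the scores say nothing about the results reported in the paper.

```python
from typing import Callable, List, Tuple

Model = Callable[[str], str]


def fixed_model(prompt: str) -> str:
    # Toy stand-in for a model whose weights never change between runs.
    # It always computes the right sum, but only formats the answer as
    # bare digits when the prompt asks for that.
    question = prompt.rsplit("Q: ", 1)[-1]
    a, b = (int(part) for part in question.split("+"))
    if "Answer with digits only." in prompt:
        return str(a + b)
    return f"The sum is {a + b}."


# Hypothetical mini-benchmark: (question, exact expected answer).
TASKS: List[Tuple[str, str]] = [("2+3", "5"), ("10+7", "17"), ("4+4", "8")]


def score(model: Model, prompt_prefix: str) -> float:
    # Fraction of tasks where the reply matches the expected answer exactly.
    passed = sum(
        model(f"{prompt_prefix}Q: {question}") == expected
        for question, expected in TASKS
    )
    return passed / len(TASKS)


baseline = score(fixed_model, "")
tuned = score(fixed_model, "Answer with digits only.\n")
print(f"baseline harness: {baseline:.2f}")
print(f"tuned harness:    {tuned:.2f}")
```

Because the model function is identical in both runs, any difference in score is attributable to the prompt prefix alone, which is what makes this kind of comparison a fair test of the harness.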
The authors argue that the focus should shift from chasing newer, larger models to developing and owning the harness infrastructure—since this is where the real control and competitive advantage lie. They also emphasize that most failures in AI agents are due to configuration errors, missing tools, or poor context management, not the model’s inherent limitations.
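If most agent failures come from configuration rather than the model, one inexpensive response is to check the harness before making any model call. The following sketch assumes three simple checks (an empty system prompt, missing tools, and an over-budget context). The `preflight` function, its parameters, and the tool names are hypothetical and are not described in the whitepaper.

```python
from typing import List, Set


def preflight(
    system_prompt: str,
    required_tools: Set[str],
    available_tools: Set[str],
    context_chars: int,
    context_budget: int,
) -> List[str]:
    """Return harness problems found before any model call is made."""
    issues: List[str] = []
    if not system_prompt.strip():
        issues.append("empty system prompt")
    missing = sorted(required_tools - available_tools)
    if missing:
        issues.append("missing tools: " + ", ".join(missing))
    if context_chars > context_budget:
        issues.append(f"context too large: {context_chars} > {context_budget}")
    return issues


issues = preflight(
    system_prompt="You are a coding agent.",
    required_tools={"run_tests", "read_file"},
    available_tools={"read_file"},
    context_chars=12_000,
    context_budget=8_000,
)
for issue in issues:
    print(issue)
```

A failed run that also fails these checks is a harness problem first, and swapping in a different model would not be expected to fix it.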
The model is only 10%
A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.
The whitepaper is the clearest map yet of how serious AI development works, and it is mostly tool-agnostic. It is also a Google funnel: the concepts are neutral, but the on-ramps point to Gemini, Jules, and the ADK. If the harness is 90% of the system and it is yours, then your moat and your costs both live there. So own your scaffolding, route across models, and remember that AI amplifies whatever engineering culture it lands in.
Implications for AI Development Strategies
This shift in understanding has major implications for organizations deploying AI. Instead of investing heavily in acquiring or developing the latest models, companies should prioritize building robust harnesses: the tools, prompts, and verification processes that shape and control AI behavior. This approach can reduce costs and improve reliability, because a harness the team owns is easier to manage and customize than a strategy of constantly chasing model improvements.
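As one illustration of a verification process, the sketch below wraps a model call in a gate: the output is accepted only if a check passes, and a failed check is fed back into the next attempt. The check here is deliberately simple (does the generated text parse as Python, using the standard library `ast` module). The names `generate_verified` and `flaky_model` are hypothetical, and real verification would normally include tests, linters, or review on top of a syntax check.

```python
import ast
from typing import Callable, Optional

Model = Callable[[str], str]


def is_valid_python(source: str) -> bool:
    # Syntax-only check; it does not prove the code is correct.
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def generate_verified(
    model: Model,
    task: str,
    verify: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first candidate that passes `verify`, or None if none does."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = model(task + feedback)
        if verify(candidate):
            return candidate
        feedback = "\nPrevious attempt failed verification; fix it."
    return None


def flaky_model(prompt: str) -> str:
    # Stand-in model: emits broken code first, valid code after feedback.
    if "failed verification" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b) return a + b"


result = generate_verified(flaky_model, "Write add(a, b).", is_valid_python)
print(result)
```

The gate lives entirely in the harness, so it works unchanged with any model that fits the `Model` signature.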
Furthermore, recognizing that most failures are configuration-related underscores the importance of expertise in context engineering and system design. This reorientation could democratize AI development, making it accessible to teams that focus on system architecture rather than solely on model training or fine-tuning.
Overall, this perspective encourages a more strategic, system-oriented approach to AI, emphasizing control, verification, and cost-efficiency over raw model power.

Background on AI System Design and Recent Findings
The common narrative in AI development has been that larger, more sophisticated models are the key to better performance. However, the recent Google whitepaper challenges this by presenting evidence that the model itself accounts for only about 10% of the overall system behavior.
This insight aligns with ongoing industry observations: many AI failures stem from poor configuration, missing tools, or inadequate context management. The paper builds on earlier discussions about the importance of system design, testing, and verification—highlighting that these aspects are often overlooked in favor of model size and complexity.
Before this paper, the industry had seen rapid model improvements while practical deployment problems persisted, which suggests that the bottleneck is less the model than how it is integrated and controlled within a system. The whitepaper formalizes this understanding and provides concrete examples of the outsized influence of harness design.
“The model is only 10% of what determines AI system behavior; the harness and context are the other 90%.”
— Addy Osmani
Remaining Uncertainties in AI System Optimization
While the whitepaper provides compelling examples of the harness and context dominating outcomes, it does not offer a general measure of how much performance improvement can be achieved through system configuration alone. It remains unclear how these principles scale across different domains or hold up as models advance. The long-term impact of this shift on AI development costs and organizational structures is also still an open question.
Next Steps for AI Development and Organizational Adoption
Organizations are likely to reevaluate their AI investment strategies, focusing more on developing sophisticated harnesses, verification processes, and system architecture. Industry leaders may prioritize training in system design and context engineering. Further research and case studies are expected to validate and refine these insights, potentially leading to new standards in AI deployment practices.
Additionally, tool vendors and AI platform providers may offer more configurable frameworks, emphasizing harness components and verification tools to capitalize on this paradigm shift.
Key Questions
Why is the model only 10% of the system?
The whitepaper shows that most of the AI system’s behavior depends on how the model is configured, controlled, and integrated—collectively called the harness—which includes prompts, tools, rules, and observability mechanisms.
How can organizations improve AI performance according to this new insight?
By focusing on building and refining the harness—such as better prompts, tools, verification, and context management—rather than solely investing in larger or newer models.
Does this mean smaller models are better?
Not necessarily. The insight is that the model’s size is less critical than how it is used and controlled within the system. Effective harness design can significantly enhance performance even with smaller models.
What skills should AI teams develop now?
Focus on system architecture, context engineering, verification, and tooling—skills that enable effective harness design and management.
Will this change how AI products are built?
Yes. Emphasis will shift from model development to system configuration, testing, and verification, leading to more reliable and cost-effective AI solutions.
Source: ThorstenMeyerAI.com