TDD: The Missing Protocol for Effective AI-Assisted Software Development
Large Language Models Aren't as Simple as They Seem
Large language models (LLMs) have a fundamental flaw: they appear deceptively easy to use. That blinking cursor invites you to interact with an LLM as if it were another human being—one that understands your intent, the context of your questions, and the logic behind your requests. You type a prompt, hit send, and then disappointment sets in.
While it's exciting to watch an LLM generate multiple files of code for your new project, the moment you try to run that code, it often fails to work as intended or doesn't run at all. As you continue prompting the model to iterate on its previous outputs, it may go in circles, change direction entirely, or simply stall.
The core issue is that we ask LLMs to do too much while providing too little direction and context—effectively setting them up for failure. It's like the classic PB&J experiment, where a father follows his children's sandwich-making instructions literally.
We believe we're being clear, but we drastically underestimate how much implicit context AI lacks about the problems we're trying to solve. What we need is a better communication protocol—one that structures our requests in a way AI tools can reliably understand and execute.
Why AI Struggles With Large, Ambiguous Problems
Despite their impressive capabilities, current LLMs consistently struggle with large, vague problems. The issue is not primarily token limitations or technical constraints; it's a deeper problem in how we frame our requests. When developers ask AI to "build a complete authentication system" or "create an e-commerce checkout flow," they inadvertently set the AI up to fail by:
- Providing insufficient context: LLMs lack the shared understanding that human teams build over time—such as architectural preferences, coding standards, and business constraints.
- Requesting solutions without boundaries: Without clear constraints, the AI must make countless assumptions, from state management approaches to error handling strategies.
- Omitting critical edge cases: Complex problems contain edge cases that, while obvious to experienced developers, are not explicitly stated for the AI.
The challenge isn't that AI can't generate complex code—it's that humans struggle to fully articulate complex problems in a way machines can understand. This is why Test-Driven Development (TDD) is such a powerful framework: it breaks problems into small, testable behaviors, providing structured context for the AI to generate focused, useful solutions.
TDD: A Communication Protocol for AI
Test-Driven Development is the practice of writing tests before implementing production code. To be clear, this doesn't require a dogmatic approach, nor does it mean you have to write every test yourself. TDD functions as a highly effective protocol for communicating objectives, constraints, and context to AI tools. Writing tests first offers several key benefits:
- It ensures tests get written: We've all made promises to write tests after building features, only to skip them under time pressure or write vague tests that don't actually validate anything.
- It breaks problems into manageable pieces: Writing tests forces you to think through expected behavior and uncover edge cases. This improves your solution's architecture and ensures each function is implemented with precision.
- It documents intended behavior: Tests make your code's functionality understandable to other developers (and your future self). Unlike documentation, which quickly becomes outdated, tests stay accurate because they must keep passing as the code evolves, and they live alongside your production code.
These benefits directly align with what AI needs: clear specifications, manageable scope, and well-defined edge cases. By writing tests, you create a shared language that both humans and AI can understand.
A Practical TDD Workflow With AI
Here's how a typical AI-assisted TDD workflow might look:
- Define test descriptions: Write descriptive test cases covering all requirements and edge cases.
- Implement a seed test: Write one complete test to establish conventions and patterns.
- Generate remaining tests: Use AI to complete the other test implementations.
- Review and refine: Validate that the AI-generated tests cover meaningful scenarios.
- Generate implementation code: With test requirements in place, use AI to generate the corresponding component or feature.
- Test and iterate: Run your test suite, fix any failures, and refine both code and tests as needed.
This creates a virtuous cycle: each step provides clearer context for the next, improving the AI's performance as you go. Here's an example of the first step—writing just the descriptions:
```jsx
describe('PasswordStrengthMeter', () => {
  it.todo('allows a user to submit a password that meets all criteria')

  describe('when the password does not meet the required length', () => {
    it.todo('the Submit button is disabled')
    it.todo(
      'displays the error "Too weak: Password must be at least 8 characters" for passwords shorter than 8 characters'
    )
  })

  describe('when the password does not include special characters', () => {
    it.todo('the Submit button is disabled')
    it.todo(
      'displays the error "Password needs to include special characters (ex. !@#$%)" for passwords with at least 8 characters but no special chars'
    )
  })
})
```
At this point, you're only outlining behavior—no implementation code yet. Next, implement the first test to establish patterns for the LLM:
```jsx
import { render, screen, fireEvent } from '@testing-library/react';
// Assumed import path for the component under test
import { PasswordStrengthMeter } from './PasswordStrengthMeter';

it('allows a user to submit a password that meets all criteria', () => {
  render(<PasswordStrengthMeter />);

  const input = screen.getByLabelText('Password');
  const submitButton = screen.getByRole('button', { name: /submit/i });

  // Enter a valid password
  fireEvent.change(input, { target: { value: 'StrongP@ss123' } });

  // Submit button should be enabled
  expect(submitButton).not.toBeDisabled();
});
```
Now you've provided the scaffolding and context the AI needs to generate additional tests in a consistent format. Tools like Cursor or GitHub Copilot can fill in the rest effectively because your prompt is structured and grounded in specific intent.
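For example, given the seed test above, an AI-completed version of one of the remaining `it.todo` cases might look like this (a plausible output that follows the established conventions, not a canonical one):

```jsx
it('the Submit button is disabled', () => {
  render(<PasswordStrengthMeter />);

  // A password below the 8-character minimum from the test descriptions
  fireEvent.change(screen.getByLabelText('Password'), {
    target: { value: 'short' },
  });

  expect(screen.getByRole('button', { name: /submit/i })).toBeDisabled();
});
```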
Once the tests are in place, you can ask the AI to generate the actual React component. With explicit requirements and automated validation, you remove guesswork and minimize manual debugging.
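The exact output will vary by tool and prompt, but a minimal sketch of a component that would satisfy these tests might look like the following. The MIN_LENGTH constant, the SPECIAL_CHARS pattern, and the markup are assumptions inferred from the test descriptions, not a prescribed implementation:

```jsx
import { useState } from 'react';

// Assumed rules, inferred from the test descriptions above
const MIN_LENGTH = 8;
const SPECIAL_CHARS = /[!@#$%]/;

export function PasswordStrengthMeter() {
  const [password, setPassword] = useState('');

  const longEnough = password.length >= MIN_LENGTH;
  const hasSpecial = SPECIAL_CHARS.test(password);

  // Surface the first failing rule as an error message
  let error = null;
  if (password && !longEnough) {
    error = 'Too weak: Password must be at least 8 characters';
  } else if (password && !hasSpecial) {
    error = 'Password needs to include special characters (ex. !@#$%)';
  }

  return (
    <form>
      <label htmlFor="password">Password</label>
      <input
        id="password"
        type="password"
        value={password}
        onChange={(e) => setPassword(e.target.value)}
      />
      {error && <p role="alert">{error}</p>}
      <button type="submit" disabled={!(longEnough && hasSpecial)}>
        Submit
      </button>
    </form>
  );
}
```

Whether the suite passes (for example, via npx jest or npx vitest run) becomes the acceptance criterion: instead of eyeballing every generated line, you let the tests you wrote up front arbitrate correctness.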
Benefits Beyond Better Code
While improving code quality is a clear benefit, TDD also enhances AI collaboration in several other ways:
- Reduced context switching: A consistent protocol means less time spent re-explaining requirements. You can stay within your IDE using tools like Cursor, GitHub Copilot, or Amazon Q.
- Improved team alignment: Tests become a shared language for human and AI collaborators alike.
- Persistent documentation: Your test suite becomes living documentation that retains value across sessions and contributors.
By adopting TDD as a foundation for AI-assisted development, you're not only writing better code—you're creating a scalable, efficient communication channel between yourself and your AI tools. The tests you scaffold become shared specifications that align developers, product managers, and AI systems, dramatically reducing rework and misunderstandings.
The goal isn't to replace human developers but to offload repetitive tasks so we can focus on creativity and architecture—where human expertise is irreplaceable. Start your next feature by writing tests first, then let AI help implement the solution. You'll deliver higher-quality code faster, with greater confidence.