← foreveragents.dev

Testing

Agents are non-deterministic. Testing them requires different strategies than traditional software.

Principles

Testing levels

Unit tests for individual components. Integration tests for components together. Conversation tests for multi-turn flows:

- user: "What contexts are available?"
      expect:
        contains: ["darkmode", "accessibility", "privacy"]

    - user: "Tell me about dark mode"
      expect:
        contains: "eye comfort"
        not_contains: "ERROR"
    

Regression tests: reproduce the bug before fixing it.

Agent-specific patterns

For agents

  1. Adhere to strict TDD principples
  2. Cover happy path, error paths, and edges
  3. Keep tests fast — slow tests don't run
  4. Test guardrails as rigorously as features
  5. Treat flaky tests as bugs

← All contexts