Testing
Test-driven development, running tests, coverage targets, and test structure.
TDD is Non-Negotiable
Matrix OS follows strict Test-Driven Development. Every feature starts with a failing test:
- Red -- write a failing test that describes the desired behavior
- Green -- write the minimum code to make it pass
- Refactor -- clean up while keeping tests green
No implementation without a failing test
If a test can't be written for a feature, its necessity is questioned. This is a core development principle.
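The Red and Green steps above can be sketched as follows. This is a minimal, dependency-free illustration, not code from Matrix OS: formatInterval and expectEqual are hypothetical names, and expectEqual stands in for whatever assertion API the project's test runner provides.

```typescript
// Hypothetical stand-in for a test runner's assertion helper.
function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    throw new Error(`expected ${String(expected)}, got ${String(actual)}`);
  }
}

// Red: this test is written first. Until formatInterval is implemented
// it fails, which confirms the test really exercises the behavior.
function testFormatInterval(): void {
  expectEqual(formatInterval(500), "500ms");
  expectEqual(formatInterval(2000), "2s");
}

// Green: the minimum code that makes the test pass (hypothetical function).
function formatInterval(ms: number): string {
  return ms < 1000 ? `${ms}ms` : `${ms / 1000}s`;
}

// Refactor: any later cleanup is safe as long as this call keeps passing.
testFormatInterval();
```

The point of the ordering is that the test defines the behavior before any implementation exists, so the implementation never grows beyond what a test demands.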
Coverage Target
The target is 99-100% coverage across the kernel and gateway packages. Current state: 993+ tests across 85+ test files.
Running Tests
bun run test # Unit tests (~993 tests, ~11s)
bun run test:watch # Watch mode for development
bun run test:coverage # Generate coverage report
bun run test:integration # Integration tests (needs ANTHROPIC_API_KEY)
Test Structure
Tests live in tests/ at the project root, organized by package:
tests/
  kernel/ # Kernel unit tests
    spawn.test.ts # spawnKernel() tests
    options.test.ts # kernelOptions() tests
    prompt.test.ts # buildSystemPrompt() tests
    ipc-server.test.ts # IPC tool tests
    agents.test.ts # Agent loading and parsing tests
    hooks.test.ts # Hook behavior tests
    soul.test.ts # SOUL identity tests
  gateway/ # Gateway unit tests
    dispatcher.test.ts # Dispatch queue tests
    watcher.test.ts # File watcher tests
    channels/ # Channel adapter tests
    cron.test.ts # Cron service tests
    heartbeat.test.ts # Heartbeat tests
  shell/ # Shell component tests
  integration/ # Integration tests (real API calls)
  e2e/ # End-to-end tests (101 tests across 13 files)
Test Categories
Unit Tests
Pure function tests with mocked dependencies. Use vi.mock() for external services. One test file per source module.
Integration Tests
Make real API calls to Claude. Requirements:
- ANTHROPIC_API_KEY environment variable set
- Uses Claude Haiku model to keep costs under $0.10 per run
- Run separately: bun run test:integration
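One way to honor the ANTHROPIC_API_KEY requirement is to check for the key before any real API call is attempted. The sketch below is an assumption about how such a guard might look, not the project's actual setup: integrationEnabled is a hypothetical helper, and the environment is passed in explicitly so the function stays easy to test.

```typescript
// Hypothetical guard: integration tests should run only when a
// non-empty ANTHROPIC_API_KEY is present in the given environment.
function integrationEnabled(env: Record<string, string | undefined>): boolean {
  const key = env["ANTHROPIC_API_KEY"];
  return typeof key === "string" && key.trim() !== "";
}

// Example inputs (placeholder values, not real credentials).
const withKey = integrationEnabled({ ANTHROPIC_API_KEY: "placeholder-key" });
const withoutKey = integrationEnabled({});
const blankKey = integrationEnabled({ ANTHROPIC_API_KEY: "   " });
```

An integration suite could call this with the process environment and skip itself when it returns false, so a missing key shows up as a skip rather than a failed API call.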
Contract Tests
Verify IPC tool schemas match their implementations. Ensure Zod schemas correctly validate inputs and tools return expected shapes.
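The idea of a contract test can be illustrated without any dependencies. In the sketch below, a hand-rolled Shape type and conforms function stand in for Zod schemas so the example runs on its own; echoTool is a hypothetical tool, not one of the real IPC tools. A real contract test would presumably use the tool's own Zod schema in place of conforms.

```typescript
// Dependency-free stand-in for a Zod object schema: maps each field
// name to the `typeof` result expected for it.
type Shape = Record<string, "string" | "number" | "boolean">;

// Returns true when every declared field is present with the declared type.
function conforms(shape: Shape, value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return Object.keys(shape).every((key) => typeof record[key] === shape[key]);
}

// Hypothetical IPC tool: declared input/output shapes plus an implementation.
const echoTool = {
  input: { message: "string" } as Shape,
  output: { message: "string", length: "number" } as Shape,
  run(input: { message: string }) {
    return { message: input.message, length: input.message.length };
  },
};

// Contract checks: the schema accepts valid input, rejects invalid input,
// and the implementation returns the shape it declares.
function contractHolds(): boolean {
  const valid = { message: "hello" };
  const acceptsValid = conforms(echoTool.input, valid);
  const rejectsInvalid = !conforms(echoTool.input, { message: 42 });
  const outputMatches = conforms(echoTool.output, echoTool.run(valid));
  return acceptsValid && rejectsInvalid && outputMatches;
}
```

The value of the test is that schema and implementation are checked against each other, so a change to either one that breaks the agreement fails immediately.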
E2E Tests
Full system tests covering 101 scenarios across 13 files. Test the complete flow from user input through the gateway to kernel response.
Spike Before Spec
For undocumented SDK behavior, write a throwaway spike test against the real SDK before committing to an approach. Spike files go in spike/ and are excluded from the main test suite. This prevents building on unverified assumptions.
Writing Tests
Conventions:
- Descriptive describe blocks matching module structure
- Test names describe the expected behavior, not the implementation
- Use vi.mock() for external dependencies (API calls, file system)
- Keep tests focused -- one assertion per it block when practical
- Integration tests are clearly separated and use Claude Haiku
