
Development Patterns

Key patterns and principles from building Nexus (Conductor + AI Learning Service).

Core Philosophy

1. Deterministic Orchestration

The Conductor is deterministic TypeScript - no AI in the orchestration logic.

Why: The Conductor cannot be "reasoned with", so enforcement is mechanical and predictable.

typescript
// ✅ GOOD - Deterministic
if (!checklist.allItemsComplete()) {
  throw new Error('Checklist incomplete');
}

// ❌ BAD - AI-influenced
const response = await ai.ask('Can we skip this?');

2. Evidence-Based Validation

Never trust claims. Always check actual artifacts.

Lesson from the path confusion incident: Agents claimed to be working on production code but were actually working in a deleted directory.

typescript
// ✅ GOOD - Verify (fs is fs-extra here)
if (!(await fs.pathExists(workingDir))) {
  throw new Error(`Working directory not found: ${workingDir}`);
}
const files = await fs.readdir(workingDir);
const hasPackageJson = files.includes('package.json');

// ❌ BAD - Trust
const config = { workingDir: userInput }; // Assume valid

3. Fresh Validation Context

Validators should have ZERO knowledge of builder work.

Adversarial approach prevents confirmation bias.

typescript
// Builder saves manifest to Redis
await redis.set(`ticket:${id}:manifest`, JSON.stringify(manifest));

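// Validator gets ONLY a pointer to the manifest saved above (no shared memory).
// The option name is illustrative; pass whatever reference spawnValidator expects.
const validator = await spawnValidator({ manifestKey: `ticket:${id}:manifest` });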

Multi-Layered Prevention

When fixing bugs, build 4 layers of protection:

  1. Configuration Time - Validate at input (catch early)
  2. Dashboard Time - Visual alerts (impossible to miss)
  3. Run Time - Explicit checks (warn during work)
  4. Documentation Time - Process guides (prevent repeat)

Example: Path confusion prevention

  • Layer 1: Real-time path validation API in config UI
  • Layer 2: Red dashboard alert for invalid paths
  • Layer 3: Agent context warnings about common mistakes
  • Layer 4: Incident report + prevention guide

See Path Confusion Prevention for full implementation.
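
As a rough illustration of Layer 1, the sketch below checks a configured working directory against the real filesystem before it is accepted. The function name `checkWorkingDir` and the `PathCheck` result shape are hypothetical, and the `package.json` check is only an assumption about what marks a valid project root; the actual validation API may look quite different.

```typescript
import { existsSync, readdirSync, statSync } from 'node:fs';

// Hypothetical result shape; the real validation API may differ.
interface PathCheck {
  valid: boolean;
  reason?: string;
}

// Checks the artifact on disk instead of trusting the configured string.
function checkWorkingDir(workingDir: string): PathCheck {
  if (!existsSync(workingDir)) {
    return { valid: false, reason: `Path does not exist: ${workingDir}` };
  }
  if (!statSync(workingDir).isDirectory()) {
    return { valid: false, reason: `Not a directory: ${workingDir}` };
  }
  // Assumption: a valid project root contains a package.json.
  if (!readdirSync(workingDir).includes('package.json')) {
    return { valid: false, reason: `No package.json in ${workingDir}` };
  }
  return { valid: true };
}
```

Returning a reason string rather than throwing lets the same check feed both the config UI (Layer 1) and a dashboard alert (Layer 2).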

Incident Response

When things break:

  1. Investigate - Don't make hasty decisions
  2. Implement Multi-Layer - Prevention at all 4 layers
  3. Document - Incident report + prevention guide
  4. Verify - Test that prevention actually works

Example: October 10, 2025 Path Confusion

State Management

Redis is a Cache

Critical lesson from the data loss incident: Budget data stored ONLY in Redis was lost when Redis was flushed.

Rules:

  • Redis for operational state (fast, volatile)
  • External storage for financial/audit data (durable)
  • Back up critical Redis data regularly
  • Metrics reconstructable from GitHub PRs
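
One way to apply these rules is to write the durable copy first and treat Redis purely as a mirror that can be rebuilt. The sketch below is a minimal illustration under stated assumptions: `KeyValueStore`, `MemoryStore`, `recordBudget`, and `readBudget` are hypothetical names, and the in-memory class stands in for both a database client and a Redis client so the example runs without external services.

```typescript
// Hypothetical storage interface; in production one implementation would wrap
// a durable database and another would wrap Redis.
interface KeyValueStore {
  set(key: string, value: string): Promise<void>;
  get(key: string): Promise<string | null>;
}

// In-memory stand-in so the sketch runs without external services.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();

  async set(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }

  async get(key: string): Promise<string | null> {
    return this.data.get(key) ?? null;
  }

  // Simulates a cache flush.
  flush(): void {
    this.data.clear();
  }
}

// Write the durable copy first; the cache is only a fast mirror.
async function recordBudget(
  durable: KeyValueStore,
  cache: KeyValueStore,
  ticketId: string,
  amountUsd: number,
): Promise<void> {
  const key = `ticket:${ticketId}:budget`;
  const value = amountUsd.toFixed(2);
  await durable.set(key, value);
  await cache.set(key, value);
}

// Read from the cache, falling back to durable storage and repopulating the cache.
async function readBudget(
  durable: KeyValueStore,
  cache: KeyValueStore,
  ticketId: string,
): Promise<number | null> {
  const key = `ticket:${ticketId}:budget`;
  const cached = await cache.get(key);
  if (cached !== null) return Number(cached);

  const stored = await durable.get(key);
  if (stored !== null) await cache.set(key, stored);
  return stored === null ? null : Number(stored);
}
```

With this ordering, a flushed cache costs one slower read instead of losing the budget record.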

Redis Key Patterns

project:{projectId}:active_ticket
project:{projectId}:config
project:{projectId}:completed_tickets
project:{projectId}:budget_total

ticket:{ticketId}:state
ticket:{ticketId}:manifest
ticket:{ticketId}:validation_report
ticket:{ticketId}:metrics
ticket:{ticketId}:budget

conductor:health
conductor:last_poll:{projectId}
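
Building these keys through a single helper keeps the layout consistent across the codebase. The `keys` object below is a hypothetical sketch of such a helper, not the project's actual module; only the key strings themselves come from the list above.

```typescript
// Hypothetical helper: one place that knows the key layout listed above.
const keys = {
  project: {
    activeTicket: (projectId: string) => `project:${projectId}:active_ticket`,
    config: (projectId: string) => `project:${projectId}:config`,
    completedTickets: (projectId: string) => `project:${projectId}:completed_tickets`,
    budgetTotal: (projectId: string) => `project:${projectId}:budget_total`,
  },
  ticket: {
    state: (ticketId: string) => `ticket:${ticketId}:state`,
    manifest: (ticketId: string) => `ticket:${ticketId}:manifest`,
    validationReport: (ticketId: string) => `ticket:${ticketId}:validation_report`,
    metrics: (ticketId: string) => `ticket:${ticketId}:metrics`,
    budget: (ticketId: string) => `ticket:${ticketId}:budget`,
  },
  conductor: {
    health: () => 'conductor:health',
    lastPoll: (projectId: string) => `conductor:last_poll:${projectId}`,
  },
} as const;
```

For example, `keys.ticket.manifest(id)` produces the same key the builder writes in the Fresh Validation Context example.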

State Transitions

pending → building → validating → completed
              ↘         ↙
                failed

All transitions are deterministic. No AI decision-making in state changes.
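
A lookup table is one simple way to keep transitions mechanical. The sketch below is an assumption-laden illustration: the names `TicketState`, `ALLOWED`, and `transition` are hypothetical, and it assumes that both `building` and `validating` can move to `failed` and that `completed` and `failed` are terminal. Retries that send a ticket back to the builder are not modelled here.

```typescript
type TicketState = 'pending' | 'building' | 'validating' | 'completed' | 'failed';

// Assumption: both active states can fail; terminal states have no exits.
const ALLOWED: Record<TicketState, readonly TicketState[]> = {
  pending: ['building'],
  building: ['validating', 'failed'],
  validating: ['completed', 'failed'],
  completed: [],
  failed: [],
};

// Pure table lookup: the same inputs always give the same answer.
function transition(from: TicketState, to: TicketState): TicketState {
  if (!ALLOWED[from].includes(to)) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```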

Work First, Bureaucracy Second

Most important lesson from production incidents: Never let external system failures block work completion.

typescript
// ✅ GOOD - Complete work regardless
try {
  await jiraClient.updateTicketStatus(id, 'Done');
} catch (error) {
  logger.warn('Jira failed, but work complete - continuing');
  // Work completion continues
}

// ❌ BAD - Block on external system
await jiraClient.updateTicketStatus(id, 'Done'); // If this fails, ticket fails

Apply to: Jira, GitHub, Slack, email - NOT to core validation (builder/validator must pass).
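
The try/catch pattern can be factored into a small wrapper so every non-critical integration is handled the same way. `bestEffort` below is a hypothetical helper, shown only as a sketch; the injectable `warn` callback is an assumption made so the example does not depend on any particular logger.

```typescript
// Hypothetical helper: runs a non-critical side effect and never rethrows.
// Returns true on success and false on failure so callers can record the outcome.
async function bestEffort(
  label: string,
  action: () => Promise<void>,
  warn: (message: string) => void = console.warn,
): Promise<boolean> {
  try {
    await action();
    return true;
  } catch (error) {
    const detail = error instanceof Error ? error.message : String(error);
    warn(`${label} failed, but work is complete - continuing: ${detail}`);
    return false;
  }
}

// Usage sketch (jiraClient and id come from the example above):
// await bestEffort('Jira status update', () => jiraClient.updateTicketStatus(id, 'Done'));
```

Core validation steps should not be wrapped this way, since their failures are meant to stop the ticket.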

Cost Optimization

Use appropriate models for each task:

  • Builder: Claude Sonnet 4.5 ($3 input / $15 output per 1M tokens) - Needs quality
  • Validator: Claude Haiku ($0.25 input / $1.25 output per 1M tokens) - Roughly 90% cheaper, sufficient for validation
  • Test Runner: Claude Haiku - Cost-optimized

Typical costs:

  • Simple ticket: $0.50 - $2.00
  • Complex ticket: $2.00 - $5.00
  • Multi-retry: $5.00 - $10.00

Budget limits prevent runaway costs.
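
The arithmetic behind these figures, and a budget check before each model call, can be sketched as follows. The prices are the ones quoted above; the names `ModelPrice`, `PRICES`, `estimateCostUsd`, and `assertWithinBudget` are hypothetical, and any token counts or limits passed in are illustrative rather than measured values.

```typescript
interface ModelPrice {
  inputPerMTok: number;  // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
}

// Prices as quoted in the list above.
const PRICES: Record<'builder' | 'validator', ModelPrice> = {
  builder: { inputPerMTok: 3, outputPerMTok: 15 },
  validator: { inputPerMTok: 0.25, outputPerMTok: 1.25 },
};

// Cost of a single call in USD.
function estimateCostUsd(price: ModelPrice, inputTokens: number, outputTokens: number): number {
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1_000_000;
}

// Deterministic guard: refuse the next call if it would exceed the ticket's limit.
function assertWithinBudget(spentUsd: number, nextCallUsd: number, limitUsd: number): void {
  if (spentUsd + nextCallUsd > limitUsd) {
    throw new Error(
      `Budget exceeded: $${(spentUsd + nextCallUsd).toFixed(2)} > limit $${limitUsd.toFixed(2)}`,
    );
  }
}
```

For the same token counts, the validator price works out to one twelfth of the builder price, which is where the "roughly 90% cheaper" figure comes from.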

Self-Healing

Systems should recover automatically:

Validator Retries: If Reality Validator finds issues, automatically retry with corrections (up to 2 attempts).

typescript
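const MAX_VALIDATOR_RETRIES = 2;
let validatorRetryCount = 0;
let report = await validator.run();

// Send corrections to the builder, then validate again, at most twice
while (
  report.recommendation === 'needs_rework' &&
  validatorRetryCount < MAX_VALIDATOR_RETRIES
) {
  const correctionPrompt = generateCorrections(report);
  await builder.resume(correctionPrompt);
  validatorRetryCount++;
  report = await validator.run(); // Re-validate the corrected work
}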

Jira Fallback: Continue even if Jira fails (work first, bureaucracy second).

See Self-Healing Documentation for details.

Agent Context Enhancement

When agents fail repeatedly on the same issue, add warnings to the agent context:

Example: Path confusion warnings

typescript
const agentContext = `
  CRITICAL: Verify working directory before changes!

  Common mistake: Working in deleted monorepo instead of production.
  ✓ Check: /apps/zeron-feedback-service/ (production)
  ✗ Avoid: /apps/zeron/feedback-service/ (deleted)

  STOP if you're in the wrong directory.
`;

Prevents repeat failures by making agents aware of common pitfalls.
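
If warnings are tracked per known pitfall, the context can be assembled mechanically rather than edited by hand. The sketch below assumes a hypothetical `KnownPitfall` record with a failure count and a hypothetical threshold of two failures; neither is taken from the actual implementation.

```typescript
// Hypothetical record of a recurring agent mistake.
interface KnownPitfall {
  id: string;
  failureCount: number;
  warning: string;
}

// Prepends a warning for every pitfall seen at least `minFailures` times.
// The default threshold of 2 is an assumption for illustration.
function buildAgentContext(
  basePrompt: string,
  pitfalls: KnownPitfall[],
  minFailures = 2,
): string {
  const warnings = pitfalls
    .filter((pitfall) => pitfall.failureCount >= minFailures)
    .map((pitfall) => `CRITICAL: ${pitfall.warning}`);

  return warnings.length === 0
    ? basePrompt
    : `${warnings.join('\n')}\n\n${basePrompt}`;
}
```

Keeping the threshold explicit means one-off mistakes do not bloat every agent's prompt.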

Testing Strategy

Evidence-Based Tests

Test actual behavior, not mocked responses:

typescript
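import fs from 'fs-extra'; // pathExists/remove come from fs-extra
import os from 'node:os';
import path from 'node:path';

// ✅ GOOD - Real filesystem
test('validates path', async () => {
  const tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), 'test-'));
  try {
    await fs.writeFile(path.join(tmpDir, 'package.json'), '{}');

    const result = await validatePath(tmpDir);
    expect(result.valid).toBe(true);
  } finally {
    await fs.remove(tmpDir); // Clean up even if the assertion fails
  }
});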

// ❌ BAD - Mocked
test('validates path', async () => {
  jest.spyOn(fs, 'pathExists').mockResolvedValue(true);
  // Not testing real behavior
});

Integration Over Unit

Test workflows end-to-end when possible. Unit tests can miss integration issues.

Key Learnings

  1. Redis is volatile - Back up financial data
  2. External systems fail - Don't block work on them
  3. Agents make mistakes - Build 4 layers of protection
  4. Evidence over claims - Always verify filesystem
  5. Deterministic wins - No AI in orchestration logic
  6. Fresh validators - No shared context with builders
  7. Self-healing works - Retry with corrections before failing


Part of the Zeron Platform | Built with VitePress