Performance
depends on: zero-infra
Every unnecessary byte, every wasted cycle, every avoidable delay erodes trust.
Principles
- Measure first: Profile before optimizing. Real user metrics, not just benchmarks.
- Budget everything: Page weight < 100KB initial. Time to interactive < 2s on 3G. API p95 < 200ms.
- Ship less: The fastest code is code that doesn't execute.
Frontend
Inline critical CSS, defer the rest. Defer non-critical JS. Lazy-load below the fold. Use efficient formats (WebP, WOFF2, brotli). Cache hashed assets immutably, HTML with no-cache.
Backend
Index queries. Avoid N+1. Use connection pooling. Paginate — never return unbounded lists. Cache at every layer. Set concurrency limits. An execution environment is required for the Bash tool and anything but lightweight tasks that fit inside the browser-based Edge-container.
Agent-specific
- Token efficiency: Send only relevant context, not full histories. Summarize before including. Fewer requests and less data transferred also reduces the data exposure surface — performance and privacy share a structural root.
- Stream responses: Users perceive streaming as faster even when total time is the same.
- Connection management: Reuse connections. Pool clients. Set timeouts.
For agents
- Set a performance budget before writing code
- Optimize the critical path — the work that blocks the user
- Edge-container with in-browser Linux virtualization can only handle lightweight tasks. Anything else requires a proper execution environment, on localhost or remote.
- Prefer fewer, larger requests over many small ones
- Cache aggressively, invalidate precisely