
Deployment

A foreveragent is a sovereign artifact: static files, local compute, and optional infrastructure. The design target is durability and independence, inspired by hacka.re.

The bare minimum

Open the files from disk. Double-click index.html. The agent runs in the browser with no server.

Core runtime stack:

  1. Local inference server (Ollama, llama.cpp, LM Studio) for model execution
  2. Edge-containers for isolated tool and runtime execution
  3. Bash tool for controlled command execution when agent workflows need shell primitives
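Step 1 of the stack can be sketched as a plain HTTP call. The endpoint and payload below follow Ollama's `/api/generate` API; the model name is an illustrative assumption.

```javascript
// Sketch: calling a local inference server (Ollama) from the agent.
// Building the request separately keeps it inspectable and testable.
function buildGenerateRequest(model, prompt) {
  return {
    url: "http://localhost:11434/api/generate", // Ollama's default port
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, prompt, stream: false }),
    },
  };
}

async function localGenerate(prompt, model = "llama3.2") {
  const { url, options } = buildGenerateRequest(model, prompt);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`inference server returned ${res.status}`);
  const data = await res.json();
  return data.response; // Ollama puts the completion text in `response`
}
```

With a local Ollama running, `localGenerate("Summarize this context")` returns a completion without any remote dependency; swapping the URL targets llama.cpp's or LM Studio's OpenAI-compatible endpoints instead.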

In-browser inference (WebLLM) and in-browser execution (WASM) are optional graceful downgrades for zero-server deployments.

The file:// origin is constrained: fetch() is blocked by CORS even for sibling files, ES modules won't load, and secure-context APIs like service workers are unavailable. For anything beyond a single-page self-contained agent, use a local server.

Local servers

    # From the repo
    npx serve .

    # Or Python
    python3 -m http.server 8000

    # Or Go
    foreveragents --http :8080

Any static file server works. These solve file:// CORS limits and give you a proper http://localhost origin.
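"Any static file server" includes about twenty lines of Node stdlib. This is a minimal sketch, not a hardened server; the path sanitization and MIME table are deliberately small.

```javascript
// Sketch: a minimal static file server over the current directory.
import { createServer } from "node:http";
import { readFile } from "node:fs/promises";
import { extname, join, normalize } from "node:path";

const TYPES = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".css": "text/css",
  ".md": "text/markdown",
};

function createStaticServer(root = process.cwd()) {
  return createServer(async (req, res) => {
    // Map "/" to index.html and strip leading ".." so requests can't escape root.
    const path = normalize(decodeURIComponent(req.url.split("?")[0]))
      .replace(/^(\.\.(\/|\\|$))+/, "");
    const file = join(root, path === "/" || path === "\\" ? "index.html" : path);
    try {
      const body = await readFile(file);
      res.writeHead(200, {
        "Content-Type": TYPES[extname(file)] ?? "application/octet-stream",
      });
      res.end(body);
    } catch {
      res.writeHead(404);
      res.end("not found");
    }
  });
}

// createStaticServer().listen(8000);  // http://localhost:8000
```

The point is that the hosting layer is fungible: the agent artifact doesn't care which of these serves it.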

Shareable encrypted agents

Use URL-fragment encryption for portable agent packs: the agent state is encrypted client-side and carried after the `#` in the share URL, and browsers never send the fragment to any server.

This preserves portability while maintaining client-side secrecy by default.

GitHub Pages

Push the repo. Enable Pages on the main branch. Done.

Large files are the constraint — GitHub Pages has a soft 1GB limit per repo and individual files should stay under 100MB. For agents that bundle model weights or large assets, this matters.

Solution: keep large files out of the repo. Host weights on Hugging Face or a CDN. Reference them by URL. The agent fetches them at runtime.
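A sketch of that pattern: the repo ships only a small manifest, and weights resolve to external URLs at runtime. The manifest shape and the Hugging Face URL below are illustrative assumptions.

```javascript
// Sketch: reference weights by URL; fetch (and cache) them at runtime.
const WEIGHT_MANIFEST = {
  // Hypothetical repo path, shown only to illustrate the URL layout.
  base: "https://huggingface.co/some-org/some-model/resolve/main/",
  files: ["model.q4.gguf"],
};

function weightUrls(manifest) {
  return manifest.files.map((f) => new URL(f, manifest.base).href);
}

async function fetchWeights(manifest) {
  // In browsers, prefer the Cache API so reloads don't re-download gigabytes.
  const cache = "caches" in globalThis ? await caches.open("weights-v1") : null;
  const buffers = [];
  for (const url of weightUrls(manifest)) {
    let res = cache && (await cache.match(url));
    if (!res) {
      res = await fetch(url);
      if (cache && res.ok) await cache.put(url, res.clone());
    }
    buffers.push(await res.arrayBuffer());
  }
  return buffers;
}
```

The repo stays well under GitHub's limits while the agent still gets its weights on first run.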

Cloudflare Pages

Connect the repo. Cloudflare deploys on push with global CDN and automatic HTTPS.

Cloudflare also solves the GitHub Pages large-file problem. Workers can proxy/redirect external model assets, and R2 can store large weights.
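A Worker serving weights from R2 can be sketched in a few lines. The bucket binding name (`WEIGHTS`) and the key-per-pathname layout are assumptions for illustration.

```javascript
// Sketch: a Cloudflare Worker that streams model weights from an R2 bucket,
// keeping large files out of the Pages repo entirely.
const worker = {
  async fetch(request, env) {
    const key = new URL(request.url).pathname.slice(1); // "/model.gguf" → "model.gguf"
    const object = await env.WEIGHTS.get(key);          // R2 bucket binding
    if (!object) return new Response("not found", { status: 404 });
    return new Response(object.body, {
      headers: {
        "Content-Type": "application/octet-stream",
        // Weights are immutable blobs; let the CDN cache them for a year.
        "Cache-Control": "public, max-age=31536000, immutable",
      },
    });
  },
};

export default worker;
```

The static agent on Pages and the weight store behind the Worker then deploy together on push.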

IoT and edge

The static-content constraint is a feature. Hardware under $10 can serve a foreveragent over WiFi with HTML, CSS, JS, and markdown contexts.

For connectivity where WiFi doesn't reach, LoRaWAN can relay low-rate tokens while UI and context stay local.

Edge-containers extend this model: the same static artifact gains isolated tool and runtime execution next to the hardware that serves it.

Why this architecture lasts

The server is optional, not central. A Forever Agent can be:

  - opened directly from disk
  - served by any static file server
  - pushed to GitHub Pages or Cloudflare Pages
  - hosted on sub-$10 edge hardware

The artifact survives hosting changes because runtime, state, and behavior are designed to travel. The absence of a required server is itself a verifiable privacy property — proof of architecture, not a promise in a terms-of-service.

For agents choosing a deployment

  1. Start with a local inference server (Ollama, llama.cpp) during development
  2. Verify local inference works before adding remote or in-browser dependencies
  3. Add edge-containers for tasks that need isolated execution
  4. Expose Bash execution as a guarded tool, not unrestricted shell access
  5. Consider in-browser inference (WebLLM) as a graceful downgrade for zero-server deployments
  6. Use encrypted URL fragments and QR exports for portable agent state
  7. Use Cloudflare/GitHub Pages for global static delivery when needed
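Step 4 above deserves a concrete shape. This is a minimal sketch of a guarded shell tool; the allowlist contents are assumptions, and `execFile` (no shell interpolation) is the key design choice.

```javascript
// Sketch: expose shell access as a guarded tool, not unrestricted shell.
import { execFile } from "node:child_process";

// Illustrative allowlist: each command maps to its permitted arguments.
const ALLOWED = new Map([
  ["ls", ["-l", "-a"]],
  ["git", ["status", "log", "diff"]],
]);

function runGuarded(cmd, args = []) {
  const allowedArgs = ALLOWED.get(cmd);
  if (!allowedArgs) throw new Error(`command not allowed: ${cmd}`);
  for (const a of args) {
    if (!allowedArgs.includes(a)) throw new Error(`argument not allowed: ${a}`);
  }
  // execFile passes args directly, so there is no shell to inject into.
  return new Promise((resolve, reject) => {
    execFile(cmd, args, { timeout: 5_000 }, (err, stdout, stderr) =>
      err ? reject(err) : resolve({ stdout, stderr }));
  });
}
```

The agent's tool-calling layer invokes `runGuarded("git", ["status"])` and never sees a raw shell; anything off the allowlist fails before a process is spawned.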
