Deployments and runs
AO has two core concepts: deployments and runs. Understanding the difference is key. A deployment is an immutable, versioned snapshot of your agent. When you run ao deploy, AO takes your code, builds a Docker image from it, and registers that image as a deployment. The deployment doesn’t execute anything; it’s just your agent packaged and ready to run. Every time you deploy, a new version is created. You can have multiple deployments and roll back to any previous one from the dashboard.
A run is a single execution of a deployment. When you trigger ao run, AO takes a deployment, spins up a fresh container from its image, passes your input in, and waits for the container to finish. Every run is isolated: it starts clean, with no shared memory or state from previous runs.
What happens when you deploy
When you run ao deploy:
- Your project directory is zipped and uploaded to AO
- AO reads your ao.toml to get the entrypoint, retries, and timeout config
- A Docker image is built from your code, your requirements.txt is installed, and ao_runner is added on top of your entrypoint
- The image is stored and the deployment is registered with status building, then succeeded once the build finishes
- If you set a cron in ao.toml, AO automatically schedules recurring runs using that pattern
ao_runner is a lightweight wrapper AO injects at build time. It sits between the container and your agent code and handles passing input, capturing logs, and reporting status back to AO. Your agent code doesn’t need to know about any of this.
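To make the wrapper idea concrete, here is a minimal sketch of a process wrapper that launches a command, captures its output line by line, and turns the exit code into a status. It is not how ao_runner is implemented; the run_wrapped function, the [log] prefix, and the status strings are hypothetical stand-ins chosen for this example.

```python
import subprocess
import sys


def run_wrapped(command: list) -> tuple:
    """Run a command, echo its output line by line, return (exit code, lines)."""
    lines = []
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
    )
    for line in process.stdout:
        line = line.rstrip("\n")
        lines.append(line)
        print(f"[log] {line}")  # stand-in for shipping the line to a log store
    exit_code = process.wait()
    return exit_code, lines


# Stand-in for an agent entrypoint: prints one line and exits successfully.
exit_code, lines = run_wrapped([sys.executable, "-c", "print('agent finished')"])
status = "succeeded" if exit_code == 0 else "failed"
print(status)
```

Because the wrapper owns the child process, the agent itself only has to print and exit; everything else happens around it.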
What happens when you run an agent
When you trigger a run:
- AO creates a job and adds it to a durable queue backed by Redis
- A worker picks up the job and spins up a Docker container from your deployment image
- Your input is passed in as the AGENT_INPUT environment variable, and ao_runner injects it into the agent
- The container runs your agent and AO waits for it to finish
- Exit code and status are captured and stored, visible in the dashboard
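From the agent’s side, the steps above come down to reading an environment variable and returning an exit code. The sketch below assumes the input arrives as a JSON object in AGENT_INPUT; the JSON encoding, the name field, and the read_input and main helpers are assumptions for illustration, since the format of the input is not specified here.

```python
import json
import os


def read_input() -> dict:
    """Read the run input from AGENT_INPUT, assuming it holds a JSON object."""
    raw = os.environ.get("AGENT_INPUT", "{}")
    return json.loads(raw)


def main() -> int:
    payload = read_input()
    # Hypothetical agent logic: greet whoever the input names.
    name = payload.get("name", "world")
    print(f"hello, {name}")
    return 0  # zero signals success to whatever launched the process


# A real entrypoint would typically end with sys.exit(main()) so that the
# return value becomes the container's exit code.
result = main()
```

An uncaught exception in main would produce a non-zero exit code, which is the kind of failure the captured exit code and status would reflect.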
How retries work
If your agent fails (an exception, a timeout, an LLM rate limit), AO retries automatically. Retries use exponential backoff: the first retry waits 1 second, the second waits 2 seconds, the third waits 4 seconds, and so on. This prevents hammering an upstream API that’s temporarily down. Each retry gets two extra pieces of context injected as environment variables:
- AO_ATTEMPT: the current attempt number, starting at 1
- AO_LAST_ERROR: the error message from the previous attempt
Once all retries are exhausted, the run is marked failed and the final error is logged.
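The backoff schedule described in this section doubles the wait on each retry, which can be written as delay = base × 2^(n − 1) for the n-th retry. The sketch below reproduces that arithmetic; the backoff_delay function and its base_seconds parameter are illustrative names, and whether AO applies a cap or jitter on top is not stated here.

```python
def backoff_delay(retry_number: int, base_seconds: float = 1.0) -> float:
    """Seconds to wait before the given retry (1 = first retry)."""
    if retry_number < 1:
        raise ValueError("retry_number starts at 1")
    return base_seconds * 2 ** (retry_number - 1)


schedule = [backoff_delay(n) for n in range(1, 6)]
print(schedule)  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Five retries under this schedule add up to 31 seconds of waiting in total, which is worth keeping in mind when choosing a retry count.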
How state works
AO doesn’t impose a state model on your agent. Your code runs in a container and can use whatever storage it needs (Postgres, Redis, a file, anything). What AO does is make the infrastructure around state reliable. Specifically:
- Each run gets a fresh isolated container, so there’s no accidental state leakage between runs
- If a run fails mid-execution and retries, AO_LAST_ERROR tells your agent what went wrong so it can resume from a known point
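One way an agent might use the retry context to resume is to record progress in its own storage and skip finished work on later attempts. In the sketch below, the store dictionary is a stand-in for external storage such as Postgres or Redis, and the attempt_context and process_items helpers are hypothetical; only the AO_ATTEMPT and AO_LAST_ERROR variable names come from the text above.

```python
import os
from typing import Optional, Tuple


def attempt_context() -> Tuple[int, Optional[str]]:
    """Return (attempt number, previous error) from the retry variables."""
    attempt = int(os.environ.get("AO_ATTEMPT", "1"))
    last_error = os.environ.get("AO_LAST_ERROR") or None
    return attempt, last_error


def process_items(items: list, store: dict) -> list:
    """Process items in order, skipping any a previous attempt finished."""
    attempt, last_error = attempt_context()
    if attempt > 1:
        print(f"attempt {attempt}, previous error: {last_error}")
    done = store.setdefault("done", [])
    for item in items:
        if item in done:
            continue  # already finished by an earlier attempt
        # Hypothetical unit of work; a real agent might call an LLM or API here.
        done.append(item)  # record progress only after the work succeeds
    return done
```

Because the progress record lives outside the container in a real setup, it survives the failed attempt even though the container itself does not.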
What you can see in the dashboard
Every deployment and run is tracked. From the dashboard you can:
- Roll back to any previous successful deployment with one click
- View run logs — stdout and stderr streamed line by line from the container
- Track retries — see how many attempts a run made and what error triggered each retry