Unified Invoke API

April 14, 2026

Guide updated 2026-04-28

The original version of this guide did not mention the prompt configuration fetch endpoint and lacked several minor endpoints. This update covers them all.

Deprecation deadline: May 31, 2026

Every legacy REST endpoint covered in this guide will stop responding on May 31, 2026. Until then, an adapter layer keeps the old endpoints working. Update your integration before that date.

Overview

v0.96.0 restructures the REST API. Two changes affect most users who integrate with Agenta for prompt management:

Invoking a deployed prompt moves to a single unified endpoint: POST /services/{service}/v0/invoke, replacing /run, /generate, /generate_deployed, and /test.
Fetching a prompt configuration moves to POST /applications/revisions/retrieve, replacing POST /variants/configs/fetch.

The release also restructures the rest of the REST CRUD surface (apps, variants, environments, traces, annotations, invocations). If you call any of those directly, see the complete list of affected endpoints at the end of this guide.

SDK users don't need to migrate

If you only call Agenta through the Python SDK (agenta.ConfigManager, agenta.VariantManager, agenta.DeploymentManager), you don't need to change anything. The SDK absorbs every change in this release.

Invoking a deployed prompt

All invocations now go through one endpoint:

POST /services/{service}/v0/invoke

This replaces four legacy endpoints:

POST /services/{service}/generate (draft testing)
POST /services/{service}/test (draft testing)
POST /services/{service}/generate_deployed (deployed config)
POST /services/{service}/run (deployed config)
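If your integration stores legacy invocation URLs in configuration, a small helper can rewrite them. The following is a minimal sketch, not part of Agenta: the function name migrate_invoke_url is hypothetical, and it only assumes the four legacy suffixes listed above and the removal of the ?application_id= query parameter described later in this guide.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlsplit, urlunsplit

# Legacy path suffixes that this guide lists as replaced by /v0/invoke.
LEGACY_SUFFIXES = ("/generate_deployed", "/generate", "/test", "/run")


def migrate_invoke_url(legacy_url: str) -> Tuple[str, Optional[str]]:
    """Return (new_url, application_id) for a legacy invocation URL.

    application_id is the value of the removed ?application_id= query
    parameter, or None when the legacy URL did not carry one.
    """
    parts = urlsplit(legacy_url)
    path = parts.path.rstrip("/")
    for suffix in LEGACY_SUFFIXES:
        if path.endswith(suffix):
            path = path[: -len(suffix)] + "/v0/invoke"
            break
    else:
        raise ValueError(f"not a legacy invocation URL: {legacy_url}")

    # The query string is dropped; the caller is expected to pass the ID
    # in references.application.id of the request body instead.
    app_ids = parse_qs(parts.query).get("application_id")
    application_id = app_ids[0] if app_ids else None
    new_url = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return new_url, application_id
```

The helper raises on URLs it does not recognize rather than guessing, so an already-migrated URL is not rewritten twice.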

New request format

The most common case is calling a prompt deployed to an environment with generate_deployed. Here is how that changes end-to-end.

Before (POST /services/completion/generate_deployed):

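import os

import requests

# Read the API key from the environment (the curl examples in this guide
# use the same AGENTA_API_KEY variable).
api_key = os.environ["AGENTA_API_KEY"]

response = requests.post(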
    "https://cloud.agenta.ai/services/completion/generate_deployed",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"ApiKey {api_key}",
    },
    json={
        "environment": "production",
        "app": "my-app",
        "inputs": {"country": "France"},
    },
)

result = response.json()
print(result["data"])     # "The capital of France is Paris."
print(result["tree_id"])  # trace ID for observability

After (POST /services/completion/v0/invoke):

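import os

import requests

# Read the API key from the environment (the curl examples in this guide
# use the same AGENTA_API_KEY variable).
api_key = os.environ["AGENTA_API_KEY"]

response = requests.post(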
    "https://cloud.agenta.ai/services/completion/v0/invoke",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"ApiKey {api_key}",
    },
    json={
        "data": {
            "inputs": {"country": "France"},
        },
        "references": {
            "application": {"slug": "my-app"},
            "environment": {"slug": "production"},
        },
    },
)

result = response.json()
print(result["data"]["outputs"])  # "The capital of France is Paris."
print(result["trace_id"])         # trace ID for observability

The new request body has two top-level sections. data carries your inputs and optional configuration parameters. references identifies which application, variant, revision, or environment to target.

The key field-by-field changes are:

inputs moves under data.inputs.
messages moves under data.inputs.messages (for chat applications).
ag_config becomes data.parameters.
The flat targeting fields (app, variant_slug, variant_version, environment) become structured entries under references.
The ?application_id= query parameter is removed. Pass the ID in references.application.id instead.
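The field moves above can be applied mechanically. Here is a minimal sketch of a converter from a legacy request body to the new layout. The function name to_invoke_body is hypothetical; the reference keys application_variant and application_revision are the ones shown in the field reference table in this guide, and the version is converted to a string as that table shows.

```python
def to_invoke_body(legacy, application_id=None):
    """Convert a legacy invocation body to the data/references layout.

    legacy is the JSON body sent to /run, /generate, /generate_deployed,
    or /test. application_id is the value of the removed
    ?application_id= query parameter, if the old call used one.
    """
    # inputs moves under data.inputs; messages joins it for chat apps.
    inputs = dict(legacy.get("inputs", {}))
    if "messages" in legacy:
        inputs["messages"] = legacy["messages"]

    data = {"inputs": inputs}
    # ag_config becomes data.parameters.
    if "ag_config" in legacy:
        data["parameters"] = legacy["ag_config"]

    # Flat targeting fields become structured entries under references.
    references = {}
    application = {}
    if "app" in legacy:
        application["slug"] = legacy["app"]
    if application_id is not None:
        application["id"] = application_id
    if application:
        references["application"] = application
    if "environment" in legacy:
        references["environment"] = {"slug": legacy["environment"]}
    if "variant_slug" in legacy:
        references["application_variant"] = {"slug": legacy["variant_slug"]}
    if "variant_version" in legacy:
        references["application_revision"] = {
            "version": str(legacy["variant_version"])
        }

    return {"data": data, "references": references}
```

The converter only handles the fields named in this guide. Any other key in a legacy body is ignored, so check your payloads for fields not covered here.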

New response format

Before:

{
  "version": "3.0",
  "data": "The capital of France is Paris.",
  "content_type": "text/plain",
  "tree_id": "0ef1d6b7-84c3-4b8a-705b-ae5974e51954"
}

After:

{
  "version": "2025-07-14",
  "status": {"code": null, "message": null, "stacktrace": null},
  "trace_id": "0ef1d6b7-84c3-4b8a-705b-ae5974e51954",
  "span_id": "a1b2c3d4e5f6",
  "data": {
    "outputs": "The capital of France is Paris."
  }
}

The output value moves from data (a direct string) to data.outputs (nested). The tree_id field becomes trace_id, and we now return a span_id alongside it. The content_type field is removed. A new status object reports the operation code, an optional human-readable message, and a stacktrace when the call fails.
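While you migrate, the same client code may see both response shapes. The sketch below reads the output and trace ID from either one. The function name read_invoke_result is hypothetical, and treating the presence of tree_id as the marker of a legacy response is an assumption based only on the two examples above.

```python
def read_invoke_result(result):
    """Return the output and trace identifiers from either response shape."""
    if "tree_id" in result:
        # Legacy shape: data is the output itself and the trace ID is
        # called tree_id. There is no span_id.
        return {
            "output": result.get("data"),
            "trace_id": result["tree_id"],
            "span_id": None,
        }
    # New shape: the output is nested under data.outputs.
    data = result.get("data") or {}
    return {
        "output": data.get("outputs"),
        "trace_id": result.get("trace_id"),
        "span_id": result.get("span_id"),
    }
```

The sketch does not interpret the status object, because this guide does not list the possible status codes. Inspect status.message and status.stacktrace yourself when a call fails.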

Field reference

The following table maps every old field, endpoint, and response key to its new equivalent.

Old	New
`POST /services/{svc}/run`	`POST /services/{svc}/v0/invoke`
`POST /services/{svc}/generate_deployed`	`POST /services/{svc}/v0/invoke`
`POST /services/{svc}/generate`	`POST /services/{svc}/v0/invoke`
`POST /services/{svc}/test`	`POST /services/{svc}/v0/invoke`
`"inputs": {...}`	`"data": {"inputs": {...}}`
`"messages": [...]`	`"data": {"inputs": {"messages": [...]}}`
`"ag_config": {...}`	`"data": {"parameters": {...}}`
`"environment": "prod"`	`"references": {"environment": {"slug": "prod"}}`
`"app": "my-app"`	`"references": {"application": {"slug": "my-app"}}`
`"variant_slug": "v1"`	`"references": {"application_variant": {"slug": "v1"}}`
`"variant_version": 1`	`"references": {"application_revision": {"version": "1"}}`
Response `"data": "text"`	Response `"data": {"outputs": "text"}`
Response `"tree_id"`	Response `"trace_id"`

More examples: chat applications and draft configurations

Invoking a chat application

Chat applications follow the same pattern. Messages move under data.inputs.messages.

Before:

response = requests.post(
    "https://cloud.agenta.ai/services/chat/run",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"ApiKey {api_key}",
    },
    json={
        "environment": "production",
        "app": "my-chat",
        "inputs": {"context": "Be helpful."},
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)

After:

response = requests.post(
    "https://cloud.agenta.ai/services/chat/v0/invoke",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"ApiKey {api_key}",
    },
    json={
        "data": {
            "inputs": {
                "context": "Be helpful.",
                "messages": [{"role": "user", "content": "Hello!"}],
            },
        },
        "references": {
            "application": {"slug": "my-chat"},
            "environment": {"slug": "production"},
        },
    },
)

Testing with a draft configuration

If you previously sent an inline ag_config to test a draft, that block now lives under data.parameters.

Before:

response = requests.post(
    "https://cloud.agenta.ai/services/completion/generate?application_id=xxx",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"ApiKey {api_key}",
    },
    json={
        "inputs": {"country": "France"},
        "ag_config": {
            "prompt": {
                "messages": [{"role": "user", "content": "Capital of {{country}}?"}],
                "llm_config": {"model": "gpt-4o-mini"},
                "template_format": "curly",
            }
        },
    },
)

After:

response = requests.post(
    "https://cloud.agenta.ai/services/completion/v0/invoke",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"ApiKey {api_key}",
    },
    json={
        "data": {
            "inputs": {"country": "France"},
            "parameters": {
                "prompt": {
                    "messages": [{"role": "user", "content": "Capital of {{country}}?"}],
                    "llm_config": {"model": "gpt-4o-mini"},
                    "template_format": "curly",
                }
            },
        },
        "references": {
            "application": {"id": "xxx"},
        },
    },
)

Fetching a prompt configuration

POST /variants/configs/fetch is deprecated and superseded by POST /applications/revisions/retrieve. The new endpoint mirrors the shape of the invocation endpoint: you pass structured references to identify the variant or environment you want to fetch, and the response carries the configuration parameters together with metadata about the revision.

Fetching by variant and version

Before:

curl -X POST "https://cloud.agenta.ai/api/variants/configs/fetch" \
  -H "Content-Type: application/json" \
  -H "Authorization: ApiKey $AGENTA_API_KEY" \
  -d '{
    "variant_ref": {"slug": "default", "version": 1},
    "application_ref": {"slug": "my-app"}
  }'

After:

curl -X POST "https://cloud.agenta.ai/api/applications/revisions/retrieve" \
  -H "Content-Type: application/json" \
  -H "Authorization: ApiKey $AGENTA_API_KEY" \
  -d '{
    "application_ref": {"slug": "my-app"},
    "application_variant_ref": {"slug": "default"},
    "application_revision_ref": {"version": "1"}
  }'

Note: The version field is now a string, not an integer.

Fetching by environment

To fetch the configuration deployed to an environment, pass application_ref and environment_ref together.

Before:

curl -X POST "https://cloud.agenta.ai/api/variants/configs/fetch" \
  -H "Content-Type: application/json" \
  -H "Authorization: ApiKey $AGENTA_API_KEY" \
  -d '{
    "environment_ref": {"slug": "production"},
    "application_ref": {"slug": "my-app"}
  }'

After:

curl -X POST "https://cloud.agenta.ai/api/applications/revisions/retrieve" \
  -H "Content-Type: application/json" \
  -H "Authorization: ApiKey $AGENTA_API_KEY" \
  -d '{
    "application_ref": {"slug": "my-app"},
    "environment_ref": {"slug": "production"}
  }'
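The two request changes above follow one rule: variant_ref splits into a variant reference and a revision reference, while application_ref and environment_ref keep their names. A minimal sketch of that conversion follows. The function name to_retrieve_body is hypothetical, and it only handles the slug and version fields shown in these examples.

```python
def to_retrieve_body(legacy):
    """Convert a /variants/configs/fetch body to a retrieve body."""
    body = {}
    # application_ref and environment_ref keep their names.
    if "application_ref" in legacy:
        body["application_ref"] = dict(legacy["application_ref"])
    if "environment_ref" in legacy:
        body["environment_ref"] = dict(legacy["environment_ref"])

    # variant_ref splits in two: the slug identifies the variant and the
    # version identifies the revision. The version becomes a string.
    variant_ref = legacy.get("variant_ref") or {}
    if "slug" in variant_ref:
        body["application_variant_ref"] = {"slug": variant_ref["slug"]}
    if variant_ref.get("version") is not None:
        body["application_revision_ref"] = {
            "version": str(variant_ref["version"])
        }
    return body
```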

Response shape

The response shape changed too. The legacy endpoint returned a flat envelope with the configuration under params and reference IDs as siblings. The new endpoint wraps the revision in a query envelope, with the configuration nested under application_revision.data.parameters.

Before:

{
  "params": {
    "prompt": {"...": "..."}
  },
  "url": "https://cloud.agenta.ai/services/completion/v0",
  "application_ref": {
    "slug": "my-app",
    "id": "019ce1cd-ccec-76f1-a803-eb1dc9187d57"
  },
  "variant_ref": {
    "slug": "default",
    "version": 1,
    "id": "019ce1cd-cd98-7270-b51b-550c58e51e41"
  },
  "environment_ref": {
    "slug": "production",
    "version": 1,
    "id": "..."
  }
}

After:

{
  "count": 1,
  "application_revision": {
    "id": "019ce1d0-7920-76a0-84b3-bb5a29750634",
    "slug": "default",
    "name": "default",
    "version": "1",
    "application_id": "019ce1cd-ccec-76f1-a803-eb1dc9187d57",
    "application_variant_id": "019ce1cd-cd98-7270-b51b-550c58e51e41",
    "data": {
      "parameters": {
        "prompt": {"...": "..."}
      },
      "url": "https://cloud.agenta.ai/services/completion/v0",
      "uri": "agenta:builtin:completion:v0",
      "schemas": {"...": "..."}
    },
    "flags": {"is_application": true, "is_chat": false, "...": "..."},
    "created_at": "2026-03-12T11:31:02.045131Z"
  }
}

If you read fields from the response, here is where each one moved.

Old	New
`response["params"]`	`response["application_revision"]["data"]["parameters"]`
`response["params"]["prompt"]`	`response["application_revision"]["data"]["parameters"]["prompt"]`
`response["url"]`	`response["application_revision"]["data"]["url"]`
`response["application_ref"]["id"]`	`response["application_revision"]["application_id"]`
`response["variant_ref"]["id"]`	`response["application_revision"]["application_variant_id"]`
`response["variant_ref"]["version"]`	`response["application_revision"]["version"]`

The new response also exposes the input/output schemas under data.schemas and a richer set of flags, such as is_chat and has_url, which were not available in the legacy response.
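If existing code reads the legacy keys, one option is to flatten the new response once and keep the rest of the code unchanged. The sketch below follows the mapping table above. The function name read_revision and the keys of the returned dictionary are hypothetical.

```python
def read_revision(response):
    """Flatten a /applications/revisions/retrieve response.

    Returns the fields that the legacy fetch response exposed, read from
    their new locations under application_revision.
    """
    revision = response["application_revision"]
    data = revision.get("data") or {}
    return {
        "params": data.get("parameters"),
        "url": data.get("url"),
        "application_id": revision.get("application_id"),
        "variant_id": revision.get("application_variant_id"),
        # Note: this is now a string ("1"), not an integer (1).
        "version": revision.get("version"),
    }
```

The legacy environment_ref block has no entry in the mapping table, so the sketch does not try to reconstruct it.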

Python SDK unchanged

If you fetch configurations through the SDK, you don't need to change anything:

import agenta as ag

ag.init()

config = ag.ConfigManager.get_from_registry(
    app_slug="my-app",
    environment_slug="production",
)

The SDK calls the new endpoint internally and returns the same shape it always has.

All affected endpoints

Below is the complete list of REST endpoints moved in v0.96.0, grouped by domain.

Application invocation

Operation	Old	New
Invoke deployed prompt	`POST /services/{svc}/run`	`POST /services/{svc}/v0/invoke`
Invoke deployed prompt (alias)	`POST /services/{svc}/generate_deployed`	`POST /services/{svc}/v0/invoke`
Test with draft config	`POST /services/{svc}/generate`	`POST /services/{svc}/v0/invoke`
Test with draft config (alias)	`POST /services/{svc}/test`	`POST /services/{svc}/v0/invoke`
Invoke a workflow	`POST /workflows/invoke`	`POST /services/{svc}/v0/invoke`
Inspect a workflow revision	`POST /workflows/inspect`	`POST /applications/revisions/resolve`

Apps CRUD

Operation	Old	New
List apps	`GET /apps`	`POST /simple/applications/query`
Create app	`POST /apps`	`POST /simple/applications/`
Fetch app	`GET /apps/{id}`	`GET /applications/{id}`
Update app	`PATCH /apps/{id}`	`PUT /applications/{id}`
Delete app	`DELETE /apps/{id}`	`POST /applications/{id}/archive`
List variants of an app	`GET /apps/{id}/variants`	`POST /applications/variants/query`
List environments of an app	`GET /apps/{id}/environments`	`POST /preview/environments/query`
Get deployed revision for an environment	`GET /apps/{id}/revisions/{environment_name}`	`POST /applications/revisions/retrieve`
Get variant deployed to an environment	`GET /apps/get_variant_by_env`	`POST /applications/revisions/retrieve`
Create variant from template	`POST /apps/{id}/variant/from-template`	`GET /applications/catalog/templates/` then `POST /applications/variants/`
Create variant from service URL	`POST /apps/{id}/variant/from-service`	`POST /applications/variants/`

Variants CRUD

Operation	Old	New
Fetch a variant	`GET /variants/{id}`	`GET /applications/variants/{id}`
Delete a variant	`DELETE /variants/{id}`	`POST /applications/variants/{id}/archive`
List variant revisions	`GET /variants/{id}/revisions`	`POST /applications/revisions/log`
Fetch a variant revision	`GET /variants/{id}/revisions/{revision_number}`	`POST /applications/revisions/retrieve`
Delete a variant revision	`DELETE /variants/{id}/revisions/{revision_id}`	`POST /applications/revisions/{id}/archive`
Update variant parameters	`PUT /variants/{id}/parameters`	`POST /applications/revisions/commit`
Update variant service URL	`PUT /variants/{id}/service`	`PUT /applications/variants/{id}`
Fork a variant from a base	`POST /variants/from-base`	`POST /applications/variants/fork`
Query variant revisions	`POST /variants/revisions/query`	`POST /applications/revisions/query`

Variant configurations

The git-style operations on variant configurations (commit, fork, history) are now exposed under the /applications/revisions/* namespace.

Operation	Old	New
Fetch a configuration	`POST /variants/configs/fetch`	`POST /applications/revisions/retrieve`
Create a variant with initial config	`POST /variants/configs/add`	`POST /applications/variants/`
Commit a new configuration revision	`POST /variants/configs/commit`	`POST /applications/revisions/commit`
Delete a variant	`POST /variants/configs/delete`	`POST /applications/variants/{id}/archive`
Deploy a configuration to an environment	`POST /variants/configs/deploy`	`POST /applications/revisions/deploy`
Fork a variant	`POST /variants/configs/fork`	`POST /applications/variants/fork`
Get the revision history of a variant	`POST /variants/configs/history`	`POST /applications/revisions/log`
List variants of an app	`POST /variants/configs/list`	`POST /applications/variants/query`
Query variants with filters	`POST /variants/configs/query`	`POST /applications/variants/query`

Environments and deployments

Deployment is now an action on a revision

In v0.95 you deployed by calling POST /environments/deploy with a variant ID and an environment name. The environment was the actor; the deployment was a side-effect on the environment.

In v0.96 you deploy by calling POST /applications/revisions/deploy with a revision reference and a target environment reference. The revision is the actor; the environment is just where you point it.

The legacy /configs/* endpoints exposed environment-side configuration directly. They split into two new namespaces: /applications/revisions/* for the revision lookup itself, and /preview/environments/revisions/* for the environment's deployment history (when, who, and which revision was deployed at each point).

Operation	Old	New
Deploy to environment	`POST /environments/deploy`	`POST /applications/revisions/deploy`
List configurations	`GET /configs`	`POST /applications/revisions/query`
Fetch a deployed configuration	`GET /configs/deployment/{revision_id}`	`POST /preview/environments/revisions/retrieve`
Revert a deployment	`POST /configs/deployment/{revision_id}/revert`	`POST /preview/environments/revisions/commit`

Containers

Operation	Old	New
List built-in templates	`GET /containers/templates`	`GET /applications/catalog/templates/`

Annotations

Annotations are observability records you attach to a trace or span to capture things like a human rating, an evaluator score, or a correction. They previously lived under /annotations/* and /preview/annotations/*. Both namespaces were removed. Annotations now share the unified /simple/traces/* API with other observability records.

Operation	Old	New
Create an annotation	`POST /annotations/`	`POST /simple/traces/`
Query annotations	`POST /annotations/query`	`POST /simple/traces/query`
Fetch an annotation by trace	`GET /annotations/{trace_id}`	`GET /simple/traces/{trace_id}`
Fetch an annotation by trace and span	`GET /annotations/{trace_id}/{span_id}`	`GET /simple/traces/{trace_id}`
Update an annotation by trace	`PATCH /annotations/{trace_id}`	`PATCH /simple/traces/{trace_id}`
Update an annotation by trace and span	`PATCH /annotations/{trace_id}/{span_id}`	`PATCH /simple/traces/{trace_id}`
Delete an annotation by trace	`DELETE /annotations/{trace_id}`	`DELETE /simple/traces/{trace_id}`
Delete an annotation by trace and span	`DELETE /annotations/{trace_id}/{span_id}`	`DELETE /simple/traces/{trace_id}`
Create an annotation (preview)	`POST /preview/annotations/`	`POST /simple/traces/`
Query annotations (preview)	`POST /preview/annotations/query`	`POST /simple/traces/query`
Fetch an annotation (preview)	`GET /preview/annotations/{trace_id}/{span_id}`	`GET /simple/traces/{trace_id}`
Update an annotation (preview)	`PATCH /preview/annotations/{trace_id}/{span_id}`	`PATCH /simple/traces/{trace_id}`
Delete an annotation (preview)	`DELETE /preview/annotations/{trace_id}/{span_id}`	`DELETE /simple/traces/{trace_id}`

Invocations

Invocations are observability records that capture a single LLM call: the input, the output, latency, cost, and the model used. They previously lived under /invocations/*. They now share the unified /simple/traces/* API with annotations, distinguished by the kind field on each record.

Operation	Old	New
Create an invocation	`POST /invocations/`	`POST /simple/traces/`
Query invocations	`POST /invocations/query`	`POST /simple/traces/query`
Fetch an invocation by trace	`GET /invocations/{trace_id}`	`GET /simple/traces/{trace_id}`
Fetch an invocation by trace and span	`GET /invocations/{trace_id}/{span_id}`	`GET /simple/traces/{trace_id}`
Update an invocation by trace	`PATCH /invocations/{trace_id}`	`PATCH /simple/traces/{trace_id}`
Update an invocation by trace and span	`PATCH /invocations/{trace_id}/{span_id}`	`PATCH /simple/traces/{trace_id}`
Delete an invocation by trace	`DELETE /invocations/{trace_id}`	`DELETE /simple/traces/{trace_id}`
Delete an invocation by trace and span	`DELETE /invocations/{trace_id}/{span_id}`	`DELETE /simple/traces/{trace_id}`

Spans

Operation	Old	New
Create a span	`POST /preview/spans/`	`POST /simple/traces/`
Ingest spans in bulk	`POST /preview/spans/ingest`	OTLP ingestion endpoint (see Observability docs)
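To find call sites that still use legacy paths, you can scan your source files for them. The following is a minimal sketch: the function name find_legacy_calls is hypothetical, and the pattern table covers only a handful of the endpoints listed above, so extend it with the rows that matter to your integration.

```python
import re

# A subset of the legacy paths from the tables above, mapped to the
# replacement this guide lists for them. Extend as needed.
LEGACY_PATTERNS = {
    r"/services/[^/\s]+/(?:run|generate_deployed|generate|test)\b":
        "POST /services/{svc}/v0/invoke",
    r"/variants/configs/fetch\b": "POST /applications/revisions/retrieve",
    r"/environments/deploy\b": "POST /applications/revisions/deploy",
    r"/(?:preview/)?annotations/": "/simple/traces/*",
    r"/invocations/": "/simple/traces/*",
}


def find_legacy_calls(source):
    """Return (line_number, matched_path, replacement) for each legacy hit."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for pattern, replacement in LEGACY_PATTERNS.items():
            match = re.search(pattern, line)
            if match:
                hits.append((line_no, match.group(0), replacement))
    return hits
```

A text search like this can report false positives, for example an unrelated path that happens to contain /invocations/, so treat each hit as a prompt to review the call rather than as a definitive finding.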

Need help?

If you have questions about migrating, reach out on Slack or reply to the migration email. We are happy to help.
