Roadmap
What we have shipped, what we are building now, and what we plan to build next.
Last Shipped
Programmatic Evaluation through the SDK
11/11/2025
Evaluation
Run evaluations directly from code with full control over test data and evaluation logic. Evaluate agents built with any framework and view the results in the Agenta dashboard.
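A framework-agnostic evaluation loop of this kind might look like the sketch below. The agent, test data, and scoring function are illustrative placeholders, not Agenta's actual SDK interface:

```python
# Minimal sketch of a programmatic evaluation loop: bring your own
# agent, test data, and evaluation logic. Everything here is a
# stand-in, not Agenta's real SDK API.

def my_agent(question: str) -> str:
    # Placeholder for an agent built with any framework.
    return "Paris" if "capital of France" in question else "unknown"

def exact_match(expected: str, actual: str) -> float:
    # Custom evaluation logic: 1.0 on an exact match, else 0.0.
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0

test_data = [
    {"input": "What is the capital of France?", "expected": "Paris"},
    {"input": "What is the capital of Atlantis?", "expected": "unknown"},
]

results = [
    exact_match(case["expected"], my_agent(case["input"]))
    for case in test_data
]
average_score = sum(results) / len(results)
print(average_score)  # 1.0 for this toy agent
```

In the real SDK the loop's results would be pushed to the Agenta dashboard rather than printed.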
Online Evaluation
11/11/2025
Evaluation
Automatically evaluate every request to your LLM application in production. Catch hallucinations and off-brand responses as they happen instead of discovering them through user complaints.
Customize LLM-as-a-Judge Output Schemas
11/10/2025
Evaluation
Configure LLM-as-a-Judge evaluators with custom output schemas. Use binary, multiclass, or custom JSON formats. Enable reasoning for better evaluation quality.
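As an illustration of what such output schemas can look like, here are a binary and a multiclass variant expressed as JSON Schema, with a reasoning field included. The field names ("score", "reasoning") are examples, not a fixed format:

```python
import json

# Illustrative output schemas for an LLM-as-a-Judge evaluator.
binary_schema = {
    "type": "object",
    "properties": {
        "score": {"type": "boolean"},
        "reasoning": {"type": "string"},  # reasoning improves judge quality
    },
    "required": ["score", "reasoning"],
}

multiclass_schema = {
    "type": "object",
    "properties": {
        "score": {"enum": ["good", "borderline", "bad"]},
        "reasoning": {"type": "string"},
    },
    "required": ["score", "reasoning"],
}

# A judge configured with the binary schema returns JSON like this,
# which the caller can parse and sanity-check against the schema:
raw_response = '{"score": true, "reasoning": "The answer matches the reference."}'
verdict = json.loads(raw_response)
assert set(binary_schema["required"]) <= set(verdict)
print(verdict["score"])  # True
```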
Structured Output Support in the Playground
4/15/2025
Playground
Define and validate structured output formats in the playground. Save structured output schemas as part of your prompt configuration.
Vertex AI Provider Support
10/24/2025
Integration, Playground
Use Google Cloud's Vertex AI models including Gemini and partner models in the playground, Model Hub, and through Gateway endpoints.
Filtering Traces by Annotation
10/14/2025
Observability
Filter and search for traces based on their annotations. Find traces with low scores or feedback quickly using the rebuilt filtering system.
New Evaluation Results Dashboard
9/26/2025
Evaluation
Completely redesigned evaluation results dashboard with performance plots, side-by-side comparison, improved testcases view, focused detail view, configuration visibility, and run naming.
In Progress
Folders for Prompt Organization
Playground
Create folders and subfolders to organize prompts in the playground. Move prompts between folders and search within specific folders to structure prompt libraries.
Projects and Workspaces
Misc
Improve organization structure by adding projects. Create projects for different products and scope resources to specific projects.
Jinja2 Template Support in the Playground
Playground
Add Jinja2 template support to enable conditional logic, filters, and template blocks in prompts. The prompt type will be stored in the schema, and the SDK will handle rendering.
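The kind of prompt this enables can be sketched with Jinja2 directly; the variables and template text below are invented for illustration:

```python
from jinja2 import Template

# A prompt template using Jinja2 conditional logic and a filter,
# of the kind this feature would let you store in a prompt config.
prompt_template = Template(
    "You are a helpful assistant."
    "{% if audience == 'expert' %} Use precise technical vocabulary."
    "{% else %} Explain concepts in plain language."
    "{% endif %}"
    " The user's name is {{ name | upper }}."
)

rendered = prompt_template.render(audience="expert", name="ada")
print(rendered)
# You are a helpful assistant. Use precise technical vocabulary. The user's name is ADA.
```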
PDF Support in the Playground
Playground
Add PDF support for models that support it (OpenAI, Gemini, etc.) through base64 encoding, URLs, or file IDs. Support extends to human evaluation for reviewing model responses on PDF inputs.
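Of the three delivery mechanisms, base64 encoding is easy to sketch. The message structure below is illustrative, not any specific provider's exact request format:

```python
import base64

# Encode a PDF's bytes so it can travel inside a JSON request body.
pdf_bytes = b"%PDF-1.4 stand-in for real file contents"
encoded = base64.b64encode(pdf_bytes).decode("ascii")

# Illustrative chat message carrying the encoded file as a data URL.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Summarize this document."},
        {"type": "file", "data": f"data:application/pdf;base64,{encoded}"},
    ],
}

# Base64 round-trips losslessly, so the model provider can recover
# the original bytes exactly.
assert base64.b64decode(encoded) == pdf_bytes
```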
Prompt Snippets
Playground
Create reusable prompt snippets that can be referenced across multiple prompts. Reference specific versions or always use the latest version to maintain consistency across prompt variants.
Date Range Filtering in Metrics Dashboard
Observability
We are adding the ability to filter traces by date range in the metrics dashboard.
Planned
AI-Powered Prompt Refinement in the Playground
Playground
Analyze prompts and suggest improvements based on best practices. Identify issues, propose refined versions, and allow users to accept, modify, or reject suggestions.
Open Observability Spans Directly in the Playground
Playground, Observability
Add a button in observability to open any chat span directly in the playground. Creates a stateless playground session pre-filled with the exact prompt, configuration, and inputs for immediate iteration.
Improving Navigation between Testsets in the Playground
Playground
We are making the playground easier to use and navigate when working with large testsets.
Appending Single Testcases in the Playground
Playground
The playground currently cannot combine testcases from different testsets. We are adding the ability to append a single testcase to a testset.
Improving Testset View
Evaluation
We are reworking the testset view to make it easier to visualize and edit testsets.
Prompt Caching in the SDK
SDK
We are adding the ability to cache prompts in the SDK.
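One way client-side prompt caching could work, sketched with standard-library tools only; fetch_prompt is a hypothetical placeholder, not the real SDK call:

```python
from functools import lru_cache

# Counter standing in for observable network traffic.
CALLS = {"count": 0}

@lru_cache(maxsize=128)
def fetch_prompt(slug: str) -> str:
    # Hypothetical stand-in for a network call that fetches a prompt
    # configuration from the registry; the real SDK call will differ.
    CALLS["count"] += 1
    return f"prompt body for {slug}"

fetch_prompt("summarizer")
fetch_prompt("summarizer")  # served from the cache, no second fetch
print(CALLS["count"])  # 1
```

A real implementation would also need an expiry or invalidation strategy so cached prompts pick up new versions, which `lru_cache` alone does not provide.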
Testset Versioning
Evaluation
We are adding the ability to version testsets. This is useful for correctly comparing evaluation results.
Tagging Traces, Testsets, Evaluations and Prompts
Evaluation
We are adding the ability to tag traces, testsets, evaluations and prompts. This is useful for organizing and filtering your data.
Support for built-in LLM Tools (e.g. web search) in the Playground
Playground
We are adding the ability to use built-in LLM tools (e.g. web search) in the playground.
Feature Requests
Upvote or comment on the features you care about or request a new feature.