Launch Week
See how your prompts perform across all metrics at a glance
Compare prompt versions side by side to spot regressions fast
Debug with complete traces to understand every output
Customize LLM-as-a-judge evaluators with any schema you need
Live view of the reliability of your system in production
Gain confidence that your outputs meet your quality standards
Find edge cases and add them to your test sets to improve your AI system
Clear insight into how prompt changes behave in production
Create or fetch test sets programmatically
Write custom evaluators or use the built-in ones
Evaluate end-to-end or target specific steps
View results in the dashboard
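
A minimal sketch of what that workflow can look like in Python. The function names (`run_app`, `exact_match`, `contains_keyword`) and the inline test set are illustrative placeholders, not the SDK's actual API.

```python
# Illustrative sketch only: names below are hypothetical placeholders
# for the programmatic evaluation workflow described above.

def exact_match(output: str, expected: str) -> float:
    """Custom evaluator: 1.0 if the output matches the reference exactly."""
    return 1.0 if output.strip() == expected.strip() else 0.0

def contains_keyword(output: str, expected: str) -> float:
    """Custom evaluator: checks whether the expected keyword appears."""
    return 1.0 if expected.lower() in output.lower() else 0.0

# 1. Create (or fetch) a test set programmatically.
test_set = [
    {"input": "What is the capital of France?", "expected": "Paris"},
    {"input": "What is 2 + 2?", "expected": "4"},
]

# 2. Run the application (end-to-end, or a single step) under evaluation.
def run_app(question: str) -> str:
    # Placeholder for the pipeline or step being evaluated.
    return "Paris" if "France" in question else "4"

# 3. Score each case with the evaluators.
results = []
for case in test_set:
    output = run_app(case["input"])
    results.append({
        "input": case["input"],
        "output": output,
        "exact_match": exact_match(output, case["expected"]),
        "contains_keyword": contains_keyword(output, case["expected"]),
    })

# 4. In practice, these scores would be pushed to the dashboard for review.
for row in results:
    print(row)
```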
All functional features now open source (MIT license)
Includes evaluation, prompt management, and observability
Development back in the public repo
Use Jinja2 in your prompt templates
Choose the templating syntax when fetching the prompt or using it through the gateway
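
As a reference for the template syntax, here is a plain Jinja2 example rendered locally with the `jinja2` library; the template text and variables are invented for illustration.

```python
# Rendering a Jinja2 prompt template locally; the template content is an
# invented example, but the syntax is standard Jinja2.
from jinja2 import Template

prompt_template = Template(
    "You are a support assistant for {{ product }}.\n"
    "{% if tone == 'formal' %}Answer formally.{% else %}Answer casually.{% endif %}\n"
    "Question: {{ question }}"
)

prompt = prompt_template.render(
    product="Acme Cloud",
    tone="formal",
    question="How do I rotate my API keys?",
)
print(prompt)
```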