Product Updates November 2024 - LLM Observability and Prompt management

In this product update we introduce LLM observability, prompt management, and a new interface to configure LLM-as-a-judge evaluators.

Mahmoud Mabrouk

Nov 26, 2024


5 minutes

It's been a while since our last product update. We've made big changes to Agenta, and we're thrilled to share them with you. In the last months, we've revamped our platform with prompt management, powerful observability tools, new docs and a new look.

Now, without further ado, let's “delve” into the new updates to Agenta.

Prompt management

Managing LLM applications, especially with a team, can be challenging. We’ve built a comprehensive prompt and configuration management system to make it easier. Our new SDK offers all the tools you need for managing prompts—whether you want to create drafts, deploy to staging or production, or roll back to a previous version. Alongside the SDK, our new web UI empowers everyone on the team—from developers to subject matter experts—to:

  • Deploy prompts quickly

  • Roll back to previous versions

  • Compare versions side by side

Get started with prompt management in the docs.
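The versioning workflow above—draft, deploy, roll back—can be sketched as a tiny registry in plain Python. This is an illustration of the concept only, not the Agenta SDK; all class and method names here are hypothetical:

```python
# Minimal prompt-registry sketch: drafts, environment deploys, rollback.
# Illustrative only -- the names are hypothetical, not the Agenta SDK API.

class PromptRegistry:
    def __init__(self):
        self.versions = []       # append-only version history
        self.environments = {}   # environment name -> version index

    def commit(self, prompt: str) -> int:
        """Save a new draft version and return its version number."""
        self.versions.append(prompt)
        return len(self.versions) - 1

    def deploy(self, version: int, env: str) -> None:
        """Point an environment (e.g. 'staging', 'production') at a version."""
        self.environments[env] = version

    def rollback(self, env: str) -> None:
        """Move an environment back one version."""
        self.environments[env] = max(0, self.environments[env] - 1)

    def get(self, env: str) -> str:
        """Fetch the prompt currently deployed to an environment."""
        return self.versions[self.environments[env]]

registry = PromptRegistry()
v0 = registry.commit("Summarize the text below in one sentence.")
v1 = registry.commit("Summarize the text below in three bullet points.")
registry.deploy(v1, "production")
registry.rollback("production")  # production now serves v0 again
```

The real system adds what this sketch omits: persistence, audit history, and a web UI over the same operations.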

OpenTelemetry LLM Observability

Now, with just a few lines of code, you can start capturing all the inputs, outputs, and metadata for your LLM applications—whether they're built with OpenAI, LangChain, or custom workflows.

LLM observability helps AI engineers debug applications, pinpoint root causes, and make informed decisions to improve prompts and architectures. Plus, you can track critical metrics like latency, costs, and performance over time—ensuring you're optimizing resource usage and staying ahead of model drifts.

We built our observability to be OpenTelemetry compliant. That means a robust, non-proprietary SDK with zero vendor lock-in and a ton of integrations out of the box.
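To make the idea concrete, here is a stdlib-only sketch of the kind of span data such instrumentation records per call—inputs, outputs, and latency. The actual SDK is OpenTelemetry-based; this decorator only illustrates the concept, and all names in it are made up:

```python
# Conceptual sketch of LLM-call tracing: capture inputs, outputs, and latency
# for each call. Agenta's real SDK is OpenTelemetry-based; this stdlib-only
# decorator just shows the shape of the data a span would carry.
import functools
import time

TRACE = []  # collected spans; a real exporter would ship these to a backend

def traced(span_name):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            TRACE.append({
                "name": span_name,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": result,
                "latency_ms": (time.perf_counter() - start) * 1000,
            })
            return result
        return wrapper
    return decorator

@traced("generate")
def generate(prompt):
    # Stand-in for a real model call (OpenAI, LangChain, etc.).
    return f"echo: {prompt}"

generate("What is observability?")
```

With an OpenTelemetry-compliant SDK, the same spans can flow to any compatible backend rather than a proprietary one.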

Try this notebook to instrument a LangChain RAG app

Evaluator Playground for LLM-as-a-judge

Configuring evaluators and tests for your application can be tricky, especially for complex evaluators such as LLM-as-a-judge. We've added an evaluator configuration playground where you can test an evaluator's configuration against real data and fine-tune it before running evaluations.
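The test-before-you-evaluate loop looks roughly like this sketch: render a judge prompt from a template, run it over a few real samples, and inspect the verdicts. The judge model is stubbed out here—in practice it is the LLM call you configure in the playground—and all names are illustrative:

```python
# Sketch of trying an LLM-as-a-judge configuration on sample data before a
# full evaluation run. The judge model is a stub; in practice it would be a
# real LLM completion call configured in the playground.
JUDGE_TEMPLATE = (
    "You are a strict grader. Question: {question}\n"
    "Answer: {answer}\nReply with 'correct' or 'incorrect'."
)

def stub_judge_model(prompt: str) -> str:
    # Hypothetical stand-in for the configured judge LLM.
    return "correct" if "Paris" in prompt else "incorrect"

def run_judge(sample: dict) -> str:
    prompt = JUDGE_TEMPLATE.format(**sample)
    return stub_judge_model(prompt)

samples = [
    {"question": "Capital of France?", "answer": "Paris"},
    {"question": "Capital of France?", "answer": "Lyon"},
]
verdicts = [run_judge(s) for s in samples]
# verdicts -> ["correct", "incorrect"]
```

Iterating on the template against known-good and known-bad samples like this catches a miscalibrated judge before it scores your whole test set.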

Check how to set up your LLM-as-a-judge evals quickly

Web-UI redesign

A small update with a big impact. We have refreshed (and continue to refresh) our entire UI and UX to make it more user-friendly and much clearer.

That's it for this update. See you next time!


Fast-tracking LLM apps to production

Need a demo?

We are more than happy to give a free demo

Copyright © 2023-2060 Agentatech UG (haftungsbeschränkt)
