Launch Week #2 Day 4: Open Sourcing Evaluation

We're open-sourcing all functional features of Agenta under the MIT license.

Nov 13, 2025

5 minutes

Evaluation is now open-source

Today we're open-sourcing the core of our product: evaluation.

All functional features of Agenta are now open source under the MIT license. This includes evaluation, prompt management, and observability. We're keeping only advanced enterprise collaboration features (RBAC, SSO, audit logs) under a separate license.

We also moved our development back to the public repository. The open source repo is now our main codebase, not a release mirror.

What This Means

You can now self-host the full Agenta platform with all the features you need to build reliable LLM applications.

What's open source:

  • Complete evaluation system (LLM-as-a-judge, test sets, custom evaluators)

  • Prompt playground and management

  • Observability and traces

  • All core workflows

What stays closed:

  • Enterprise collaboration features (RBAC, SSO/SAML/SCIM)

  • Audit logs and compliance features

  • Advanced org-level governance
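To make the custom-evaluator bullet above concrete, here is a minimal, illustrative sketch of how a deterministic evaluator can score a test set. All names here (`EvalResult`, `exact_match`, `run_test_set`) are hypothetical and are not the actual Agenta API; see the docs for the real interfaces.

```python
# Hypothetical sketch of a custom evaluator run over a test set.
# These names are illustrative only, not Agenta's actual API.
from dataclasses import dataclass


@dataclass
class EvalResult:
    score: float  # 0.0 to 1.0
    reason: str = ""


def exact_match(output: str, expected: str) -> EvalResult:
    """Deterministic evaluator: full score only on an exact string match."""
    ok = output.strip() == expected.strip()
    return EvalResult(score=1.0 if ok else 0.0,
                      reason="match" if ok else "mismatch")


def run_test_set(evaluator, cases):
    """Apply one evaluator to every case and return the mean score."""
    results = [evaluator(c["output"], c["expected"]) for c in cases]
    return sum(r.score for r in results) / len(results)


cases = [
    {"output": "Paris", "expected": "Paris"},
    {"output": "Lyon", "expected": "Paris"},
]
print(run_test_set(exact_match, cases))  # 0.5
```

An LLM-as-a-judge evaluator follows the same shape: replace the string comparison with a call to a judge model that returns a score and a reason.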

Why We Did This

We tried three different open core models over the past two years. Each one taught us something about what works and what doesn't.

The short version: keeping evaluation closed meant we weren't building in public anymore. Contributors disappeared. Rich feedback became rare. Our open source project felt like a demo instead of a community.

We'd rather maximize adoption and community around the core than protect it behind a wall.

You can read the full story of what we tried and why we changed course here.

Get Started

The code is available now on GitHub: https://github.com/agenta-ai/agenta

Self-hosting guide: https://docs.agenta.ai/self-host/quick-start

This is day 4 of our launch week. One more day to go.


Co-Founder Agenta & LLM Engineering Expert
