Compare latency and costs
You can now compare the latency and cost of different variants in the evaluation view.
Toggle variants in comparison view
You can now toggle the visibility of variants in the comparison view, so you can compare several variants side by side.
Improvements
Bug fixes
We have added two new evaluators: a string matching evaluator and a Levenshtein distance evaluator.
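As a rough illustration of what a Levenshtein distance evaluation measures, the sketch below computes the edit distance between a variant's output and the expected output, then compares it to a threshold. The function names and the `threshold` parameter are hypothetical and are not taken from the agenta codebase; the actual evaluator may be configured differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + substitution_cost,  # substitution
                )
            )
        previous = current
    return previous[-1]


def levenshtein_passes(output: str, expected: str, threshold: int) -> bool:
    """Hypothetical pass/fail rule: accept outputs within `threshold` edits."""
    return levenshtein(output, expected) <= threshold
```

A smaller distance means the output is closer to the expected answer, so a threshold of 0 behaves like an exact string match.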
Improvements
Bug fixes
We have improved the evaluation comparison view to show how each output differs from the expected output.
Improvements
Deployment versioning
You now have access to a history of prompts deployed to our three environments. This feature allows you to roll back to previous versions if needed.
Role-Based Access Control
You can now invite team members and assign them fine-grained roles in agenta.
Improvements
Bug fixes
Fixed a bug in custom code evaluation aggregation. Until now, the aggregated results for custom code evaluations were not computed correctly.
Fixed a bug where evaluation results were not exported correctly.
Updated documentation for vision gpt explain images
Improved Frontend test for Evaluations
You can now version prompts, which lets you track changes made by the team and revert to previous versions. To view the change history of the configuration, click on the sign in the playground to access all previous versions.
We have added a new evaluator that matches JSON fields. You can also now use test set columns other than the correct_answer column as the ground truth.
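To make the idea concrete, here is a minimal sketch of a JSON field match that reads its ground truth from a configurable test set column. The function name, its parameters, and the `expected_city` column are assumptions made for illustration; only the `correct_answer` column name comes from the note above, and the real evaluator may behave differently.

```python
import json


def json_field_match(
    output: str,
    row: dict,
    field: str,
    ground_truth_column: str = "correct_answer",
) -> bool:
    """Hypothetical check: does `field` in the JSON output equal the ground truth?"""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        # Output that is not valid JSON cannot match any field.
        return False
    if not isinstance(parsed, dict):
        return False
    return field in parsed and parsed[field] == row[ground_truth_column]


# Example test set row with the default column and an alternative one.
row = {"correct_answer": "Paris", "expected_city": "Lyon"}
default_result = json_field_match('{"city": "Paris"}', row, "city")
other_result = json_field_match(
    '{"city": "Paris"}', row, "city", ground_truth_column="expected_city"
)
```

Passing the column name as a parameter keeps the comparison logic the same while letting each evaluation pick which test set column holds the reference value.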
We have improved error handling in evaluation to return more information about the exact source of the error in the evaluation view.
Improvements: