Customize LLM-as-a-Judge Output Schemas
The LLM-as-a-Judge evaluator now supports custom output schemas, so you can define exactly the feedback structure your evaluations need.
What's New
Flexible Output Types
Configure the evaluator to return different types of outputs:
- Binary: Return a simple yes/no or pass/fail score
- Multiclass: Choose from multiple predefined categories
- Custom JSON: Define any structure that fits your use case
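As a rough illustration of how the three output types differ, the sketch below writes each one as a JSON Schema held in a Python dict. The field names (`score`, `label`, `tone`, `issues`) and the category values are assumptions chosen for the example, not names the evaluator prescribes.

```python
import json

# Hypothetical schemas for the three output types; all field names are illustrative.

# Binary: a single pass/fail value.
BINARY_SCHEMA = {
    "type": "object",
    "properties": {"score": {"type": "boolean"}},
    "required": ["score"],
}

# Multiclass: one label drawn from a fixed set of categories.
MULTICLASS_SCHEMA = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["excellent", "good", "fair", "poor"]},
    },
    "required": ["label"],
}

# Custom JSON: any structure, here a numeric score plus a list of free-text issues.
CUSTOM_SCHEMA = {
    "type": "object",
    "properties": {
        "tone": {"type": "integer", "minimum": 1, "maximum": 5},
        "issues": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["tone"],
}

# Example outputs that would conform to each schema.
EXAMPLE_OUTPUTS = {
    "binary": {"score": True},
    "multiclass": {"label": "good"},
    "custom": {"tone": 4, "issues": ["Greeting is overly formal."]},
}

if __name__ == "__main__":
    print(json.dumps(EXAMPLE_OUTPUTS, indent=2))
```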
Include Reasoning for Better Quality
Enable the reasoning option to have the LLM explain its evaluation. This tends to improve the quality of the judgment because the model works through its assessment before committing to a score.
When you include reasoning, the evaluator returns both the score and a detailed explanation of how it arrived at that judgment.
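A minimal sketch of consuming a score-plus-reasoning result, assuming the evaluator returns a JSON object with a `reasoning` string and a boolean `score`. Both field names and the sample response are hypothetical. The sample places `reasoning` first, which mirrors the idea that the explanation is produced before the score.

```python
import json
from dataclasses import dataclass


@dataclass
class Judgment:
    """A parsed evaluation result (field names are assumptions)."""
    reasoning: str
    score: bool


def parse_judgment(raw: str) -> Judgment:
    """Parse a JSON response and check that both fields have the expected types."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    if not isinstance(data.get("reasoning"), str):
        raise ValueError("'reasoning' must be a string")
    if not isinstance(data.get("score"), bool):
        raise ValueError("'score' must be a boolean")
    return Judgment(reasoning=data["reasoning"], score=data["score"])


# Illustrative response; a real one would come from the evaluator.
raw_response = '{"reasoning": "The answer cites the correct year and source.", "score": true}'
judgment = parse_judgment(raw_response)

if __name__ == "__main__":
    print("pass" if judgment.score else "fail", "-", judgment.reasoning)
```

Keeping the explanation next to the score makes it easier to audit individual failures than a bare pass/fail value would.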
Advanced: Raw JSON Schema
For complete control, provide a raw JSON schema. The evaluator will return responses that match your exact structure.
This lets you capture multiple scores, categorical labels, confidence levels, and custom fields in a single evaluation pass. You can structure the output however your workflow requires.
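The following sketch shows what a raw schema combining several scores, a categorical label, a confidence level, and a custom field might look like, together with a small checker for flat objects. The schema contents and the `find_violations` helper are assumptions made for illustration. The checker covers only `type`, `enum`, `minimum`, `maximum`, `required`, and `additionalProperties`, so a full validator such as the third-party `jsonschema` package would be the safer choice in practice.

```python
from typing import Any

# Hypothetical raw schema; every field name here is illustrative.
RAW_SCHEMA = {
    "type": "object",
    "properties": {
        "accuracy": {"type": "integer", "minimum": 1, "maximum": 5},
        "relevance": {"type": "integer", "minimum": 1, "maximum": 5},
        "label": {"type": "string", "enum": ["excellent", "good", "fair", "poor"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        "issues": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["accuracy", "relevance", "label", "confidence"],
    "additionalProperties": False,
}

# Mapping from JSON Schema type names to Python types.
_PY_TYPES = {
    "integer": int,
    "number": (int, float),
    "string": str,
    "array": list,
    "boolean": bool,
}


def find_violations(output: dict, schema: dict) -> list:
    """Return human-readable problems; an empty list means the output conforms.

    Only top-level fields are checked; array items are not inspected.
    """
    problems = []
    props = schema["properties"]
    for name in schema.get("required", []):
        if name not in output:
            problems.append(f"missing required field: {name}")
    for name, value in output.items():
        if name not in props:
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected field: {name}")
            continue
        rule = props[name]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        wrong_bool = isinstance(value, bool) and rule["type"] != "boolean"
        if wrong_bool or not isinstance(value, _PY_TYPES[rule["type"]]):
            problems.append(f"{name}: expected {rule['type']}")
            continue
        if "enum" in rule and value not in rule["enum"]:
            problems.append(f"{name}: value not in allowed set")
        if "minimum" in rule and value < rule["minimum"]:
            problems.append(f"{name}: below minimum")
        if "maximum" in rule and value > rule["maximum"]:
            problems.append(f"{name}: above maximum")
    return problems


good = {"accuracy": 4, "relevance": 5, "label": "good", "confidence": 0.8, "issues": []}
bad = {"accuracy": 7, "label": "great", "confidence": 0.5}

if __name__ == "__main__":
    print(find_violations(good, RAW_SCHEMA))
    print(find_violations(bad, RAW_SCHEMA))
```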
Use Custom Schemas in Evaluation
Once configured, your custom schemas plug into the existing evaluation workflow, and the evaluation dashboard displays the results with all your custom fields visible.
This makes it easy to analyze multiple dimensions of quality in a single evaluation run.
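If you also want to analyze results outside the dashboard, a summary over several quality dimensions could be computed along these lines. The result rows, field names, and the `summarize` helper are all hypothetical, and the sketch assumes the results are available as a list of dicts.

```python
from collections import Counter
from statistics import mean

# Illustrative results from one evaluation run; values are made up for the example.
results = [
    {"accuracy": 4, "relevance": 5, "label": "good"},
    {"accuracy": 2, "relevance": 4, "label": "fair"},
    {"accuracy": 5, "relevance": 5, "label": "excellent"},
]


def summarize(rows, numeric_fields, label_field):
    """Average each numeric dimension and count how often each label appears."""
    averages = {field: mean(row[field] for row in rows) for field in numeric_fields}
    label_counts = dict(Counter(row[label_field] for row in rows))
    return averages, label_counts


averages, label_counts = summarize(results, ["accuracy", "relevance"], "label")

if __name__ == "__main__":
    for field, value in averages.items():
        print(f"{field}: {value:.2f}")
    print(label_counts)
```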
Example Use Cases
Binary Score with Reasoning: Return a simple correct/incorrect judgment along with an explanation of why the output succeeded or failed.
Multi-dimensional Feedback: Capture separate scores for accuracy, relevance, completeness, and tone in one evaluation. Include reasoning for each dimension.
Structured Classification: Return categorical labels (excellent/good/fair/poor) along with specific issues found and suggestions for improvement.
Getting Started
To use custom output schemas with LLM-as-a-Judge:
- Open the evaluator configuration
- Select your desired output type (binary, multiclass, or custom JSON)
- Enable reasoning if you want explanations
- For complete control, provide a raw JSON schema
- Run your evaluation
Learn more in the LLM-as-a-Judge documentation.