Customize LLM-as-a-Judge Output Schemas

November 10, 2025

The LLM-as-a-Judge evaluator now supports custom output schemas. You can define exactly what feedback structure you need for your evaluations.

What's New

Flexible Output Types

Configure the evaluator to return different types of outputs:

Binary: Return a simple yes/no or pass/fail score
Multiclass: Choose from multiple predefined categories
Custom JSON: Define any structure that fits your use case
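
As a rough illustration, the binary and multiclass output types can be thought of as small JSON schemas. The sketch below is an assumption about what such schemas might look like, not the evaluator's actual internal format; the field name "score" and both helper functions are hypothetical.

```python
import json


def binary_schema() -> dict:
    """Schema for a pass/fail verdict. The field name "score" is an assumption."""
    return {
        "type": "object",
        "properties": {"score": {"type": "boolean"}},
        "required": ["score"],
        "additionalProperties": False,
    }


def multiclass_schema(labels: list) -> dict:
    """Schema that restricts the verdict to one of the given category labels."""
    if len(labels) < 2:
        raise ValueError("multiclass output needs at least two labels")
    return {
        "type": "object",
        "properties": {"score": {"type": "string", "enum": list(labels)}},
        "required": ["score"],
        "additionalProperties": False,
    }


if __name__ == "__main__":
    print(json.dumps(binary_schema(), indent=2))
    print(json.dumps(multiclass_schema(["excellent", "good", "fair", "poor"]), indent=2))
```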

Include Reasoning for Better Quality

Enable the reasoning option to have the LLM explain its evaluation. This tends to improve the quality of the judgment because the model works through its assessment before committing to a score.

When you include reasoning, the evaluator returns both the score and a detailed explanation of how it arrived at that judgment.
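
One way to picture the reasoning option is as an extra explanation field added to the output schema ahead of the score. The helper below is a minimal sketch under that assumption; `with_reasoning` and the field name "reasoning" are hypothetical, and whether the evaluator generates fields in schema order is not confirmed here.

```python
import copy


def with_reasoning(schema: dict, field: str = "reasoning") -> dict:
    """Return a copy of an object schema with a required explanation field listed first."""
    result = copy.deepcopy(schema)
    # Rebuild the properties dict so the explanation precedes the existing fields.
    properties = {field: {"type": "string"}}
    properties.update(result.get("properties", {}))
    result["properties"] = properties
    # Mark the explanation as required without duplicating it.
    result["required"] = [field] + [
        name for name in result.get("required", []) if name != field
    ]
    return result


binary = {
    "type": "object",
    "properties": {"score": {"type": "boolean"}},
    "required": ["score"],
}
augmented = with_reasoning(binary)
```

Listing the explanation before the score mirrors the idea that the model should reason first and score second. The original schema is left unmodified.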

Advanced: Raw JSON Schema

For complete control, provide a raw JSON schema. The evaluator will return responses that match your exact structure.

This lets you capture multiple scores, categorical labels, confidence levels, and custom fields in a single evaluation pass. You can structure the output however your workflow requires.
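
The following sketch shows what a raw JSON schema combining several scores, a categorical label, and a confidence level might look like, along with a small standard-library check that a judge response matches it. The field names, score ranges, and the `find_problems` helper are illustrative assumptions; the checker covers only the keywords used in this schema and is not a full JSON Schema validator.

```python
import json

# Hypothetical schema: field names and score ranges are illustrative assumptions.
FEEDBACK_SCHEMA = {
    "type": "object",
    "properties": {
        "accuracy": {"type": "integer", "minimum": 1, "maximum": 5},
        "relevance": {"type": "integer", "minimum": 1, "maximum": 5},
        "label": {"type": "string", "enum": ["excellent", "good", "fair", "poor"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["accuracy", "relevance", "label", "confidence"],
    "additionalProperties": False,
}

# Maps the JSON Schema type names used above to Python types.
_PY_TYPES = {"integer": int, "number": (int, float), "string": str}


def find_problems(response: dict, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means the response fits."""
    problems = []
    props = schema["properties"]
    for name in schema.get("required", []):
        if name not in response:
            problems.append(f"missing field: {name}")
    for name, value in response.items():
        if name not in props:
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected field: {name}")
            continue
        rule = props[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _PY_TYPES[rule["type"]]):
            problems.append(f"wrong type for {name}")
            continue
        if "enum" in rule and value not in rule["enum"]:
            problems.append(f"{name} is not an allowed value")
        if "minimum" in rule and value < rule["minimum"]:
            problems.append(f"{name} is below the minimum")
        if "maximum" in rule and value > rule["maximum"]:
            problems.append(f"{name} is above the maximum")
    return problems


raw = '{"accuracy": 4, "relevance": 5, "label": "good", "confidence": 0.8}'
problems = find_problems(json.loads(raw), FEEDBACK_SCHEMA)
```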

Use Custom Schemas in Evaluation

Once configured, your custom schemas work in the standard evaluation workflow. The evaluation dashboard displays the results with every custom field visible.

This makes it easy to analyze multiple dimensions of quality in a single evaluation run.
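
If you also want to analyze results outside the dashboard, a per-field summary is straightforward once each result is a dictionary that follows your schema. This is a minimal sketch that assumes results are already available as a list of such dictionaries; how they are exported is not covered here, and `summarize` and the field names are hypothetical.

```python
from collections import Counter
from statistics import mean


def summarize(results: list) -> dict:
    """Average numeric fields and count the values of string and boolean fields."""
    numeric = {}
    categorical = {}
    for row in results:
        for field, value in row.items():
            # Check bool before int: bool is a subclass of int in Python.
            if isinstance(value, (bool, str)):
                categorical.setdefault(field, Counter())[str(value)] += 1
            elif isinstance(value, (int, float)):
                numeric.setdefault(field, []).append(value)
    return {
        "means": {field: mean(values) for field, values in numeric.items()},
        "counts": {field: dict(counter) for field, counter in categorical.items()},
    }


results = [
    {"accuracy": 4, "tone": 5, "label": "good"},
    {"accuracy": 2, "tone": 5, "label": "fair"},
]
summary = summarize(results)
```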

Example Use Cases

Binary Score with Reasoning: Return a simple correct/incorrect judgment along with an explanation of why the output succeeded or failed.

Multi-dimensional Feedback: Capture separate scores for accuracy, relevance, completeness, and tone in one evaluation. Include reasoning for each dimension.

Structured Classification: Return categorical labels (excellent/good/fair/poor) along with specific issues found and suggestions for improvement.

Getting Started

To use custom output schemas with LLM-as-a-Judge:

1. Open the evaluator configuration.
2. Select your desired output type (binary, multiclass, or custom JSON).
3. Enable reasoning if you want explanations.
4. For advanced use, provide a raw JSON schema.
5. Run your evaluation.

Learn more in the LLM-as-a-Judge documentation.
