createTally
Create a Tally container and run evaluations.
createTally()
The main entry point for running evaluations. Creates a Tally container that orchestrates the evaluation pipeline.
Import
import { createTally } from 'tally';
import type { Tally, TallyRunReport } from 'tally';
createTally()
Creates a type-safe Tally container for running evaluations. Pass your data and evals directly; no intermediate evaluator objects are needed.
data: Array of data containers to evaluate (Conversation[] or DatasetItem[]).
evals: Array of eval definitions (single-turn, multi-turn, or scorer evals).
Optional shared evaluation context for all evals.
Single-turn run policy (which steps to evaluate).
Optional metadata for this run.
tally.run()
Executes the evaluation pipeline and returns a type-safe report.
Optional cache for metric results (avoids recomputing identical inputs).
Optional LLM execution options (temperature, retries, etc.).
Optional metadata to include in the report.
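To illustrate what a metric-result cache saves, here is a minimal sketch of memoizing scores by input. It does not use Tally's cache interface, which this page does not show. MemoryMetricCache, cacheKey, expensiveMetric, scoreWithCache, and computeCount are hypothetical names, and the metric is a synchronous stand-in for what would normally be an LLM call.

```typescript
// Hypothetical sketch: none of these names come from Tally's API.
class MemoryMetricCache {
  private readonly store = new Map<string, number>();

  // Return the cached score for `key`, computing it only on the first request.
  getOrCompute(key: string, compute: () => number): number {
    const cached = this.store.get(key);
    if (cached !== undefined) return cached;
    const score = compute();
    this.store.set(key, score);
    return score;
  }
}

// Build a key from the metric name and its input.
// Note: JSON.stringify is sensitive to object key order.
function cacheKey(metricName: string, input: unknown): string {
  return `${metricName}:${JSON.stringify(input)}`;
}

let calls = 0;

// Stand-in for an expensive metric; counts how often it really runs.
function expensiveMetric(text: string): number {
  calls += 1;
  return Math.min(1, text.length / 10);
}

function computeCount(): number {
  return calls;
}

function scoreWithCache(cache: MemoryMetricCache, text: string): number {
  return cache.getOrCompute(cacheKey('length-score', text), () => expensiveMetric(text));
}

const cache = new MemoryMetricCache();
scoreWithCache(cache, 'Hello!'); // computed
scoreWithCache(cache, 'Hello!'); // identical input, served from the cache
console.log(computeCount()); // 1
```

The point of keying on both the metric name and the input is that two different metrics never share a cached score, while repeated runs over the same data skip the recomputation.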
Example
import {
  createTally,
  defineSingleTurnEval,
  defineMultiTurnEval,
  thresholdVerdict,
} from 'tally';
import type { Conversation } from 'tally';
import { createAnswerRelevanceMetric, createRoleAdherenceMetric } from 'tally/metrics';
import { google } from '@ai-sdk/google';

const model = google('models/gemini-2.5-flash-lite');
// Create metrics
const answerRelevance = createAnswerRelevanceMetric({ provider: model });
const roleAdherence = createRoleAdherenceMetric({
  expectedRole: 'helpful assistant',
  provider: model,
});

// Create evals with verdict policies
const relevanceEval = defineSingleTurnEval({
  name: 'Answer Relevance',
  metric: answerRelevance,
  verdict: thresholdVerdict(0.7),
});
const roleEval = defineMultiTurnEval({
  name: 'Role Adherence',
  metric: roleAdherence,
  verdict: thresholdVerdict(0.8),
});
// Create Tally and run
const conversation: Conversation = {
  id: 'conv-1',
  steps: [
    {
      stepIndex: 0,
      input: { role: 'user', content: 'Hello!' },
      output: [{ role: 'assistant', content: 'Hi there! How can I help?' }],
      timestamp: new Date(),
    },
  ],
};

const tally = createTally({
  data: [conversation],
  evals: [relevanceEval, roleEval],
});
const report = await tally.run();
// Type-safe access via view API
const view = report.view();
const step0 = view.step(0);
console.log(step0['Answer Relevance']?.outcome?.verdict);
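The example attaches thresholdVerdict(0.7) and thresholdVerdict(0.8) to its evals. As a rough mental model only, a threshold verdict can be pictured as a function from a score to pass or fail. The sketch below is not Tally's implementation: makeThresholdVerdict and the Verdict type are hypothetical, and treating a score exactly at the threshold as a pass is an assumption.

```typescript
// Hypothetical sketch, not Tally's implementation.
type Verdict = 'pass' | 'fail';

// Assumption: a score at or above the threshold passes.
function makeThresholdVerdict(threshold: number): (score: number) => Verdict {
  return (score) => (score >= threshold ? 'pass' : 'fail');
}

const relevanceVerdict = makeThresholdVerdict(0.7);
const roleVerdict = makeThresholdVerdict(0.8);

// The same score can pass one policy and fail a stricter one.
console.log(relevanceVerdict(0.75)); // pass
console.log(roleVerdict(0.75)); // fail
```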