Normalizers

Normalizers convert raw metric values into normalized scores in the 0–1 range. This enables consistent comparison and combination of metrics with different scales.

Import

import {
  createMinMaxNormalizer,
  createZScoreNormalizer,
  createThresholdNormalizer,
  createLinearNormalizer,
  createOrdinalMapNormalizer,
  createIdentityNormalizer,
  createCustomNormalizer,
} from '@tally-evals/tally/normalization';

Normalizer Factories

`createMinMaxNormalizer()`

Normalizes values using min-max scaling, which maps a value within a known range to a 0–1 score via (value - min) / (max - min). Most commonly used for LLM scores (e.g., 0–5 → 0–1).

min:number

Minimum value in the expected range.

max:number

Maximum value in the expected range.

clip?:boolean

If true, clamps output to [0, 1].

direction?:'higher' | 'lower'

'higher' = higher values are better (default). 'lower' = lower values are better.

Example:

// LLM scores 0-5 → normalized 0-1
const normalizer = createMinMaxNormalizer({ min: 0, max: 5, clip: true });
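
The scaling itself can be sketched in a few lines. This is a minimal standalone illustration of min-max scaling, not the library's implementation: `minMaxScore` and `MinMaxOptions` are hypothetical names, and treating `direction: 'lower'` as `1 - score` is an assumption.

```typescript
// Hypothetical helper illustrating min-max scaling; not part of @tally-evals/tally.
interface MinMaxOptions {
  min: number;
  max: number;
  clip?: boolean;
  direction?: 'higher' | 'lower';
}

function minMaxScore(value: number, opts: MinMaxOptions): number {
  const { min, max, clip = false, direction = 'higher' } = opts;
  if (max === min) {
    throw new Error('min and max must differ');
  }
  // Linear position of the value within [min, max].
  let score = (value - min) / (max - min);
  // Assumption: 'lower' inverts the scale so that smaller raw values score higher.
  if (direction === 'lower') {
    score = 1 - score;
  }
  // Clamp out-of-range inputs into [0, 1] when requested.
  return clip ? Math.min(1, Math.max(0, score)) : score;
}
```

Under these assumptions, `minMaxScore(2.5, { min: 0, max: 5 })` evaluates to 0.5, and with `clip: true` an out-of-range input such as 7 is clamped to 1.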

`createZScoreNormalizer()`

Normalizes values using z-score standardization. Centers values around the mean and scales by standard deviation. Best for distributions where you want to measure deviation from "typical" values.

mean:number

Mean value for z-score calculation.

stdDev:number

Standard deviation for z-score calculation.

clip?:boolean

If true, clamps output to [0, 1].

direction?:'higher' | 'lower'

Preferred direction for scoring. 'higher' = higher values are better. 'lower' = lower values are better.

to?:'0-1' | '0-100'

Target scale for output.

Example:

// Normalize latency relative to typical values
const normalizer = createZScoreNormalizer({
  mean: 150,
  stdDev: 50,
  direction: 'lower', // Lower latency is better
});
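
A z-score is unbounded, so some mapping onto 0–1 is needed. The sketch below shows one possible mapping and is not the library's implementation: `zScoreToUnit` is a hypothetical name, and the linear squashing of z in [-3, 3] onto [0, 1] is an assumption chosen only for illustration.

```typescript
// Hypothetical helper; the squashing step is an assumption, not the library's behavior.
function zScoreToUnit(
  value: number,
  mean: number,
  stdDev: number,
  direction: 'higher' | 'lower' = 'higher',
): number {
  if (stdDev <= 0) {
    throw new Error('stdDev must be positive');
  }
  // Standard z-score: distance from the mean in units of standard deviation.
  let z = (value - mean) / stdDev;
  // For 'lower', flip the sign so values below the mean score higher.
  if (direction === 'lower') {
    z = -z;
  }
  // Assumed mapping: z in [-3, 3] maps linearly onto [0, 1], clamped outside that range.
  return Math.min(1, Math.max(0, (z + 3) / 6));
}
```

With the latency settings above (mean 150, stdDev 50, direction 'lower'), a latency equal to the mean lands at 0.5 under this mapping, and faster responses score above it.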

`createThresholdNormalizer()`

Converts numeric values to binary scores based on a threshold. Returns one score for values at or above the threshold, another for values below.

threshold:number

Threshold value for comparison.

above?:number (default: 1.0)

Score for values >= threshold.

below?:number (default: 0.0)

Score for values < threshold.

Example:

// Binary pass/fail at 0.7 threshold
const normalizer = createThresholdNormalizer({ threshold: 0.7 });
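
The comparison described above can be sketched as a single expression. `thresholdScore` is a hypothetical standalone function written for illustration, not the library's code; its defaults mirror the documented `above` and `below` defaults.

```typescript
// Hypothetical helper illustrating threshold normalization; not part of @tally-evals/tally.
function thresholdScore(
  value: number,
  threshold: number,
  above: number = 1.0,
  below: number = 0.0,
): number {
  // Values exactly at the threshold count as "above", matching the >= rule.
  return value >= threshold ? above : below;
}
```

With a threshold of 0.7, a value of exactly 0.7 yields 1 and a value of 0.69 yields 0.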

`createLinearNormalizer()`

Applies a linear transformation (slope + intercept) to values. Useful for custom scaling when min-max isn't appropriate.

slope:number

Slope of the linear transformation.

intercept:number

Intercept of the linear transformation.

clip?:[number, number]

Optional clamp range [min, max].

direction?:'higher' | 'lower'

Preferred direction for scoring. 'higher' = higher values are better. 'lower' = lower values are better.

Example:

// Custom linear scaling
const normalizer = createLinearNormalizer({
  slope: 0.2,
  intercept: 0,
  clip: [0, 1],
});
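
As a rough illustration of the transformation, the following standalone sketch applies slope and intercept and then clamps to the optional range. `linearScore` is a hypothetical name, and the `direction` option is deliberately left out because its exact effect is not described here.

```typescript
// Hypothetical helper illustrating a clamped linear transform; not part of @tally-evals/tally.
function linearScore(
  value: number,
  slope: number,
  intercept: number,
  clip?: [number, number],
): number {
  // y = slope * x + intercept
  const raw = slope * value + intercept;
  if (!clip) {
    return raw;
  }
  // Clamp the result into the [lo, hi] range when one is given.
  const [lo, hi] = clip;
  return Math.min(hi, Math.max(lo, raw));
}
```

With `slope: 0.2` and `intercept: 0`, a raw value of 5 maps to 1, which matches min-max scaling over 0–5; larger inputs are held at 1 by the `[0, 1]` clamp.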

`createOrdinalMapNormalizer()`

Maps ordinal (categorical) values to numeric scores. Use this for string/enum metrics like "Poor", "Fair", "Good", "Excellent".

map:Record<string, number>

Mapping from ordinal values to scores in [0, 1].

Example:

// Map quality grades to scores
const normalizer = createOrdinalMapNormalizer({
  map: {
    'Poor': 0,
    'Fair': 0.33,
    'Good': 0.67,
    'Excellent': 1,
  },
});
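
Conceptually this is a table lookup. The sketch below is a hypothetical standalone version, not the library's implementation; in particular, throwing on a label that is missing from the map is an assumption, since the behavior for unknown values is not specified here.

```typescript
// Hypothetical helper illustrating an ordinal lookup; not part of @tally-evals/tally.
function ordinalScore(value: string, map: Record<string, number>): number {
  // Only consult the map's own keys so inherited properties are never matched.
  const score = Object.prototype.hasOwnProperty.call(map, value)
    ? map[value]
    : undefined;
  // Assumption: an unmapped label is treated as an error rather than scored as 0.
  if (score === undefined) {
    throw new Error(`No score defined for ordinal value "${value}"`);
  }
  return score;
}

// Same grades as the example above.
const grades: Record<string, number> = {
  Poor: 0,
  Fair: 0.33,
  Good: 0.67,
  Excellent: 1,
};
```

Here `ordinalScore('Good', grades)` returns 0.67, while a label such as `'Unknown'` raises an error in this sketch.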

`createIdentityNormalizer()`

Returns numeric values unchanged. Use this when your metric already produces values in the 0–1 range. Boolean values are converted: true → 1 and false → 0.

No options required.

Example:

const normalizer = createIdentityNormalizer();

`createCustomNormalizer()`

Creates a normalizer with custom logic. Use this when built-in normalizers don't fit your use case.

normalize:(value: T, args: { context: C; metric: MetricDef }) => Score

Custom function that returns a Score in [0, 1].

Example:

// Logarithmic normalization for response length
const normalizer = createCustomNormalizer({
  normalize: (value) => {
    const maxLength = 1000;
    return Math.min(1, Math.log(value + 1) / Math.log(maxLength + 1));
  },
});
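
To see how the logarithmic curve behaves, the same formula can be evaluated on its own. `logLengthScore` is a hypothetical standalone function; returning 0 for negative or non-finite input is an added assumption, since `Math.log` would otherwise produce `NaN` for values below -1.

```typescript
// Hypothetical helper reproducing the formula above outside the library.
function logLengthScore(value: number, maxLength: number = 1000): number {
  // Assumption: invalid lengths score 0 instead of propagating NaN.
  if (!Number.isFinite(value) || value < 0) {
    return 0;
  }
  // log(value + 1) keeps a length of 0 at score 0; the divisor puts maxLength at score 1.
  return Math.min(1, Math.log(value + 1) / Math.log(maxLength + 1));
}
```

A length of 0 scores 0, a length of 1000 scores 1, and anything longer is capped at 1. Because the curve is logarithmic, a length of 99 already scores roughly two-thirds, so short responses gain score quickly and long ones gain little.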

Using Normalizers

Attach a normalizer to a metric either through the normalization field of the metric definition or by wrapping the metric with withNormalization:

import { defineBaseMetric, withNormalization } from '@tally-evals/tally';
import { createMinMaxNormalizer } from '@tally-evals/tally/normalization';

// Option 1: In the base metric
const base = defineBaseMetric({
  name: 'quality',
  valueType: 'number',
  normalization: {
    normalizer: createMinMaxNormalizer({ min: 0, max: 5, clip: true }),
  },
});

// Option 2: Using withNormalization
const baseMetric = defineBaseMetric({
  name: 'quality',
  valueType: 'number',
});

const normalizedBase = withNormalization({
  metric: baseMetric,
  normalizer: createMinMaxNormalizer({ min: 0, max: 5, clip: true }),
});

Calibration

Some normalizers need dataset-level statistics (e.g., actual min/max or mean/stdDev). Use the calibrate option:

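// Assumption: defineSingleTurnCode is exported from the package root, like defineBaseMetric
import { defineBaseMetric, defineSingleTurnCode } from '@tally-evals/tally';
import { createMinMaxNormalizer } from '@tally-evals/tally/normalization';

const metric = defineSingleTurnCode({
  base: defineBaseMetric({ name: 'latency', valueType: 'number' }),
  compute: ({ data }) => data.latencyMs,
  normalization: {
    normalizer: createMinMaxNormalizer({ min: 0, max: 1000 }),
    // Calibrate from actual data: derive the range from the observed raw values
    calibrate: async ({ rawValues }) => ({
      range: {
        min: Math.min(...rawValues),
        max: Math.max(...rawValues),
      },
    }),
  },
});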
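
The dataset-level statistics mentioned above can be computed in one pass over the raw values. The following is a minimal standalone sketch, not part of the library: `computeStats` and `DatasetStats` are hypothetical names, and the use of the population standard deviation (dividing by n) is an assumption. It loops rather than spreading the array into `Math.min`, which avoids argument-count limits on very large datasets and makes the empty case explicit.

```typescript
// Hypothetical helper for deriving calibration statistics; not part of @tally-evals/tally.
interface DatasetStats {
  min: number;
  max: number;
  mean: number;
  stdDev: number;
}

function computeStats(rawValues: number[]): DatasetStats {
  if (rawValues.length === 0) {
    throw new Error('Cannot calibrate from an empty dataset');
  }
  let min = Infinity;
  let max = -Infinity;
  let sum = 0;
  for (const v of rawValues) {
    if (v < min) min = v;
    if (v > max) max = v;
    sum += v;
  }
  const mean = sum / rawValues.length;
  // Assumption: population standard deviation (divide by n, not n - 1).
  let squaredDiffs = 0;
  for (const v of rawValues) {
    squaredDiffs += (v - mean) ** 2;
  }
  const stdDev = Math.sqrt(squaredDiffs / rawValues.length);
  return { min, max, mean, stdDev };
}
```

The `min` and `max` fields correspond to what a min-max normalizer needs, and `mean` and `stdDev` to what a z-score normalizer needs.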
