Core Concepts

Understand Mesmer's public concepts: objectives, runs, techniques, state, operators, workflows, transitions, strategies and services, and benchmarks.

Mesmer separates technique definition from workload execution.

Objectives

Objectives describe what an experiment is trying to elicit or measure. Use ObjectiveSource.single(...) for one goal, local datasets for batches, or RemoteDatasetSource for pinned remote benchmark data.

Runs

A Run combines objectives, an attack technique, and a target. It is the smallest unit the runner can execute.

Techniques

A Technique is the user-facing attack recipe: Probe, BestOfNProbe, FrontierSearch, ConversationAgentProbe, PopulationFuzzing, or a custom algorithm. Built-in techniques infer their state schema from the operators they compose.

State

State is typed runtime memory. It is composed from slices such as objective, frontier, attempts, target responses, evaluations, constraints, population pool, rewards, feedback, stop signal, and metadata.

Operators

An Operator is one executable state transition. It declares which state slices it reads and writes, receives runtime state, and returns a patch. Common operators include proposal, transform application, constraint checking, filtering, target query, evaluation, feedback, stopping, and population generation.
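The read/write contract described above can be illustrated with a minimal, self-contained sketch. It does not use Mesmer's API: SketchOperator, apply_patch, and the slice names are hypothetical stand-ins chosen for illustration, and Mesmer's actual operator interface may differ.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet

State = Dict[str, Any]
Patch = Dict[str, Any]


@dataclass(frozen=True)
class SketchOperator:
    """Hypothetical stand-in for an operator; not Mesmer's actual class."""
    name: str
    reads: FrozenSet[str]
    writes: FrozenSet[str]
    fn: Callable[[State], Patch]

    def run(self, state: State) -> Patch:
        # Expose only the declared read slices to the operator body.
        view = {key: state[key] for key in self.reads}
        patch = self.fn(view)
        # Reject patches that touch slices the operator did not declare.
        undeclared = set(patch) - self.writes
        if undeclared:
            raise ValueError(
                f"{self.name} wrote undeclared slices: {sorted(undeclared)}"
            )
        return patch


def apply_patch(state: State, patch: Patch) -> State:
    # Build a new state dict; the input state is left untouched.
    return {**state, **patch}


def propose(view: State) -> Patch:
    # Toy proposal step: add one candidate derived from the objective.
    return {"attempts": view["attempts"] + [f"attempt for: {view['objective']}"]}


proposal_op = SketchOperator(
    name="proposal",
    reads=frozenset({"objective", "attempts"}),
    writes=frozenset({"attempts"}),
    fn=propose,
)

initial_state = {"objective": "demo goal", "attempts": [], "metadata": {}}
patch = proposal_op.run(initial_state)
next_state = apply_patch(initial_state, patch)
```

Returning a patch rather than mutating state in place keeps each step easy to inspect: the patch alone shows what the operator changed.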

Workflows

A Workflow is the internal control algebra that composes operators into sequences and loops. Most users author techniques, not raw workflows.
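To make "sequences and loops" concrete, here is a small sketch of two combinators over state-transforming steps. The names sequence and loop_until, and the toy propose and evaluate steps, are assumptions made for illustration; they are not Mesmer's internal workflow types.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
Step = Callable[[State], State]


def sequence(*steps: Step) -> Step:
    """Run the steps in order, feeding each result into the next."""
    def run(state: State) -> State:
        for step in steps:
            state = step(state)
        return state
    return run


def loop_until(body: Step, should_stop: Callable[[State], bool], max_iters: int) -> Step:
    """Repeat the body until should_stop is true or max_iters is reached."""
    def run(state: State) -> State:
        for _ in range(max_iters):
            if should_stop(state):
                break
            state = body(state)
        return state
    return run


def propose(state: State) -> State:
    # Toy step: count one more attempt.
    return {**state, "attempts": state["attempts"] + 1}


def evaluate(state: State) -> State:
    # Toy step: raise the stop signal after three attempts.
    return {**state, "stop": state["attempts"] >= 3}


workflow = loop_until(sequence(propose, evaluate), lambda s: s["stop"], max_iters=10)
final_state = workflow({"attempts": 0, "stop": False})
```

The max_iters bound is a design choice of this sketch: it guarantees termination even if the stop condition never fires.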

Transitions

Every operator execution records a Transition: before snapshot, patch summary, after snapshot, events, artifacts, timing, and errors. Together, these records form the replay and debugging layer.
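The following sketch shows one way such a record could be captured around each step. It is illustrative only: SketchTransition and run_recorded are hypothetical names, events and artifacts are omitted for brevity, and the fields Mesmer actually stores may be shaped differently.

```python
import copy
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]


@dataclass
class SketchTransition:
    """Hypothetical record of one operator execution."""
    operator: str
    before: State
    patch_keys: List[str]
    after: State
    duration_s: float
    error: Optional[str] = None


def run_recorded(
    name: str,
    fn: Callable[[State], State],
    state: State,
    log: List[SketchTransition],
) -> State:
    # Snapshot the state before the operator runs.
    before = copy.deepcopy(state)
    started = time.perf_counter()
    patch: State = {}
    error = None
    try:
        patch = fn(state)
    except Exception as exc:
        # A failed operator contributes no patch; the error is recorded instead.
        error = repr(exc)
    duration = time.perf_counter() - started
    after = copy.deepcopy({**before, **patch})
    log.append(SketchTransition(name, before, sorted(patch), after, duration, error))
    return after


def query_target(state: State) -> State:
    # Toy target query that appends a canned response.
    return {"responses": state["responses"] + ["refused"]}


def broken_evaluator(state: State) -> State:
    raise RuntimeError("evaluator unavailable")


log: List[SketchTransition] = []
state: State = {"responses": []}
state = run_recorded("target_query", query_target, state, log)
state = run_recorded("evaluation", broken_evaluator, state, log)
```

Because each entry holds both snapshots, a failed step can be inspected afterwards without rerunning the steps that preceded it.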

Strategies And Services

Strategies such as proposers, transforms, selectors, evaluators, mutators, conditions, constraints, and sources configure operators. They do not execute alone.

Benchmarks

Benchmarks run many experiments with shared metrics such as success rate, turns, queries, and cost. Reports can also preserve an EvidenceMatrix for row-level attempt evidence and a BudgetCurve for success as target-call budget increases.
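As a rough illustration of two of these metrics, the sketch below computes a success rate and a success-versus-budget curve from per-experiment results. The ExperimentResult type, both function names, and the sample data are hypothetical; this is not how Mesmer's reports or its BudgetCurve are implemented.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ExperimentResult:
    """Hypothetical per-experiment outcome used only for this sketch."""
    name: str
    queries_to_success: Optional[int]  # None means the experiment never succeeded


def success_rate(results: Sequence[ExperimentResult]) -> float:
    # Fraction of experiments that succeeded at any query count.
    if not results:
        return 0.0
    solved = sum(1 for r in results if r.queries_to_success is not None)
    return solved / len(results)


def budget_curve(
    results: Sequence[ExperimentResult], budgets: Sequence[int]
) -> List[Tuple[int, float]]:
    # For each budget, the fraction of experiments solved within that many target calls.
    curve = []
    for budget in budgets:
        solved = sum(
            1
            for r in results
            if r.queries_to_success is not None and r.queries_to_success <= budget
        )
        curve.append((budget, solved / len(results) if results else 0.0))
    return curve


results = [
    ExperimentResult("a", 2),
    ExperimentResult("b", 7),
    ExperimentResult("c", None),
    ExperimentResult("d", 4),
]
rate = success_rate(results)
curve = budget_curve(results, [1, 5, 10])
```

With this sample data, three of the four experiments succeed, so the rate is 0.75, and the curve rises from 0.0 at a budget of 1 to 0.5 at 5 and 0.75 at 10.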
