0001 Agentic Evals Baseline
This notebook is the working scratchpad for the first experiment. It exists to keep the baseline logic simple, visible, and easy to rerun.
Python
prompt_count = 4  # number of prompt fixtures in the baseline set
rubric_version = "draft-v0.1"  # rubric is still a draft; bump on any change
print(f"Prompts loaded: {prompt_count}")
print(f"Scoring rubric: {rubric_version}")
Prompts loaded: 4
Scoring rubric: draft-v0.1
Immediate next steps
- Add prompt fixtures.
- Record baseline outputs.
- Convert failures into explicit scoring checks.
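The steps above could be sketched as follows. This is a minimal, hypothetical shape for the experiment, not the actual implementation: the fixture fields (`id`, `prompt`, `must_include`), the `score` function, and the stand-in baseline outputs are all assumptions made for illustration.

```python
# Hypothetical sketch of the next steps: prompt fixtures, recorded
# baseline outputs, and explicit scoring checks. Names are assumptions.

prompt_fixtures = [
    {"id": "p1", "prompt": "Summarize the changelog.", "must_include": "changelog"},
    {"id": "p2", "prompt": "List the failing tests.", "must_include": "test"},
]

def score(output: str, fixture: dict) -> bool:
    """Explicit scoring check: output must mention the fixture's keyword."""
    return fixture["must_include"] in output.lower()

# Baseline outputs recorded here as stand-in strings instead of model calls.
baseline_outputs = {"p1": "The changelog covers two fixes.", "p2": "No tests failed."}

fixtures_by_id = {f["id"]: f for f in prompt_fixtures}
results = {fid: score(out, fixtures_by_id[fid]) for fid, out in baseline_outputs.items()}
print(results)
```

A failure observed during the baseline run would then be converted into a new keyword (or stricter predicate) in the corresponding fixture, keeping each check explicit and rerunnable.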