AI usage coach — prompt quality and token optimization

Ticket #224: Setting up an AI usage coach agent focused on prompt quality and token optimization
Type: Automation / Workflow / Governance
Affected Component: .github/agents/ai-coach.agent.md, .github/hooks/ai-coach-session-start.json, scripts/hooks/ai_coach_session_start.ps1, .github/prompts/ai-coach-report.prompt.md, logs/ai_coach_session_starts.jsonl


1. Context and objective

Over the past few weeks, the main concern around AI usage on this project has shifted. Initially, the focus was on prompt quality: formulating requests clearly to avoid wasted time, unnecessary back-and-forth, and misunderstandings with the AI assistant. That discipline has proven effective for development efficiency.

Gradually, a second dimension emerged: token consumption. A project of this scale generates long chat sessions, heavily loaded contexts, and iterative exchanges. Prompting effectively was no longer enough. The goal became optimizing each session to produce the same outcome using as few tokens as possible.

This session was therefore aimed at creating a coach agent capable of:

  • observing the quality of requests made, the level of context provided, and actions performed;
  • measuring and signaling the token pressure exerted over each session;
  • producing a progress-oriented report with concrete recommendations, specifically to reduce token waste without sacrificing output quality.

2. Implemented solution

The agent was designed around two complementary layers, with token efficiency as the primary throughline.

Layer 1 — Observation

The agent analyzes each session across three axes:

  • request quality: clarity of objective, level of context provided, precision of constraints;
  • work progression: number of iterations, presence of avoidable rework, quality of the sequence (brief → execution → validation);
  • token pressure: waste signals actively detected — unnecessarily re-injected context, overly broad requests, incomplete briefs generating corrections, repeated reformulations of the same need.
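To make the third axis concrete, the waste signals could be detected with simple heuristics over a session's event log. Everything below is a hypothetical sketch for illustration only — the event schema, field names, and thresholds are assumptions, not the agent's actual implementation:

```python
from collections import Counter

def detect_waste_signals(events):
    """Flag token-waste signals in a list of session events.

    Each event is assumed (hypothetically) to be a dict with at least:
      - "type": e.g. "user_prompt", "assistant_reply"
      - "text": the message content
    """
    signals = []
    prompts = [e["text"] for e in events if e["type"] == "user_prompt"]

    # Repeated reformulations of the same need: the same prompt sent twice.
    if any(n > 1 for n in Counter(prompts).values()):
        signals.append("repeated_reformulation")

    # Overly broad requests: very short prompts carrying no constraints.
    if any(len(p.split()) < 5 for p in prompts):
        signals.append("overly_broad_request")

    # Unnecessarily re-injected context: a long earlier prompt pasted again.
    for i, p in enumerate(prompts[1:], start=1):
        if any(prev in p for prev in prompts[:i] if len(prev) > 200):
            signals.append("reinjected_context")
            break

    return signals
```

A real detector would work on the session's JSONL events file rather than an in-memory list, but the logic is the same: cheap, rule-based proxies rather than exact token counts.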

Layer 2 — Coaching

Based on its observations, the agent produces on demand a structured report including:

  • an impact-oriented executive summary;
  • the session's strengths and weaknesses;
  • up to three actionable recommendations, prioritized by their token-reduction potential;
  • a scorecard by dimension, with token efficiency flagged as the priority dimension;
  • a qualitative token trend estimate when session history is available.
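The report contents listed above map naturally onto a small data structure. The sketch below is a plausible shape, not the agent's actual schema — the dimension names, score scale, and field names are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Scorecard:
    """One score per coaching dimension; token efficiency is flagged as the priority."""
    scores: dict = field(default_factory=dict)          # e.g. {"token_efficiency": 6}
    priority_dimension: str = "token_efficiency"

@dataclass
class CoachReport:
    executive_summary: str
    strengths: list
    weaknesses: list
    recommendations: list        # ordered by token-reduction potential
    scorecard: Scorecard
    token_trend: str = "unknown" # qualitative estimate when history is available

    def top_recommendations(self, limit=3):
        """Cap the list at three actionable recommendations, as the report does."""
        return self.recommendations[:limit]
```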

Activation infrastructure

To ensure the coach is operational without manual intervention, four components were configured:

  • .github/ai-coach.config.json — central configuration for the hybrid coach model: defines the default capture mode (events-only) and transcript retention settings;
  • scripts/hooks/ai_coach_session_start.ps1 — enriched startup hook: generates a unique identifier (session_id) at the opening of each new session and creates the dedicated session artifacts;
  • .github/agents/ai-coach.agent.md — agent configured in “one session at a time” mode: reads exclusively the active session file, with no access to previous sessions;
  • .github/prompts/ai-coach-report.prompt.md — prompt updated to target the active session only and trigger the report with a single command.
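For reference, the central configuration described above could look like the fragment below. The key names and values are illustrative assumptions; the actual schema of .github/ai-coach.config.json is not reproduced here:

```json
{
  "capture_mode": "events-only",
  "full_audit": {
    "enabled": false,
    "note": "complete transcript capture; enable only for short, targeted periods"
  },
  "transcript_retention_days": 7,
  "sessions_dir": "logs/ai_coach_sessions",
  "active_session_pointer": "logs/ai_coach_active_session.json"
}
```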

Concrete behavior

At every session startup, the following happens:

  1. A new unique session_id is generated automatically.
  2. The coach creates a dedicated events file for that session in logs/ai_coach_sessions/.
  3. A pointer to the active session is recorded in logs/ai_coach_active_session.json.
  4. When a report is requested, the coach reads the active-session pointer first, with no mixing of data from previous sessions.
  5. Full audit mode (complete transcript) is built into the architecture but disabled by default to limit risk and cost.
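Steps 1 through 3 are implemented in scripts/hooks/ai_coach_session_start.ps1. The Python sketch below mirrors that logic purely for illustration — the file layout matches the paths named above, but the pointer's field names are assumptions:

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path

def start_session(root="logs"):
    """Mirror the session-start hook: new session_id, events file, active pointer."""
    session_id = uuid.uuid4().hex                       # step 1: unique identifier
    sessions_dir = Path(root) / "ai_coach_sessions"
    sessions_dir.mkdir(parents=True, exist_ok=True)

    events_file = sessions_dir / f"{session_id}.jsonl"  # step 2: dedicated events file
    events_file.touch()

    pointer = {                                         # step 3: active-session pointer
        "session_id": session_id,
        "events_file": str(events_file),
        "started_at": datetime.now(timezone.utc).isoformat(),
    }
    (Path(root) / "ai_coach_active_session.json").write_text(json.dumps(pointer))
    return pointer
```

Returning the pointer as JSON is what lets the validation step below confirm the hook ran correctly.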

3. Validation and outcome

The infrastructure was validated in three steps:

  1. Hook execution: the script ran successfully and returned valid JSON containing the generated session_id.
  2. Artifact creation: a new active session was created in logs/ai_coach_sessions/ and its pointer recorded in logs/ai_coach_active_session.json.
  3. Report generated in isolated mode: the ai-coach agent produced a report limited exclusively to the active session, with no access to previous sessions.

What is now possible:

  • at the start of each new session, an isolated observation space is created automatically — no manual intervention required;
  • the coach analyzes one session at a time, without contamination between sessions;
  • on demand, a report is generated with analysis targeted at high-token-impact behaviors for that specific session;
  • the recommendations received are directly actionable: better scoping of requests, avoiding unnecessary context, framing iterations from the initial brief.

This setup is the first building block of a structured approach to tracking progress as an AI user, with a specific focus on managing session costs.


4. Limits and points of attention

  • Indirect token measurement: the current tooling does not expose a real-time token counter. The coach therefore works on proxy signals — exchange length, number of iterations, request complexity — and produces a qualitative estimate. If explicit data becomes available, it will automatically be prioritized.
  • Session scope: the SessionStart hook applies to new sessions only. A session already open at the time of setup will not trigger automatic initialization.
  • Session-by-session analysis: by design, the coach evaluates one session at a time from its dedicated events. It does not produce automatic multi-session summaries; identifying trends across sessions requires an explicit request with context provided by the user.
  • Two available modes: the default mode (events-only) is lightweight, governed, and session-scoped — it captures only useful events. A full audit mode (complete transcript) is built into the architecture and can be enabled via a flag in .github/ai-coach.config.json, but remains disabled by default to limit risk and cost. It is designed for short, targeted periods.
  • Analysis depth conditional on captured events: in events-only mode, the depth of coaching depends on the events recorded in the session file. Sessions with few events will produce less precise recommendations.
  • User-driven reporting: the coach observes continuously but only produces a report on explicit request. This is a deliberate choice to avoid adding overhead to ongoing sessions.
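Since token measurement is indirect, the qualitative estimate mentioned in the first point above has to be built from proxy signals. A minimal sketch of such an estimator, with invented thresholds and weights, might look like this:

```python
def estimate_token_pressure(exchange_chars, iterations, reworks):
    """Qualitative token-pressure estimate from proxy signals.

    All thresholds are invented for illustration:
      exchange_chars: total characters exchanged in the session
      iterations:     number of prompt/reply round trips
      reworks:        corrections caused by incomplete briefs
    """
    score = 0
    score += 2 if exchange_chars > 50_000 else (1 if exchange_chars > 20_000 else 0)
    score += 2 if iterations > 15 else (1 if iterations > 8 else 0)
    score += 2 if reworks > 3 else (1 if reworks > 1 else 0)

    if score >= 4:
        return "high"
    if score >= 2:
        return "moderate"
    return "low"
```

If the tooling later exposes real token counts, a function like this would simply be replaced by the explicit data, as the first point above anticipates.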