Tutorial: The Reporting Pipeline End to End¶
What You’ll Build¶
This is a hands-on tour of axiom-graph on one small, real codebase. By the end you’ll have watched the three things that make the tool worth running work together on the same data: intent-scoped retrieval (reading exactly the node you need instead of a whole file or doc), drift detection (the graph noticing when code and its docs/tests fall out of sync), and verification (clearing that drift so the graph stays trustworthy).
The sample is a reporting pipeline — a tiny three-function module that loads a CSV, normalizes and filters it, and exports a JSON report. It’s deliberately small so you can hold the whole mesh in your head while watching it react to change.
The pipeline lives in reporting/pipeline.py:
def load_data(path: str) -> pd.DataFrame:
"""Read the raw CSV from disk."""
return pd.read_csv(path)
def transform(df: pd.DataFrame) -> pd.DataFrame:
"""Normalize columns and filter rows where status == 'active'."""
df = df.rename(columns=str.lower)
return df[df['status'] == 'active']
def export_report(df: pd.DataFrame, dest: str) -> None:
"""Write the filtered DataFrame to JSON."""
df.to_json(dest, orient='records')
Around that code sit the artifacts every real project accumulates: three unit tests (one per function), one end-to-end test decorated with @workflow, a developer design spec with a section per function, and a consumer user guide. The consumer guide is linked through the dev spec rather than straight at the code — the same proxy pattern this documentation site uses, which you’ll see pay off in Scenario 4.
If you haven’t installed and connected axiom-graph yet, start with Use the CLI and Connect Your Agent, then come back here.
Step 1 — Build the Mesh¶
From the project root, run one command:
axiom-graph build .
The scanner walks the code and the docs and records what it finds as a typed graph — nodes joined by intent-typed edges. That graph is the product; everything else in this tutorial is a read or a write against it.
Nodes¶
3 code nodes —
reporting.pipeline.load_data,.transform,.export_report(one per function, not one per file — you can read and reason about a single function in isolation)4 test nodes —
test_load_data_reads_csv,test_transform_filters_inactive,test_export_report_writes_json,test_reporting_pipeline_e2e1 workflow envelope node —
test_reporting_pipeline_e2e@workflow, the orchestration highway the@workflowdecorator annotates6 doc-section nodes — 3 under the dev design spec (one per function) and 3 under the consumer guide (intro, guide, example). Each section is its own node, so docs are addressable at the same granularity as code.
Edges¶
From |
Edge |
To |
How it’s discovered |
|---|---|---|---|
Each unit test |
|
Its production function |
Auto — AST call-graph scan of the test body |
E2E test |
|
All three production functions |
Auto |
Each dev-spec section |
|
Its production function |
Author-declared in DocJSON |
Consumer-guide section |
|
Dev-spec section |
Author-declared, tagged for transitive propagation |
E2E workflow envelope |
|
E2E test function |
|
E2E workflow envelope |
|
Each production function |
|
Notice the edges carry intent, not just connectivity: validates is different from documents is different from annotates. That typing is what lets drift detection (next steps) and intent-scoped retrieval (Step 2) be two reads against the same mesh rather than two separate tools. The semantic layer here — @workflow + AutoStep — is not a full call graph; it annotates only the orchestration highway with step names and intent, which is why the E2E test grows a workflow envelope but the plain unit tests don’t.
Confirm everything is in sync:
axiom-graph check .
Every node reports VERIFIED. No drift, no stale flags. Green baseline established — now we change something.
Step 2 — Pull Only the Context You Need¶
Before changing the code, get oriented the cheap way. The reflex on an unfamiliar codebase is to open files and scroll, or to grep and read the hits. The mesh lets you load far less.
Read one function, not the file. Ask the graph for the body of a single node by ID and you get just that function’s source, by line range:
axiom_graph_source(node_id="proj::reporting.pipeline::transform")
Read one doc section, not the whole doc. The same applies to documentation — pull a single section by its slug instead of the entire guide:
axiom_graph_read_doc(doc_id="proj::docs.design.reporting-pipeline", section="transform")
Traverse to exactly the linked nodes. To learn what depends on transform before you touch it, follow the edges rather than guessing at call sites:
axiom_graph_graph(node_id="proj::reporting.pipeline::transform", direction="in")
That returns the three tests that validates it and the dev-spec section that documents it — the precise blast radius of an edit, with no false positives from a text search and nothing extra loaded into context. This is the core value for an agent: it arrives at the relevant function, its tests, and its docs having read a few hundred tokens instead of a few thousand. The CLI and viz offer the same traversal for humans, but the MCP server is the primary surface — agents query the mesh directly. See Connect Your Agent for the wiring.
Now you know exactly what a change to transform will touch. Let’s make one.
Step 3 — Change Code and Watch Drift Propagate¶
The change: fix a bug in transform(). The old filter, status == 'active', drops trial users who should be included. Change it to status.isin(['active', 'trial']). Body only — the docstring stays as it is for now.
Rebuild and check:
axiom-graph build .
axiom-graph check .
own: 1 CONTENT_UPDATED / 0 DESC_UPDATED / 0 RENAMED / 0 NOT_FOUND · link: 5 LINKED_STALE / 0 BROKEN_LINK · 11 VERIFIED
transform CONTENT_UPDATED
test_transform_filters_inactive LINKED_STALE via reporting.pipeline.transform
test_reporting_pipeline_e2e LINKED_STALE via reporting.pipeline.transform
docs.design.reporting-pipeline::transform LINKED_STALE via reporting.pipeline.transform
docs.consumer.generating-reports::example LINKED_STALE via docs.design.reporting-pipeline::transform
test_reporting_pipeline_e2e@workflow LINKED_STALE via reporting.pipeline.transform
One node changed its own content (CONTENT_UPDATED), and that signal travelled outward along the edges you traced in Step 2 — the same mesh, read for drift instead of retrieval, which is why it can name why each node flipped:
the unit test and the E2E test, each 1 hop via
validatesthe dev-spec section, 1 hop via
documentsthe consumer-doc example, 2 hops via transitive
documents(consumer doc → dev spec → code) — the consumer guide rides the proxy chain even though it never links to code directlythe workflow envelope, via
annotates+delegates_to
The via breadcrumb is the payoff: a call-graph tool flags only the tests; a markdown doc-sync tool misses the transitive consumer doc; neither tells you the reason. The mesh catches all five and points at the offender. That LINKED_STALE state is sticky on purpose — editing the linked file won’t silently clear it. Only an explicit verification does, which is Step 5. (More on the model in staleness.)
Step 4 — Edit a Docstring and Watch Drift Stay Bounded¶
Not every change should set off the same alarm. Reset to the Step 3 state, then this time change only the docstring on transform() — from "Normalize columns and filter rows." to "Rename columns to lowercase and filter rows where status is 'active' or 'trial'." Leave the body alone. Rebuild and check:
own: 0 CONTENT_UPDATED / 1 DESC_UPDATED / 0 RENAMED / 0 NOT_FOUND · link: 1 LINKED_STALE / 0 BROKEN_LINK · 15 VERIFIED
transform DESC_UPDATED
test_reporting_pipeline_e2e@workflow LINKED_STALE via reporting.pipeline.transform
The difference is the whole point. A docstring edit produces DESC_UPDATED, not CONTENT_UPDATED, and it does not storm-cloud through the call graph. The unit test’s validates edge stays green; so does the dev-spec documents edge and the consumer-doc transitive link. The behavior didn’t change, so the things that assert on behavior don’t need a second look.
The one node that does flip is the workflow envelope — because a @workflow(purpose=...) decorator can reference the function’s description in its own metadata, so a description change is a legitimate reason to re-read it. That’s a catch a plain text search can’t make: it requires knowing the type of the relationship.
The practical effect: your cleanup is two acknowledgements, not fifteen. axiom-graph distinguishes “the code does something different” from “we described it better,” and only the first kind cascades. This precision is what keeps drift detection a trustworthy signal instead of noise people learn to ignore.
Step 5 — Verify to Clear the Drift¶
Take the behavioral change from Step 3 — the five LINKED_STALE nodes — and resolve them. Verification is how drift gets cleared: you bring each flagged artifact back into agreement with the code, then record that you did. Working down the via list:
Unit test. Update
test_transform_filters_inactiveto expect both active and trial rows. Run it; it passes. Mark it verified:axiom-graph mark-clean proj::tests::test_transform_filters_inactive .
Dev-spec section. The prose still says “filters rows where status == ‘active’” — now wrong. Patch it via
axiom_graph_update_sectionto “filters rows where status is ‘active’ or ‘trial’.” Editing a DocJSON section through the write tools auto-records verification at the new hash, so this section clears without a separatemark-clean.Consumer-guide example. Its expected output showed only active rows. Refresh the example.
E2E test + workflow envelope. Run the E2E test, confirm it passes, and
mark-cleanthe test and its envelope.
Now axiom-graph check . is green again. The distinction between own_status (a node’s own content) and link_status (its links) matters here: a CONTENT_UPDATED or DESC_UPDATED node — your intentional code change — is promotable with mark-clean, because you are vouching that the new content is correct. A LINKED_STALE node clears only when you verify that section itself with mark-clean — not by marking the upstream code clean — so update the prose or test it points at, then verify it. The mesh won’t let you wave away drift by asserting a stale doc is fine — you fix the doc.
This is the loop that keeps the graph honest — and the loop the consumer docs you’re reading run through too, via the same dev-doc proxy. See the docs honesty loop for that story end to end.
Step 6 — Let an Agent Draft the Update¶
Steps 3 and 5 were the manual loop. With an agent connected over MCP, the same work compresses into a reviewable pull request. Make the Step 3 bug fix inside Claude Code (or Cursor, or any MCP client), then say:
“Draft updates to any docs and tests that went stale.”
The agent works the mesh the same way you did, just faster:
Reads
axiom_graph_checkand sees the fiveLINKED_STALEnodesFor each, calls
axiom_graph_sourcefor the current code,axiom_graph_read_docfor the linked dev-spec section, andaxiom_graph_diffto see exactly what changed intransform— pulling only the context it needs, not loading the repoDrafts new section text via
axiom_graph_update_sectionEdits the unit-test assertions directly
Runs the suite and confirms green
Opens a PR: the code change, the updated test, the patched dev-spec section, the refreshed consumer example, and a comment listing which nodes are ready to verify after review
What lands in front of the human is one ordinary PR with diffs. You review the doc edits the same way you review code — at PR time, in context — and merge if they’re right or ask the agent to redo a section if they’re not. On merge, CI runs axiom-graph check and confirms zero LINKED_STALE remain.
This is the flywheel: the agent drafts, the human reviews, and mesh density grows with every PR instead of decaying between doc sprints. The graph is the substrate that makes it possible — an agent without it makes the code change and never knows the linked docs exist. For the deeper version of agents consuming the mesh to do real work, see the PEV nexus.
Step 7 — Rename a Function Without Breaking the Mesh¶
One change reliably breaks naive link tracking: moving a symbol. Rename transform() to normalize() to match newer conventions — body unchanged, just the name. Rebuild:
axiom-graph build .
axiom-graph check .
own: 0 CONTENT_UPDATED / 0 DESC_UPDATED / 1 RENAMED / 0 NOT_FOUND · link: 0 LINKED_STALE / 0 BROKEN_LINK · 15 VERIFIED
Zero broken links. The build’s rename matcher noticed that reporting.pipeline.transform disappeared and reporting.pipeline.normalize appeared with a near-identical body, and re-homed every edge — the three validates edges, the documents edge, and the annotates + delegates_to from the workflow envelope — onto the new identity. A pure rename with an unchanged body is caught by an exact-hash pre-pass before any scoring runs; otherwise git narrows the candidate pool and a body-similarity score (default threshold 0.6) makes the call. The new node reports RENAMED rather than a fresh creation, and the dev-spec section still reads “transform” in its prose but already points at normalize.
This is exactly why consumer docs link through a dev-doc proxy instead of at raw code: the link target is stable under renames, so the page you’re reading doesn’t break when a symbol moves underneath it.
Acknowledge the rename and you’re green:
axiom-graph mark-clean proj::reporting.pipeline::normalize .
When detection needs a hand¶
If a rename lands together with a big body rewrite, similarity can fall below threshold: the old node goes NOT_FOUND, the new one looks newly created, and a RENAME_SCORING_SKIPPED history event flags it as a rename candidate. Weld it manually:
axiom-graph rename apply proj::reporting.pipeline::transform proj::reporting.pipeline::normalize .
The escape hatch enforces a safety contract — old_id must be NOT_FOUND and new_id a fresh live node — and is symmetric: axiom-graph rename revert proj::reporting.pipeline::normalize . un-welds it cleanly. Both have MCP equivalents (axiom_graph_apply_rename / axiom_graph_revert_rename) so an agent can do the same.
Rename-awareness is what keeps drift detection from drowning in false positives every time someone tidies a name — the signal stays about real divergence.
What You Just Saw¶
Seven steps on one tiny module, and the three capabilities never operated in isolation:
Intent-scoped retrieval (Step 2) read one function, one doc section, and the exact dependents of a node — the same traversal an agent uses to keep its working set small.
Drift detection rode those identical edges: a body change cascaded full-mesh through
validates,documents, transitivedocuments,annotates, anddelegates_to(Step 3); a docstring edit stayed bounded to the one node that referenced it (Step 4); a rename re-homed every edge with nothing broken (Step 7).Verification (Step 5) cleared the drift — and the model’s refusal to let you
mark-cleanaLINKED_STALEnode is what forces docs and tests to actually catch up, keeping the graph trustworthy over time.
The through-line is that there is one mesh. The graph you query for context is the graph that detects drift is the graph you verify against. That’s why the alternatives each miss something a piece at a time — a call graph plus git blame ignores the docs/tests half entirely; a doc-sync tool flags the dev spec but not the transitive consumer doc or the workflow layer; a hosted docs host happily publishes the stale version. And it’s why an agent (Step 6) can draft real doc and test updates: the mesh is the substrate that tells it what went stale and why.
Where to go next¶
The mesh — nodes and intent-typed edges, the model underneath everything here
Staleness — the full drift vocabulary and the sticky
LINKED_STALEruleUse the CLI and Connect Your Agent — run these steps on your own project
The docs honesty loop — how this very site is built the way this tutorial describes