The Ontology¶
Node Types¶
axiom-graph does not model your codebase as “files and symbols.” It models it as processes and entities, using a type system derived from the W3C Provenance Ontology (PROV-O). This is a deliberate, principled choice: a node type describes the role a thing plays in getting work done, not the language construct it happens to be written in. A Python function, a JavaScript arrow function, an axiom-annotation @task, and a doc section are all the same shape of thing — a unit of work — so they share a type.
There are exactly three structural node types. (The model started with five; ADR-009 collapsed document and constraint into the process/entity split, because almost everything in a codebase is either something that does work or something that work consumes or produces.)
Node Type |
What it represents |
Examples |
|---|---|---|
atomic_process |
The smallest meaningful unit of work — a leaf with no sub-parts. |
A Python function, a test, a single doc section, a workflow step |
composite_process |
A container that orchestrates or holds other processes. Scale-invariant. |
A module, a package, a doc file, a config file, a |
entity |
Data consumed or produced by a process — not logic. |
A third-party package, a data file, a trained model, a config artifact |
The key idea is scale-invariance for composite processes: a module, a package, a pipeline, and a @workflow envelope are all composite_process. The model does not care how big the container is, only that it contains other processes. Entities sit at the edges of the graph — they are always treated as CLEAN by staleness checks, because a data file has no logic to drift.
Concretely: a main() and the test that exercises it are both atomic_process; the module holding them and the @workflow that wraps it are both composite_process; the pandas they import is an entity.
This principled typing is what lets one mesh hold code, docs, tests, and orchestration together. Because everything is reduced to three shapes plus subtypes, an agent (or you) can traverse from a doc section straight to the function it documents without ever leaving the type system — that is the context-reduction payoff of the mesh.
Subtypes¶
Each of the three node types carries an optional subtype that records what kind of process or entity it is. The structural type drives validation and propagation; the subtype is the human- and agent-readable detail.
Subtype |
Node Type |
Meaning |
|---|---|---|
|
atomic_process |
A Python function or method, or a JS/TS function |
|
atomic_process |
A test function (identified by a |
|
atomic_process |
A single section within a DocJSON file |
|
atomic_process |
A |
|
atomic_process |
An |
|
atomic_process |
A leaf (simple) state in an xstate v5 machine |
|
composite_process |
A DocJSON file (contains sections via |
|
composite_process |
A configuration file (settings, hooks, skills) |
|
composite_process |
A function decorated with |
|
composite_process |
A function decorated with |
|
composite_process |
An xstate v5 |
|
composite_process |
A compound (non-leaf) xstate state with substates |
|
entity |
A third-party dependency outside the standard library |
|
entity |
A data file or dataset a workflow consumes or produces |
|
entity |
A configuration artifact (e.g. a hyperparameter set) |
|
entity |
A trained model produced by a workflow |
Note that some subtype names appear under more than one structural type. A docjson atomic_process is one section; a docjson composite_process is the whole file that composes those sections — granular documentation falls straight out of the type system, since each section is its own addressable node. Likewise state is an atomic_process when it is a leaf state and a composite_process when it nests substates.
The workflow, task, step, autostep, state_machine, and state subtypes belong to the semantic layer — the annotated orchestration highways covered in Annotations. They are how axiom-graph records intent and step order through a pipeline without trying to build a full call graph.
Edge Types¶
If node types are the nouns, edges are the verbs — and they are intent-typed. Each edge declares why two nodes are related, not merely that they reference each other. This is the single biggest difference between axiom-graph and a generic call graph or import graph: documents, validates, and delegates_to are first-class semantics, not labels stapled onto a raw reference.
Edge Type |
From → To |
Meaning |
|---|---|---|
|
composite → atomic / composite |
Structural containment: module → function, doc file → section, workflow → step |
|
process → process / entity |
Import or execution dependency |
|
process → process |
Runtime invocation across a boundary; also how an |
|
composite → atomic / composite |
A |
|
atomic → atomic / composite |
Test coverage: a test verifies a function |
|
process → any |
A doc section describes a node |
|
process → entity |
A process reads an entity as input |
|
process → entity |
A process creates an entity as output |
|
atomic → process / entity |
A schema or contract restricts behavior |
|
process → process |
A replacement obsoletes the thing it replaces |
The edges you will meet most often are composes, depends_on, documents, and validates. Because intent is encoded in the edge, the same typed mesh answers two very different questions with one read: “what does this depend on?” (traverse depends_on) and “what is now out of date?” (a staleness read that follows documents and validates). The mesh is the product; drift detection and intent-scoped retrieval are just two reads against it.
Edge intent also determines how far drift travels. Propagation depth is chosen by what an edge means, not by uniform graph distance — validates propagates one hop (test → production), documents is transitive (doc → doc → code), and annotates is one hop. That transitive documents chain is exactly what lets consumer docs like this one ride a code → dev-doc → consumer-doc path and inherit staleness when the underlying code changes.
Build-Time Validation¶
The type system is not advisory — it is enforced. Every node type, edge type, and the legal node-type pairings for each side of every edge are declared in a single file, axiom_graph/ontology.yaml. When you run a build, the builder checks each edge against these rules, and an edge whose endpoints violate the ontology fails the build.
This is what keeps the mesh trustworthy. You cannot, for example, point a validates edge from a doc section, or hang a composes edge off an atomic_process — the schema forbids it, so the graph can never drift into an incoherent shape. The rules are precise: composes may only go from a composite to a process; consumes and produces may only target an entity; annotates runs strictly from an envelope to its function.
Validation runs as part of the normal index build:
axiom-graph build
Because the ontology lives in one declarative YAML file rather than being scattered through scanner code, the type system is also where axiom-graph’s framework-awareness is extended. Teaching the indexer a new declarative framework means adding its node and edge shapes here — the xstate v5 scanner (which introduced the state_machine and state subtypes) is the proven example of that extensibility in action.
Language Support¶
The payoff of a process-centric type system is that languages share types. axiom-graph currently scans two languages, plus xstate v5 state machines as a concept extracted on top of JS/TS.
Language |
Scanner |
What it captures |
Install |
|---|---|---|---|
Python |
AST-based |
Functions, methods, classes, modules, packages, imports, docstrings, decorators, |
Built in — no extra install |
JavaScript / TypeScript |
tree-sitter |
Named functions, arrow functions, class and object methods, HOF wrappers, ESM imports, test-runner calls, xstate v5 machines |
|
Python support is comprehensive and included by default. The JS/TS scanner is an optional dependency: when tree-sitter is not installed, .js and .ts files are silently skipped rather than erroring.
The important point is that both languages feed the same three node types. A Python function and a JS/TS function both become an atomic_process with subtype function. A Python module and a JS/TS module both become a composite_process (with no subtype — the language is implicit in the file path, so the model needs no per-language node types). Tests, imports, and dependencies map onto the same validates, depends_on, and composes edges regardless of source language. The xstate scanner is the only place that adds language-specific subtypes — state_machine and state — and it does so by extending the ontology, not by forking it.
To turn on JS/TS scanning, point axiom-graph at the paths to scan in your axiom-graph.toml:
[axiom_graph.scan]
js_paths = ["src/**/*.ts", "lib/**/*.js"]
See Configuration for the full set of scan settings. Because the type schema is shared, a single graph traversal crosses language boundaries seamlessly — an agent following the mesh never has to know, or care, which language a given node was written in.