Graph API
Build CSR relationship indexes, traverse neighbors, find shortest paths, and run worst-case optimal joins via the Rust Graph API.
Teide's Graph API operates on the same lazy DAG as all other operations. You build a Rel (compressed sparse row index) from table columns, then call traversal operations on a Graph. Everything is optimized and executed together when you call execute().
Rel — Building the Index
A Rel is a CSR-encoded adjacency index between two sets of nodes. There are two ways to build one:
From a foreign key column
Use Rel::build() when one table has a column of I64 row offsets pointing into a target table:
use teide::{Context, Rel};
let ctx = Context::new()?;
let edges = ctx.read_csv("edges.csv")?; // columns: src, dst, weight
// Build a CSR from the "dst" foreign key column.
// n_target_nodes = number of rows in the target table.
let rel = Rel::build(&edges, "dst", 1000, true)?;
| Parameter | Type | Description |
|---|---|---|
from_table | &Table | Table containing the foreign key column |
fk_col | &str | Name of the I64 column with target row offsets |
n_target_nodes | i64 | Number of rows in the target table (upper bound on IDs) |
sort | bool | Sort target lists for deterministic traversal order |
From an edge table
Use Rel::from_edges() when you have explicit source/destination columns:
let edges = ctx.read_csv("friendships.csv")?; // columns: person_a, person_b
let rel = Rel::from_edges(
&edges,
"person_a", // source column
"person_b", // destination column
1000, // number of source nodes
1000, // number of destination nodes
true, // sort targets
)?;
| Parameter | Type | Description |
|---|---|---|
edge_table | &Table | Table with source and destination columns |
src_col | &str | Name of the I64 source node column |
dst_col | &str | Name of the I64 destination node column |
n_src | i64 | Number of source nodes |
n_dst | i64 | Number of destination nodes |
sort | bool | Sort target lists for deterministic traversal order |
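Both constructors produce the same underlying structure: an offsets array indexed by source node plus a flat targets array holding each node's neighbor list. The standalone sketch below shows that encoding in plain Rust; it is illustrative only and does not depict Teide's internal layout (the `csr_from_edges` helper is hypothetical):

```rust
// Conceptual sketch of what a CSR index stores (not Teide's internals):
// offsets[v]..offsets[v + 1] delimits node v's slice of the targets array.
fn csr_from_edges(edges: &[(i64, i64)], n_src: usize, sort: bool) -> (Vec<usize>, Vec<i64>) {
    // Count the out-degree of each source node.
    let mut offsets = vec![0usize; n_src + 1];
    for &(s, _) in edges {
        offsets[s as usize + 1] += 1;
    }
    // Prefix-sum the counts into slice boundaries.
    for i in 0..n_src {
        offsets[i + 1] += offsets[i];
    }
    // Scatter destinations into each node's slice.
    let mut targets = vec![0i64; edges.len()];
    let mut cursor = offsets.clone();
    for &(s, d) in edges {
        targets[cursor[s as usize]] = d;
        cursor[s as usize] += 1;
    }
    // The `sort` flag corresponds to sorting each target list for
    // deterministic traversal order.
    if sort {
        for v in 0..n_src {
            targets[offsets[v]..offsets[v + 1]].sort_unstable();
        }
    }
    (offsets, targets)
}

fn main() {
    // Edges: 0->2, 0->1, 1->2 over 3 nodes.
    let (offsets, targets) = csr_from_edges(&[(0, 2), (0, 1), (1, 2)], 3, true);
    assert_eq!(offsets, vec![0, 2, 3, 3]);
    assert_eq!(targets, vec![1, 2, 2]); // node 0's neighbors sorted: [1, 2]
    println!("CSR ok");
}
```

Reading node `v`'s neighbors is then a contiguous slice `targets[offsets[v]..offsets[v + 1]]`, which is why CSR traversal is cache-friendly.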
Rel — Persistence
Relationship indexes can be saved to disk and loaded back, avoiding the cost of rebuilding from raw data:
// Save to disk
rel.save("/data/indexes/friends")?;
// Load from disk (full copy into memory)
let rel = Rel::load("/data/indexes/friends")?;
// Memory-map from disk (zero-copy, pages on demand)
let rel = Rel::mmap("/data/indexes/friends")?;
| Method | Description |
|---|---|
save(dir) | Serialize the CSR to the given directory |
load(dir) | Deserialize from disk into memory |
mmap(dir) | Memory-map from disk (zero-copy, OS pages data on demand) |
Tip: Use mmap for large relationship indexes. The OS will page in only the portions accessed during traversal, keeping memory usage proportional to query scope rather than index size.
Direction Constants
Traversal operations accept a direction parameter controlling which edges are followed:
| Constant | Value | Meaning |
|---|---|---|
TD_DIR_FWD | 0 | Follow forward edges only (src → dst) |
TD_DIR_REV | 1 | Follow reverse edges only (dst → src) |
TD_DIR_BOTH | 2 | Follow edges in both directions |
use teide::ffi::{TD_DIR_FWD, TD_DIR_REV, TD_DIR_BOTH};
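The constants select which side of each edge is followed during expansion. As a standalone illustration of the semantics (the `neighbors` helper is hypothetical, not part of Teide):

```rust
// Mirror the documented constant values for illustration.
const TD_DIR_FWD: u8 = 0;
const TD_DIR_REV: u8 = 1;
const TD_DIR_BOTH: u8 = 2;

// Which 1-hop neighbors of `node` each direction yields over an edge list.
fn neighbors(edges: &[(i64, i64)], node: i64, dir: u8) -> Vec<i64> {
    let mut out = Vec::new();
    for &(s, d) in edges {
        // Forward: follow src -> dst.
        if (dir == TD_DIR_FWD || dir == TD_DIR_BOTH) && s == node {
            out.push(d);
        }
        // Reverse: follow dst -> src.
        if (dir == TD_DIR_REV || dir == TD_DIR_BOTH) && d == node {
            out.push(s);
        }
    }
    out.sort_unstable();
    out
}

fn main() {
    let edges = [(0, 1), (2, 0)];
    assert_eq!(neighbors(&edges, 0, TD_DIR_FWD), vec![1]); // 0 -> 1
    assert_eq!(neighbors(&edges, 0, TD_DIR_REV), vec![2]); // 2 -> 0, reversed
    assert_eq!(neighbors(&edges, 0, TD_DIR_BOTH), vec![1, 2]);
}
```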
Multi-Table Support
A Graph is created from one primary table, but you can register additional tables and scan their columns in the same computation:
let ctx = Context::new()?;
let people = ctx.read_csv("people.csv")?;
let posts = ctx.read_csv("posts.csv")?;
let mut g = ctx.graph(&people)?;
// Register the posts table; get back a table ID
let posts_id = g.add_table(&posts);
// Scan columns from the secondary table
let post_title = g.scan_table(posts_id, "title")?;
let post_author = g.scan_table(posts_id, "author_id")?;
| Method | Returns | Description |
|---|---|---|
add_table(&table) | u16 | Register a table and return its ID for use with scan_table |
scan_table(id, col) | Column | Scan a named column from a registered table |
Graph Traversal
expand — 1-hop neighbor expansion
Expand each source node to its immediate neighbors via the CSR index:
let ctx = Context::new()?;
let people = ctx.read_csv("people.csv")?;
let edges = ctx.read_csv("follows.csv")?;
let rel = Rel::from_edges(&edges, "src", "dst", 1000, 1000, true)?;
let mut g = ctx.graph(&people)?;
let src = g.scan("id")?;
let neighbors = g.expand(src, &rel, TD_DIR_FWD)?;
let result = g.execute(neighbors)?;
| Parameter | Type | Description |
|---|---|---|
src_nodes | Column | I64 column of source node IDs |
rel | &Rel | CSR relationship index |
direction | u8 | TD_DIR_FWD, TD_DIR_REV, or TD_DIR_BOTH |
Output columns:
| Column | Type | Description |
|---|---|---|
_src | I64 | Source node ID |
_dst | I64 | Destination node ID (neighbor) |
var_expand — Variable-length BFS
Breadth-first traversal from start nodes, expanding between min_depth and max_depth hops:
let start = g.scan("id")?;
let reached = g.var_expand(
start,
&rel,
TD_DIR_FWD,
1, // min_depth
3, // max_depth
false, // track_path
)?;
let result = g.execute(reached)?;
| Parameter | Type | Description |
|---|---|---|
start | Column | I64 column of starting node IDs |
rel | &Rel | CSR relationship index |
direction | u8 | Edge direction |
min_depth | u8 | Minimum number of hops (inclusive) |
max_depth | u8 | Maximum number of hops (inclusive) |
track_path | bool | Whether to record full traversal paths |
Output columns:
| Column | Type | Description |
|---|---|---|
_start | I64 | Starting node ID |
_end | I64 | Reached node ID |
_depth | I64 | Number of hops from start to end |
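As a standalone sketch of the depth-bound semantics, assuming a classic visited-set BFS that reports each node at its shortest depth (the `var_expand_bfs` helper is hypothetical and does not depict Teide's implementation, which may differ, e.g. when track_path is enabled):

```rust
use std::collections::{HashSet, VecDeque};

// BFS from one start node, emitting (_start, _end, _depth) rows for every
// node first reached between min_depth and max_depth hops (inclusive).
fn var_expand_bfs(
    adj: &[Vec<i64>],
    start: i64,
    min_depth: u8,
    max_depth: u8,
) -> Vec<(i64, i64, i64)> {
    let mut visited = HashSet::new();
    let mut queue = VecDeque::new();
    let mut out = Vec::new();
    visited.insert(start);
    queue.push_back((start, 0u8));
    while let Some((node, depth)) = queue.pop_front() {
        // min_depth filters shallow results (depth 0 is the start node itself).
        if depth >= min_depth {
            out.push((start, node, depth as i64));
        }
        // max_depth bounds the frontier: don't expand past it.
        if depth == max_depth {
            continue;
        }
        for &next in &adj[node as usize] {
            if visited.insert(next) {
                queue.push_back((next, depth + 1));
            }
        }
    }
    out
}

fn main() {
    // Chain 0 -> 1 -> 2, expanded 1..=2 hops from node 0.
    let adj = vec![vec![1], vec![2], vec![]];
    assert_eq!(var_expand_bfs(&adj, 0, 1, 2), vec![(0, 1, 1), (0, 2, 2)]);
}
```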
shortest_path — BFS shortest path
Find the shortest path between source and destination nodes via BFS:
let src = g.const_i64(0)?; // start node
let dst = g.const_i64(42)?; // target node
let path = g.shortest_path(src, dst, &rel, 10)?;
let result = g.execute(path)?;
| Parameter | Type | Description |
|---|---|---|
src | Column | Source node ID(s) |
dst | Column | Destination node ID(s) |
rel | &Rel | CSR relationship index |
max_depth | u8 | Maximum search depth |
Output columns:
| Column | Type | Description |
|---|---|---|
_node | I64 | Node ID along the path |
_depth | I64 | Hop distance from the source |
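A standalone sketch of the underlying algorithm: BFS with parent pointers, reconstructing the path once the destination is reached and emitting it as (_node, _depth) rows (the `shortest_path_bfs` helper is hypothetical, not Teide's implementation):

```rust
use std::collections::{HashMap, VecDeque};

// BFS shortest path from src to dst, bounded by max_depth.
// Returns (_node, _depth) rows along the path, or empty if unreachable.
fn shortest_path_bfs(adj: &[Vec<i64>], src: i64, dst: i64, max_depth: u8) -> Vec<(i64, i64)> {
    let mut parent: HashMap<i64, i64> = HashMap::new();
    let mut queue = VecDeque::new();
    parent.insert(src, src); // sentinel: src is its own parent
    queue.push_back((src, 0u8));
    while let Some((node, depth)) = queue.pop_front() {
        if node == dst {
            // Walk parent pointers back to src, then emit (_node, _depth) rows.
            let mut path = vec![node];
            let mut cur = node;
            while cur != src {
                cur = parent[&cur];
                path.push(cur);
            }
            path.reverse();
            return path.into_iter().enumerate().map(|(d, n)| (n, d as i64)).collect();
        }
        if depth == max_depth {
            continue; // search depth exhausted on this branch
        }
        for &next in &adj[node as usize] {
            if !parent.contains_key(&next) {
                parent.insert(next, node);
                queue.push_back((next, depth + 1));
            }
        }
    }
    Vec::new() // no path within max_depth
}

fn main() {
    // Diamond: 0 -> {1, 2}, both -> 3; shortest path 0 -> 3 takes 2 hops.
    let adj = vec![vec![1, 2], vec![3], vec![3], vec![]];
    assert_eq!(shortest_path_bfs(&adj, 0, 3, 10), vec![(0, 0), (1, 1), (3, 2)]);
}
```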
wco_join — Worst-case optimal join
Performs a Leapfrog Triejoin across multiple relationship indexes, finding all n-cliques or other multi-way join results in worst-case optimal time:
// Find triangles: nodes (a, b, c) where a->b, b->c, a->c
let rel_ab = Rel::from_edges(&edges, "src", "dst", n, n, true)?;
let rel_bc = Rel::from_edges(&edges, "src", "dst", n, n, true)?;
let rel_ac = Rel::from_edges(&edges, "src", "dst", n, n, true)?;
let triangles = g.wco_join(&[&rel_ab, &rel_bc, &rel_ac], 3)?;
let result = g.execute(triangles)?;
| Parameter | Type | Description |
|---|---|---|
rels | &[&Rel] | Slice of relationship indexes to join |
n_vars | u8 | Number of join variables (output columns) |
Output columns:
| Column | Type | Description |
|---|---|---|
_v0, _v1, ..., _vN | I64 | Node IDs for each join variable |
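The primitive underlying Leapfrog Triejoin is intersecting sorted neighbor lists, one per relationship, variable by variable. A simplified, standalone triangle enumeration built on that idea (illustrative only, not Teide's wco_join implementation):

```rust
// Merge-intersect two sorted i64 slices — the "leapfrog" step in miniature.
fn intersect_sorted(a: &[i64], b: &[i64]) -> Vec<i64> {
    let (mut i, mut j, mut out) = (0, 0, Vec::new());
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            std::cmp::Ordering::Less => i += 1,
            std::cmp::Ordering::Greater => j += 1,
            std::cmp::Ordering::Equal => {
                out.push(a[i]);
                i += 1;
                j += 1;
            }
        }
    }
    out
}

// Enumerate (a, b, c) with a->b, b->c, a->c over sorted adjacency lists.
fn triangles(adj: &[Vec<i64>]) -> Vec<(i64, i64, i64)> {
    let mut out = Vec::new();
    for a in 0..adj.len() as i64 {
        for &b in &adj[a as usize] {
            // c must be a neighbor of both a and b: intersect their lists.
            for c in intersect_sorted(&adj[a as usize], &adj[b as usize]) {
                out.push((a, b, c));
            }
        }
    }
    out
}

fn main() {
    // Edges 0->1, 0->2, 1->2 form the single triangle (0, 1, 2).
    let adj = vec![vec![1, 2], vec![2], vec![]];
    assert_eq!(triangles(&adj), vec![(0, 1, 2)]);
}
```

Intersecting sorted lists instead of materializing intermediate joins is what gives the worst-case optimal bound: no intermediate result can exceed the size of the final output's theoretical maximum.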
Constrained Traversal
There is no single "constrained traversal" primitive. Instead, you compose constraints by combining the Rel (structural), direction, depth bounds, and filter (predicate). All constraints are part of the same lazy DAG and are optimized together at execute() time.
Constraint types
| Constraint | Mechanism | Applied |
|---|---|---|
| Which edges exist | Rel::build / Rel::from_edges | Before traversal (index construction) |
| Edge direction | direction parameter | At traversal time |
| Hop depth | min_depth / max_depth | At traversal (var_expand, shortest_path) |
| Source node properties | filter(col, predicate) before traversal | Pre-traversal |
| Result properties | filter(col, predicate) after traversal | Post-traversal |
| Structural patterns | wco_join (Leapfrog Triejoin) | Multi-way pattern matching |
Pre-traversal filtering
Constrain which nodes start the traversal by filtering source IDs with a predicate before passing them to expand or var_expand:
let g = ctx.graph(&people)?;
// Scan columns from the people table
let id = g.scan("id")?;
let age = g.scan("age")?;
// Constraint: only start from people older than 30
let threshold = g.const_i64(30)?;
let pred = g.gt(age, threshold)?;
let old_ids = g.filter(id, pred)?;
// Traverse 1 hop from only those constrained start nodes
let neighbors = g.expand(old_ids, &rel, TD_DIR_FWD)?;
let result = g.execute(neighbors)?;
Multiple predicates can be composed with and / or:
let age = g.scan("age")?;
let city = g.scan("city")?;
// age > 30 AND city = "NYC"
let age_pred = g.gt(age, g.const_i64(30)?)?;
let city_pred = g.eq(city, g.const_str("NYC")?)?;
let combined = g.and(age_pred, city_pred)?;
let start_ids = g.filter(g.scan("id")?, combined)?;
Depth-constrained traversal
Use min_depth and max_depth on var_expand to bound how far the BFS goes:
// Exactly 2 hops (friends-of-friends, excluding direct friends)
let fof = g.var_expand(start, &rel, TD_DIR_FWD, 2, 2, false)?;
// Reachable within 1–5 hops in either direction
let reachable = g.var_expand(start, &rel, TD_DIR_BOTH, 1, 5, false)?;
// Shortest path with a maximum depth of 10
let path = g.shortest_path(src, dst, &rel, 10)?;
Post-traversal filtering
Constrain traversal results by executing the traversal first, then building a new graph on the result table to apply property predicates on destination nodes:
// Step 1: execute the traversal
let g = ctx.graph(&people)?;
let start = g.scan("id")?;
let reached = g.var_expand(start, &rel, TD_DIR_FWD, 1, 3, false)?;
let traversal_result = g.execute(reached)?;
// traversal_result columns: _start, _end, _depth
// Step 2: build a new graph on the result to filter by depth
let g2 = ctx.graph(&traversal_result)?;
let depth = g2.scan("_depth")?;
let pred = g2.le(depth, g2.const_i64(2)?)?;
// Project only the columns we need, filtered
let start_col = g2.filter(g2.scan("_start")?, pred)?;
let end_col = g2.filter(g2.scan("_end")?, pred)?;
let depth_col = g2.filter(depth, pred)?;
let projected = g2.select(start_col, &[start_col, end_col, depth_col])?;
let filtered = g2.execute(projected)?;
Enriching results with destination properties
After traversal, use add_table and scan_table to look up properties of destination nodes. Since _dst / _end values are row offsets into the target table, you can use them to index into that table's columns:
// After expand: result has _src, _dst (row offsets into people)
let g = ctx.graph(&people)?;
let start = g.scan("id")?;
let neighbors = g.expand(start, &rel, TD_DIR_FWD)?;
let result = g.execute(neighbors)?;
// Build a new graph to enrich _dst with people properties
let mut g2 = ctx.graph(&result)?;
let people_id = g2.add_table(&people);
let dst_name = g2.scan_table(people_id, "name")?; // look up name
let dst_age = g2.scan_table(people_id, "age")?; // look up age
// Filter: only neighbors younger than 25
let pred = g2.lt(dst_age, g2.const_i64(25)?)?;
let dst = g2.filter(g2.scan("_dst")?, pred)?;
let enriched = g2.execute(dst)?;
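The lookup above is conceptually a gather: each _dst value selects one row of a property column. As a standalone sketch of that operation (the `gather` helper is hypothetical, not Teide API):

```rust
// Pick rows of a property column by row offset.
fn gather<T: Copy>(column: &[T], offsets: &[i64]) -> Vec<T> {
    offsets.iter().map(|&o| column[o as usize]).collect()
}

fn main() {
    let ages = [31, 24, 57]; // people.age, by row offset
    let dst = [2, 0, 1];     // _dst row offsets produced by expand
    assert_eq!(gather(&ages, &dst), vec![57, 31, 24]);
}
```

This is why `n_target_nodes` must be an upper bound on the IDs at index-build time: every _dst value has to be a valid row offset into the target table.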
Combining all constraints
The full pattern: pre-filter source nodes, traverse with depth bounds, then post-filter results.
use teide::{Context, Rel};
use teide::ffi::TD_DIR_FWD;
let ctx = Context::new()?;
let people = ctx.read_csv("people.csv")?; // id, name, city, age
let follows = ctx.read_csv("follows.csv")?; // src, dst
let n = people.nrows() as i64;
// Structural constraint: build the CSR index
let rel = Rel::from_edges(&follows, "src", "dst", n, n, true)?;
let g = ctx.graph(&people)?;
let id = g.scan("id")?;
let city = g.scan("city")?;
// Pre-traversal constraint: start from NYC residents only
let nyc_pred = g.eq(city, g.const_str("NYC")?)?;
let nyc_ids = g.filter(id, nyc_pred)?;
// Depth + direction constraints: 1–3 hops, forward only
let reached = g.var_expand(nyc_ids, &rel, TD_DIR_FWD, 1, 3, false)?;
// Execute the constrained traversal
let result = g.execute(reached)?;
// result columns: _start, _end, _depth
Tip: Because the graph is a lazy DAG, pre-traversal filters are composed into the same execution plan as the traversal itself. The optimizer can push predicates down, fuse operators, and eliminate dead code — so building the constraint in multiple steps has no performance penalty compared to a monolithic operation.
Complete Example
End-to-end social network scenario: load data, build a relationship index, find friends-of-friends reachable from users in a specific city, and read the results.
use teide::{Context, Rel};
use teide::ffi::TD_DIR_FWD;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let ctx = Context::new()?;
// Load a people table and an edge table
let people = ctx.read_csv("people.csv")?; // id, name, city, age
let follows = ctx.read_csv("follows.csv")?; // src, dst
let n = people.nrows() as i64;
// Build a CSR relationship index from the edge table
let rel = Rel::from_edges(&follows, "src", "dst", n, n, true)?;
// Optionally persist for reuse
rel.save("/tmp/follows_rel")?;
// Create a graph rooted on the people table
let g = ctx.graph(&people)?;
// Pre-traversal constraint: only start from NYC residents
let id = g.scan("id")?;
let city = g.scan("city")?;
let pred = g.eq(city, g.const_str("NYC")?)?;
let nyc_ids = g.filter(id, pred)?;
// Depth-constrained BFS: friends-of-friends (exactly 2 hops)
let fof = g.var_expand(nyc_ids, &rel, TD_DIR_FWD, 2, 2, false)?;
// Execute the computation graph
let result = g.execute(fof)?;
// Read the output columns
let nrows = result.nrows();
println!("Found {} friend-of-friend pairs from NYC", nrows);
for i in 0..nrows.min(10) {
let start = result.get_i64(0, i as usize).unwrap();
let end = result.get_i64(1, i as usize).unwrap();
let depth = result.get_i64(2, i as usize).unwrap();
println!(" {} -> {} (depth {})", start, end, depth);
}
Ok(())
}
API Reference Summary
Rel methods
| Method | Description |
|---|---|
Rel::build(table, fk_col, n_target, sort) | Build CSR from a foreign key column |
Rel::from_edges(table, src, dst, n_src, n_dst, sort) | Build CSR from an edge table |
Rel::load(dir) | Load CSR from disk |
Rel::mmap(dir) | Memory-map CSR from disk |
rel.save(dir) | Save CSR to disk |
Graph traversal methods
| Method | Output columns | Description |
|---|---|---|
g.expand(src, rel, dir) | _src, _dst | 1-hop neighbor expansion |
g.var_expand(start, rel, dir, min, max, path) | _start, _end, _depth | Variable-length BFS |
g.shortest_path(src, dst, rel, max) | _node, _depth | BFS shortest path |
g.wco_join(rels, n_vars) | _v0, _v1, ... | Leapfrog Triejoin |
Multi-table methods
| Method | Returns | Description |
|---|---|---|
g.add_table(table) | u16 | Register a secondary table |
g.scan_table(id, col) | Column | Scan a column from a registered table |