DEVELOPER

Graph API

Build CSR relationship indexes, traverse neighbors, find shortest paths, and run worst-case optimal joins via the Rust Graph API.

Teide's Graph API operates on the same lazy DAG as all other operations. You build a Rel (compressed sparse row index) from table columns, then call traversal operations on a Graph. Everything is optimized and executed together when you call execute().

Rel — Building the Index

A Rel is a CSR-encoded adjacency index between two sets of nodes. There are two ways to build one:

From a foreign key column

Use Rel::build() when one table has a column of I64 row offsets pointing into a target table:

use teide::{Context, Rel};

let ctx = Context::new()?;
let edges = ctx.read_csv("edges.csv")?;   // columns: src, dst, weight

// Build a CSR from the "dst" foreign key column.
// n_target_nodes = number of rows in the target table.
let rel = Rel::build(&edges, "dst", 1000, true)?;
ParameterTypeDescription
from_table&TableTable containing the foreign key column
fk_col&strName of the I64 column with target row offsets
n_target_nodesi64Number of rows in the target table (upper bound on IDs)
sortboolSort target lists for deterministic traversal order

From an edge table

Use Rel::from_edges() when you have explicit source/destination columns:

let edges = ctx.read_csv("friendships.csv")?;  // columns: person_a, person_b

let rel = Rel::from_edges(
    &edges,
    "person_a",   // source column
    "person_b",   // destination column
    1000,         // number of source nodes
    1000,         // number of destination nodes
    true,         // sort targets
)?;
ParameterTypeDescription
edge_table&TableTable with source and destination columns
src_col&strName of the I64 source node column
dst_col&strName of the I64 destination node column
n_srci64Number of source nodes
n_dsti64Number of destination nodes
sortboolSort target lists for deterministic traversal order

Rel — Persistence

Relationship indexes can be saved to disk and loaded back, avoiding the cost of rebuilding from raw data:

// Save to disk
rel.save("/data/indexes/friends")?;

// Load from disk (full copy into memory)
let rel = Rel::load("/data/indexes/friends")?;

// Memory-map from disk (zero-copy, pages on demand)
let rel = Rel::mmap("/data/indexes/friends")?;
MethodDescription
save(dir)Serialize the CSR to the given directory
load(dir)Deserialize from disk into memory
mmap(dir)Memory-map from disk (zero-copy, OS pages data on demand)

Tip: Use mmap for large relationship indexes. The OS will page in only the portions accessed during traversal, keeping memory usage proportional to query scope rather than index size.

Direction Constants

Traversal operations accept a direction parameter controlling which edges are followed:

ConstantValueMeaning
TD_DIR_FWD0Follow forward edges only (src → dst)
TD_DIR_REV1Follow reverse edges only (dst → src)
TD_DIR_BOTH2Follow edges in both directions
use teide::ffi::{TD_DIR_FWD, TD_DIR_REV, TD_DIR_BOTH};

Multi-Table Support

A Graph is created from one primary table, but you can register additional tables and scan their columns in the same computation:

let ctx = Context::new()?;
let people = ctx.read_csv("people.csv")?;
let posts = ctx.read_csv("posts.csv")?;

let mut g = ctx.graph(&people)?;

// Register the posts table; get back a table ID
let posts_id = g.add_table(&posts);

// Scan columns from the secondary table
let post_title = g.scan_table(posts_id, "title")?;
let post_author = g.scan_table(posts_id, "author_id")?;
MethodReturnsDescription
add_table(&table)u16Register a table and return its ID for use with scan_table
scan_table(id, col)ColumnScan a named column from a registered table

Graph Traversal

expand — 1-hop neighbor expansion

Expand each source node to its immediate neighbors via the CSR index:

let ctx = Context::new()?;
let people = ctx.read_csv("people.csv")?;
let edges = ctx.read_csv("follows.csv")?;

let rel = Rel::from_edges(&edges, "src", "dst", 1000, 1000, true)?;

let mut g = ctx.graph(&people)?;
let src = g.scan("id")?;
let neighbors = g.expand(src, &rel, TD_DIR_FWD)?;
let result = g.execute(neighbors)?;
ParameterTypeDescription
src_nodesColumnI64 column of source node IDs
rel&RelCSR relationship index
directionu8TD_DIR_FWD, TD_DIR_REV, or TD_DIR_BOTH

Output columns:

ColumnTypeDescription
_srcI64Source node ID
_dstI64Destination node ID (neighbor)

var_expand — Variable-length BFS

Breadth-first traversal from start nodes, expanding between min_depth and max_depth hops:

let start = g.scan("id")?;
let reached = g.var_expand(
    start,
    &rel,
    TD_DIR_FWD,
    1,     // min_depth
    3,     // max_depth
    false, // track_path
)?;
let result = g.execute(reached)?;
ParameterTypeDescription
startColumnI64 column of starting node IDs
rel&RelCSR relationship index
directionu8Edge direction
min_depthu8Minimum number of hops (inclusive)
max_depthu8Maximum number of hops (inclusive)
track_pathboolWhether to record full traversal paths

Output columns:

ColumnTypeDescription
_startI64Starting node ID
_endI64Reached node ID
_depthI64Number of hops from start to end

shortest_path — BFS shortest path

Find the shortest path between source and destination nodes via BFS:

let src = g.const_i64(0)?;   // start node
let dst = g.const_i64(42)?;  // target node
let path = g.shortest_path(src, dst, &rel, 10)?;
let result = g.execute(path)?;
ParameterTypeDescription
srcColumnSource node ID(s)
dstColumnDestination node ID(s)
rel&RelCSR relationship index
max_depthu8Maximum search depth

Output columns:

ColumnTypeDescription
_nodeI64Node ID along the path
_depthI64Hop distance from the source

wco_join — Worst-case optimal join

Leapfrog Triejoin across multiple relationships. Finds all n-cliques or multi-way join results in worst-case optimal time:

// Find triangles: nodes (a, b, c) where a->b, b->c, a->c
let rel_ab = Rel::from_edges(&edges, "src", "dst", n, n, true)?;
let rel_bc = Rel::from_edges(&edges, "src", "dst", n, n, true)?;
let rel_ac = Rel::from_edges(&edges, "src", "dst", n, n, true)?;

let triangles = g.wco_join(&[&rel_ab, &rel_bc, &rel_ac], 3)?;
let result = g.execute(triangles)?;
ParameterTypeDescription
rels&[&Rel]Slice of relationship indexes to join
n_varsu8Number of join variables (output columns)

Output columns:

ColumnTypeDescription
_v0, _v1, ..., _vNI64Node IDs for each join variable

Constrained Traversal

There is no single "constrained traversal" primitive. Instead, you compose constraints by combining the Rel (structural), direction, depth bounds, and filter (predicate). All constraints are part of the same lazy DAG and are optimized together at execute() time.

Constraint types

ConstraintMechanismApplied
Which edges existRel::build / Rel::from_edgesBefore traversal (index construction)
Edge directiondirection parameterAt traversal time
Hop depthmin_depth / max_depthAt traversal (var_expand, shortest_path)
Source node propertiesfilter(col, predicate) before traversalPre-traversal
Result propertiesfilter(col, predicate) after traversalPost-traversal
Structural patternswco_join (Leapfrog Triejoin)Multi-way pattern matching

Pre-traversal filtering

Constrain which nodes start the traversal by filtering source IDs with a predicate before passing them to expand or var_expand:

let g = ctx.graph(&people)?;

// Scan columns from the people table
let id  = g.scan("id")?;
let age = g.scan("age")?;

// Constraint: only start from people older than 30
let threshold = g.const_i64(30)?;
let pred      = g.gt(age, threshold)?;
let old_ids   = g.filter(id, pred)?;

// Traverse 1 hop from only those constrained start nodes
let neighbors = g.expand(old_ids, &rel, TD_DIR_FWD)?;
let result    = g.execute(neighbors)?;

Multiple predicates can be composed with and / or:

let age  = g.scan("age")?;
let city = g.scan("city")?;

// age > 30 AND city = "NYC"
let age_pred  = g.gt(age, g.const_i64(30)?)?;
let city_pred = g.eq(city, g.const_str("NYC")?)?;
let combined  = g.and(age_pred, city_pred)?;
let start_ids = g.filter(g.scan("id")?, combined)?;

Depth-constrained traversal

Use min_depth and max_depth on var_expand to bound how far the BFS goes:

// Exactly 2 hops (friends-of-friends, excluding direct friends)
let fof = g.var_expand(start, &rel, TD_DIR_FWD, 2, 2, false)?;

// Reachable within 1–5 hops in either direction
let reachable = g.var_expand(start, &rel, TD_DIR_BOTH, 1, 5, false)?;

// Shortest path with a maximum depth of 10
let path = g.shortest_path(src, dst, &rel, 10)?;

Post-traversal filtering

Constrain traversal results by executing the traversal first, then building a new graph on the result table to apply property predicates on destination nodes:

// Step 1: execute the traversal
let g = ctx.graph(&people)?;
let start = g.scan("id")?;
let reached = g.var_expand(start, &rel, TD_DIR_FWD, 1, 3, false)?;
let traversal_result = g.execute(reached)?;
// traversal_result columns: _start, _end, _depth

// Step 2: build a new graph on the result to filter by depth
let g2 = ctx.graph(&traversal_result)?;
let depth = g2.scan("_depth")?;
let pred  = g2.le(depth, g2.const_i64(2)?)?;

// Project only the columns we need, filtered
let start_col = g2.filter(g2.scan("_start")?, pred)?;
let end_col   = g2.filter(g2.scan("_end")?, pred)?;
let depth_col = g2.filter(depth, pred)?;
let projected = g2.select(start_col, &[start_col, end_col, depth_col])?;
let filtered  = g2.execute(projected)?;

Enriching results with destination properties

After traversal, use add_table and scan_table to look up properties of destination nodes. Since _dst / _end values are row offsets into the target table, you can use them to index into that table's columns:

// After expand: result has _src, _dst (row offsets into people)
let g = ctx.graph(&people)?;
let start = g.scan("id")?;
let neighbors = g.expand(start, &rel, TD_DIR_FWD)?;
let result = g.execute(neighbors)?;

// Build a new graph to enrich _dst with people properties
let g2 = ctx.graph(&result)?;
let people_id = g2.add_table(&people);
let dst_name = g2.scan_table(people_id, "name")?;   // look up name
let dst_age  = g2.scan_table(people_id, "age")?;     // look up age

// Filter: only neighbors younger than 25
let pred = g2.lt(dst_age, g2.const_i64(25)?)?;
let dst  = g2.filter(g2.scan("_dst")?, pred)?;
let enriched = g2.execute(dst)?;

Combining all constraints

The full pattern: pre-filter source nodes, traverse with depth bounds, then post-filter results.

use teide::{Context, Rel};
use teide::ffi::TD_DIR_FWD;

let ctx = Context::new()?;
let people  = ctx.read_csv("people.csv")?;    // id, name, city, age
let follows = ctx.read_csv("follows.csv")?;   // src, dst
let n = people.nrows() as i64;

// Structural constraint: build the CSR index
let rel = Rel::from_edges(&follows, "src", "dst", n, n, true)?;

let g = ctx.graph(&people)?;
let id   = g.scan("id")?;
let city = g.scan("city")?;

// Pre-traversal constraint: start from NYC residents only
let nyc_pred = g.eq(city, g.const_str("NYC")?)?;
let nyc_ids  = g.filter(id, nyc_pred)?;

// Depth + direction constraints: 1–3 hops, forward only
let reached = g.var_expand(nyc_ids, &rel, TD_DIR_FWD, 1, 3, false)?;

// Execute the constrained traversal
let result = g.execute(reached)?;
// result columns: _start, _end, _depth

Tip: Because the graph is a lazy DAG, pre-traversal filters are composed into the same execution plan as the traversal itself. The optimizer can push predicates down, fuse operators, and eliminate dead code — so building the constraint in multiple steps has no performance penalty compared to a monolithic operation.

Complete Example

End-to-end social network scenario: load data, build a relationship index, find friends-of-friends reachable from users in a specific city, and read the results.

use teide::{Context, Rel};
use teide::ffi::TD_DIR_FWD;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ctx = Context::new()?;

    // Load a people table and an edge table
    let people  = ctx.read_csv("people.csv")?;    // id, name, city, age
    let follows = ctx.read_csv("follows.csv")?;   // src, dst

    let n = people.nrows() as i64;

    // Build a CSR relationship index from the edge table
    let rel = Rel::from_edges(&follows, "src", "dst", n, n, true)?;

    // Optionally persist for reuse
    rel.save("/tmp/follows_rel")?;

    // Create a graph rooted on the people table
    let g = ctx.graph(&people)?;

    // Pre-traversal constraint: only start from NYC residents
    let id   = g.scan("id")?;
    let city = g.scan("city")?;
    let pred = g.eq(city, g.const_str("NYC")?)?;
    let nyc_ids = g.filter(id, pred)?;

    // Depth-constrained BFS: friends-of-friends (exactly 2 hops)
    let fof = g.var_expand(nyc_ids, &rel, TD_DIR_FWD, 2, 2, false)?;

    // Execute the computation graph
    let result = g.execute(fof)?;

    // Read the output columns
    let nrows = result.nrows();
    println!("Found {} friend-of-friend pairs from NYC", nrows);
    for i in 0..nrows.min(10) {
        let start = result.get_i64(0, i as usize).unwrap();
        let end   = result.get_i64(1, i as usize).unwrap();
        let depth = result.get_i64(2, i as usize).unwrap();
        println!("  {} -> {} (depth {})", start, end, depth);
    }

    Ok(())
}

API Reference Summary

Rel methods

MethodDescription
Rel::build(table, fk_col, n_target, sort)Build CSR from a foreign key column
Rel::from_edges(table, src, dst, n_src, n_dst, sort)Build CSR from an edge table
Rel::load(dir)Load CSR from disk
Rel::mmap(dir)Memory-map CSR from disk
rel.save(dir)Save CSR to disk

Graph traversal methods

MethodOutput columnsDescription
g.expand(src, rel, dir)_src, _dst1-hop neighbor expansion
g.var_expand(start, rel, dir, min, max, path)_start, _end, _depthVariable-length BFS
g.shortest_path(src, dst, rel, max)_node, _depthBFS shortest path
g.wco_join(rels, n_vars)_v0, _v1, ...Leapfrog Triejoin

Multi-table methods

MethodReturnsDescription
g.add_table(table)u16Register a secondary table
g.scan_table(id, col)ColumnScan a column from a registered table