Table Functions
Load CSV files, splayed tables, and partitioned tables directly in SQL queries.
read_csv()
Load a CSV file into memory by path:
-- Query a CSV file directly
SELECT * FROM read_csv('trades.csv');
-- Materialize the CSV as a named in-memory table for reuse in later queries
CREATE TABLE trades AS SELECT * FROM read_csv('/data/trades.csv');
-- Query the loaded table
SELECT symbol, SUM(qty) AS total
FROM trades
GROUP BY symbol
ORDER BY total DESC;
Auto type inference: Teide samples the CSV to determine column types automatically. Supported inferred types: INTEGER, BIGINT, REAL, BOOLEAN, VARCHAR, DATE, TIMESTAMP.
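To make the idea of sample-based inference concrete, here is a minimal Rust sketch. It is not Teide's actual algorithm: the names ColType, classify, unify, and infer_column are hypothetical, DATE and TIMESTAMP are left out for brevity, and the widening rules (INTEGER to BIGINT to REAL, anything else to VARCHAR) are an assumption about how such inference commonly works.

```rust
/// Column types named in the text above (DATE and TIMESTAMP omitted for brevity).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColType {
    Boolean,
    Integer, // value fits in i32
    BigInt,  // value fits in i64
    Real,
    Varchar,
}

/// Classify a single raw CSV field by trying the narrowest type first.
fn classify(field: &str) -> ColType {
    let f = field.trim();
    if f.eq_ignore_ascii_case("true") || f.eq_ignore_ascii_case("false") {
        ColType::Boolean
    } else if f.parse::<i32>().is_ok() {
        ColType::Integer
    } else if f.parse::<i64>().is_ok() {
        ColType::BigInt
    } else if f.parse::<f64>().is_ok() {
        ColType::Real
    } else {
        ColType::Varchar
    }
}

/// Widen two types to the narrowest type that can hold both (assumed rules).
fn unify(a: ColType, b: ColType) -> ColType {
    use ColType::*;
    match (a, b) {
        (x, y) if x == y => x,
        (Integer, BigInt) | (BigInt, Integer) => BigInt,
        (Integer | BigInt, Real) | (Real, Integer | BigInt) => Real,
        _ => Varchar,
    }
}

/// Infer a column type from a sample of its values.
/// Empty fields are skipped; a column with no usable values falls back to VARCHAR.
fn infer_column(sample: &[&str]) -> ColType {
    sample
        .iter()
        .filter(|f| !f.trim().is_empty())
        .map(|f| classify(f))
        .reduce(unify)
        .unwrap_or(ColType::Varchar)
}
```

Under these assumed rules, a sample of "1" and "5000000000" widens to BIGINT, and a single non-numeric value such as "AAPL" turns the whole column into VARCHAR. When a sample could mislead inference, passing explicit types through read_csv_opts avoids the guesswork.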
The Rust API provides finer control with read_csv_opts:
let ctx = Context::new()?;
// Tab-separated, with header row, auto-infer types
let table = ctx.read_csv_opts("data.tsv", '\t', true, None)?;
// Pipe-separated, no header, explicit column types
let types = &[teide::types::SYM, teide::types::I64, teide::types::F64];
let table = ctx.read_csv_opts("data.psv", '|', false, Some(types))?;
read_splayed()
Open a splayed table from disk using zero-copy memory mapping. A splayed table stores each column as a separate binary file, enabling fast columnar access without loading the entire dataset into memory.
-- Open a splayed table directory
SELECT * FROM read_splayed('/data/tables/trades');
-- With an explicit symbol file
SELECT * FROM read_splayed('/data/tables/trades', '/data/tables/sym');
-- Create an in-memory table from a splayed source
CREATE TABLE trades AS SELECT * FROM read_splayed('/data/tables/trades');
On-disk layout
/data/tables/trades/
  .d       -- schema: column name symbol IDs
  price    -- column file (binary f64 vector)
  qty      -- column file (binary i64 vector)
  symbol   -- column file (binary sym vector)
  ts       -- column file (binary timestamp vector)
The .d file lists the column names as symbol IDs. Each column file is a raw binary vector: a 32-byte block header followed by the data. String columns use symbol encoding: each value is stored as an integer ID that references the global symbol table.
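The layout described above can be illustrated with a small Rust sketch that works on bytes already in memory. Several details are assumptions, not part of the documented format: the header is treated as 32 opaque bytes, values are assumed to be little-endian and tightly packed, and the SymbolTable type with its u32 IDs is a toy stand-in for the real global symbol table.

```rust
use std::collections::HashMap;

/// Header size stated in the text. Its internal fields are not described on
/// this page, so the sketch skips it without interpreting it.
const HEADER_LEN: usize = 32;

/// Decode the payload of an f64 column file held in memory.
/// ASSUMPTION: values are little-endian and tightly packed after the header.
fn decode_f64_column(file_bytes: &[u8]) -> Option<Vec<f64>> {
    // `get` returns None when the buffer is shorter than the header.
    let payload = file_bytes.get(HEADER_LEN..)?;
    if payload.len() % 8 != 0 {
        return None; // truncated or misaligned payload
    }
    let mut values = Vec::with_capacity(payload.len() / 8);
    for chunk in payload.chunks_exact(8) {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        values.push(f64::from_le_bytes(buf));
    }
    Some(values)
}

/// Toy symbol table: each distinct string gets a dense integer ID.
/// ASSUMPTION: the ID width (u32) is illustrative only.
#[derive(Default)]
struct SymbolTable {
    ids: HashMap<String, u32>,
    names: Vec<String>,
}

impl SymbolTable {
    /// Return the ID for `name`, assigning the next free ID on first sight.
    fn intern(&mut self, name: &str) -> u32 {
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        let id = self.names.len() as u32;
        self.names.push(name.to_string());
        self.ids.insert(name.to_string(), id);
        id
    }

    /// Map an ID back to its string, if the ID is known.
    fn resolve(&self, id: u32) -> Option<&str> {
        self.names.get(id as usize).map(String::as_str)
    }
}
```

The point of symbol encoding is that a string column on disk holds only fixed-width IDs, so it can be scanned and compared like any other numeric vector; the strings themselves live once in the symbol table.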
Zero-copy: Column files are memory-mapped (mmap), so the OS pages data in on demand. This means you can query tables larger than available RAM.
See Storage & Symbols for the full on-disk format specification.
read_parted()
Open a date-partitioned table. Partitioned tables split data across directories named by date (or integer key), with a shared symbol table at the database root.
-- Open a partitioned table
SELECT * FROM read_parted('/data/marketdb', 'trades');
-- The virtual partition column 'date' is queryable
SELECT date, symbol, SUM(qty) AS total
FROM read_parted('/data/marketdb', 'trades')
GROUP BY date, symbol
ORDER BY date, total DESC;
On-disk layout
/data/marketdb/
  sym              -- shared symbol table
  2024.01.15/
    trades/
      .d           -- schema
      price        -- column file
      qty          -- column file
      symbol       -- column file
  2024.01.16/
    trades/
      .d
      price
      qty
      symbol
  2024.01.17/
    trades/
      ...
Teide automatically discovers all partition directories, loads the shared symbol table from db_root/sym, and builds a virtual partition-key column (the date). Partition directories must be named as YYYY.MM.DD dates or integer keys.
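The naming rule for partition directories can be sketched in Rust as a pure function over directory names. This is an illustration of the rule stated above, not Teide's discovery code: PartitionKey, parse_partition_name, and discover_partitions are hypothetical names, the date check is deliberately coarse, and a real implementation would read the names from the database root (for example with std::fs::read_dir).

```rust
/// A partition key as described above: a calendar date or a plain integer.
/// Deriving Ord on the Date fields in this order sorts dates chronologically.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum PartitionKey {
    Date { year: u16, month: u8, day: u8 },
    Int(i64),
}

fn all_digits(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit())
}

/// Parse a directory name into a partition key, or None if the name is not
/// a partition (for example the shared `sym` file).
fn parse_partition_name(name: &str) -> Option<PartitionKey> {
    let parts: Vec<&str> = name.split('.').collect();
    match parts.as_slice() {
        // YYYY.MM.DD
        [y, m, d] if y.len() == 4 && m.len() == 2 && d.len() == 2 => {
            if !(all_digits(y) && all_digits(m) && all_digits(d)) {
                return None;
            }
            let year: u16 = y.parse().ok()?;
            let month: u8 = m.parse().ok()?;
            let day: u8 = d.parse().ok()?;
            // Coarse range check only; days-per-month is not validated here.
            if (1..=12).contains(&month) && (1..=31).contains(&day) {
                Some(PartitionKey::Date { year, month, day })
            } else {
                None
            }
        }
        // Integer key
        [single] => single.parse::<i64>().ok().map(PartitionKey::Int),
        _ => None,
    }
}

/// Keep only names that are valid partition keys, sorted by key.
fn discover_partitions<'a>(
    names: impl IntoIterator<Item = &'a str>,
) -> Vec<(PartitionKey, &'a str)> {
    let mut found: Vec<(PartitionKey, &'a str)> = names
        .into_iter()
        .filter_map(|n| parse_partition_name(n).map(|k| (k, n)))
        .collect();
    found.sort();
    found
}
```

Given the names 2024.01.16, sym, and 2024.01.15, this sketch skips sym and returns the two date partitions in chronological order, which is the order a virtual date column would naturally follow.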
Caching: Partitioned tables are cached after the first open. Subsequent calls to read_parted() with the same path reuse the cached table instead of reopening it.
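The caching behavior can be pictured as a map from path to an already opened table. The Rust sketch below is an assumption about the general shape, not Teide's implementation: TableCache and get_or_open are hypothetical names, the key is taken to be the (db_root, table) pair, and eviction or invalidation when files change on disk is not modeled.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical cache of opened tables, keyed by (db_root, table name).
struct TableCache<T> {
    entries: HashMap<(String, String), Arc<T>>,
}

impl<T> TableCache<T> {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Return the cached table for this key, running `open` only on a miss.
    fn get_or_open<F>(&mut self, db_root: &str, table: &str, open: F) -> Arc<T>
    where
        F: FnOnce() -> T,
    {
        let key = (db_root.to_string(), table.to_string());
        // `or_insert_with` calls the closure only if the key is absent.
        Arc::clone(self.entries.entry(key).or_insert_with(|| Arc::new(open())))
    }
}
```

With a cache of this shape, the expensive work (directory discovery, loading the symbol table, mapping column files) happens once per path, and later callers share the same handle.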
See Storage & Symbols for details on the partition discovery algorithm, MAPCOMMON virtual columns, and the sym file format.