SQL REFERENCE

Table Functions

Load CSV files, splayed tables, and partitioned tables directly in SQL queries.

read_csv()

Load a CSV file into memory by path:

-- Query a CSV file directly
SELECT * FROM read_csv('trades.csv');

-- Create a named in-memory table from CSV (reusable in later queries)
CREATE TABLE trades AS SELECT * FROM read_csv('/data/trades.csv');

-- Query the loaded table
SELECT symbol, SUM(qty) AS total
FROM trades
GROUP BY symbol
ORDER BY total DESC;

Auto type inference: Teide samples the CSV to determine column types automatically. Supported inferred types: INTEGER, BIGINT, REAL, BOOLEAN, VARCHAR, DATE, TIMESTAMP.
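To make the sampling idea concrete, here is a minimal Rust sketch of how per-column inference over a sample of values could work. It is an illustration only, not Teide's implementation: the Inferred enum, the classify, unify, and infer_column helpers, the widening rules (INTEGER to BIGINT to REAL, with VARCHAR as the fallback), and the ISO-style date and timestamp formats are all assumptions.

```rust
/// The inferred types listed above (hypothetical enum, not Teide's own type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Inferred {
    Boolean,
    Integer,
    BigInt,
    Real,
    Date,
    Timestamp,
    Varchar,
}

/// True for `YYYY-MM-DD` (shape check only; assumed date format).
fn is_date(b: &[u8]) -> bool {
    b.len() == 10
        && b.iter().enumerate().all(|(i, c)| match i {
            4 | 7 => *c == b'-',
            _ => c.is_ascii_digit(),
        })
}

/// True for `YYYY-MM-DD HH:MM:SS` or `YYYY-MM-DDTHH:MM:SS`, optionally with a suffix.
fn is_timestamp(b: &[u8]) -> bool {
    b.len() >= 19
        && is_date(&b[..10])
        && (b[10] == b' ' || b[10] == b'T')
        && b[11..19].iter().enumerate().all(|(i, c)| match i {
            2 | 5 => *c == b':',
            _ => c.is_ascii_digit(),
        })
}

/// Classify one non-empty field as the narrowest type that can hold it.
fn classify(field: &str) -> Inferred {
    if field.eq_ignore_ascii_case("true") || field.eq_ignore_ascii_case("false") {
        Inferred::Boolean
    } else if field.parse::<i32>().is_ok() {
        Inferred::Integer
    } else if field.parse::<i64>().is_ok() {
        Inferred::BigInt
    } else if field.parse::<f64>().is_ok() {
        Inferred::Real
    } else if is_timestamp(field.as_bytes()) {
        Inferred::Timestamp
    } else if is_date(field.as_bytes()) {
        Inferred::Date
    } else {
        Inferred::Varchar
    }
}

/// Widen two observations to a type that can hold both (assumed rules).
fn unify(a: Inferred, b: Inferred) -> Inferred {
    use Inferred::*;
    match (a, b) {
        (x, y) if x == y => x,
        (Integer, BigInt) | (BigInt, Integer) => BigInt,
        (Integer | BigInt, Real) | (Real, Integer | BigInt) => Real,
        (Date, Timestamp) | (Timestamp, Date) => Timestamp,
        _ => Varchar,
    }
}

/// Infer a column type from sampled values; empty fields are skipped as nulls.
pub fn infer_column(sample: &[&str]) -> Inferred {
    sample
        .iter()
        .map(|f| f.trim())
        .filter(|f| !f.is_empty())
        .map(classify)
        .reduce(unify)
        .unwrap_or(Inferred::Varchar)
}
```

Under these assumed rules, infer_column(&["1", "3000000000"]) yields Inferred::BigInt, because the second value does not fit in 32 bits. A single non-numeric value in the sample widens the whole column to VARCHAR.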

The Rust API provides finer control with read_csv_opts:

let ctx = Context::new()?;

// Tab-separated, with header row, auto-infer types
let table = ctx.read_csv_opts("data.tsv", '\t', true, None)?;

// Pipe-separated, no header, explicit column types
let types = &[teide::types::SYM, teide::types::I64, teide::types::F64];
let table = ctx.read_csv_opts("data.psv", '|', false, Some(types))?;

read_splayed()

Open a splayed table from disk using zero-copy memory mapping. A splayed table stores each column as a separate binary file, enabling fast columnar access without loading the entire dataset into memory.

-- Open a splayed table directory
SELECT * FROM read_splayed('/data/tables/trades');

-- With an explicit symbol file
SELECT * FROM read_splayed('/data/tables/trades', '/data/tables/sym');

-- Create an in-memory table from a splayed source
CREATE TABLE trades AS SELECT * FROM read_splayed('/data/tables/trades');

On-disk layout

/data/tables/trades/
  .d           -- schema: column name symbol IDs
  price        -- column file (binary f64 vector)
  qty          -- column file (binary i64 vector)
  symbol       -- column file (binary sym vector)
  ts           -- column file (binary timestamp vector)

The .d file lists the column names as symbol IDs. Each column file is a raw binary vector: a 32-byte block header followed by the data. String columns are symbol-encoded, storing integer IDs that reference the global symbol table.
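As a rough illustration of the "header, then data" layout, the Rust sketch below decodes the payload of an f64 column file that has already been read into a byte slice. It assumes little-endian values and treats the 32-byte header as opaque, since the header fields are not described here; decode_f64_column and HEADER_LEN are hypothetical names. Unlike Teide's memory-mapped access, this sketch copies the values into a Vec.

```rust
/// Size of the block header stated above; its contents are treated as opaque.
const HEADER_LEN: usize = 32;

/// Decode the f64 payload that follows the header.
/// Assumption: values are stored little-endian and tightly packed.
/// Returns None if the file is shorter than the header or the payload
/// length is not a multiple of 8 bytes.
pub fn decode_f64_column(file: &[u8]) -> Option<Vec<f64>> {
    let payload = file.get(HEADER_LEN..)?;
    if payload.len() % 8 != 0 {
        return None;
    }
    let values = payload
        .chunks_exact(8)
        .map(|chunk| {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(chunk);
            f64::from_le_bytes(buf)
        })
        .collect();
    Some(values)
}
```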

Zero-copy: Column files are memory-mapped (mmap), so the OS pages data in on demand. This means you can query tables larger than available RAM.

See Storage & Symbols for the full on-disk format specification.

read_parted()

Open a date-partitioned table. Partitioned tables split data across directories named by date (or integer key), with a shared symbol table at the database root.

-- Open a partitioned table
SELECT * FROM read_parted('/data/marketdb', 'trades');

-- The virtual partition column 'date' is queryable
SELECT date, symbol, SUM(qty) AS total
FROM read_parted('/data/marketdb', 'trades')
GROUP BY date, symbol
ORDER BY date, total DESC;

On-disk layout

/data/marketdb/
  sym                          -- shared symbol table
  2024.01.15/
    trades/
      .d                       -- schema
      price                    -- column file
      qty                      -- column file
      symbol                   -- column file
  2024.01.16/
    trades/
      .d
      price
      qty
      symbol
  2024.01.17/
    trades/
      ...

Teide automatically discovers all partition directories under the database root (the first argument to read_parted()), loads the shared symbol table from db_root/sym, and builds a virtual partition-key column (the date). Partition directories must be named as YYYY.MM.DD dates or integer keys.
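The naming rule can be sketched in Rust as a filter over directory names. This is a minimal illustration of the rule as stated, not Teide's discovery algorithm: PartitionKey, parse_partition_key, and discover_partitions are hypothetical names, and the restriction to non-negative integer keys and the loose day-of-month check (1 to 31) are assumptions.

```rust
/// A parsed partition directory name (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PartitionKey {
    Int(i64),
    Date { year: u16, month: u8, day: u8 },
}

/// True if `s` is exactly `len` ASCII digits.
fn all_digits(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| b.is_ascii_digit())
}

/// Parse a directory name as `YYYY.MM.DD` or as a non-negative integer key.
pub fn parse_partition_key(name: &str) -> Option<PartitionKey> {
    let parts: Vec<&str> = name.split('.').collect();
    match parts.as_slice() {
        [y, m, d] if all_digits(y, 4) && all_digits(m, 2) && all_digits(d, 2) => {
            let year: u16 = y.parse().ok()?;
            let month: u8 = m.parse().ok()?;
            let day: u8 = d.parse().ok()?;
            if (1..=12).contains(&month) && (1..=31).contains(&day) {
                Some(PartitionKey::Date { year, month, day })
            } else {
                None
            }
        }
        [n] if !n.is_empty() && n.bytes().all(|b| b.is_ascii_digit()) => {
            n.parse().ok().map(PartitionKey::Int)
        }
        _ => None,
    }
}

/// Keep only names that look like partitions, sorted by key.
/// Entries such as `sym` are skipped because they do not parse.
pub fn discover_partitions<'a>(
    names: impl IntoIterator<Item = &'a str>,
) -> Vec<(PartitionKey, &'a str)> {
    let mut found: Vec<_> = names
        .into_iter()
        .filter_map(|n| parse_partition_key(n).map(|k| (k, n)))
        .collect();
    found.sort();
    found
}
```

In practice the names would come from listing the database root, for example with std::fs::read_dir; the sketch takes plain strings so the rule can be checked in isolation.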

Caching: Partitioned tables are cached after the first open. Subsequent calls to read_parted() with the same path reuse the cached table instead of reopening it.
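One way such a cache could be structured is a map from the call arguments to a shared handle, as in the Rust sketch below. This is an assumption-laden illustration rather than Teide's code: OpenCache and get_or_open are hypothetical names, the cache key (database root plus table name) is a guess, and the table type is left generic.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

/// Cache of opened tables keyed by (db_root, table name). Hypothetical type.
pub struct OpenCache<T> {
    entries: Mutex<HashMap<(PathBuf, String), Arc<T>>>,
}

impl<T> OpenCache<T> {
    pub fn new() -> Self {
        Self { entries: Mutex::new(HashMap::new()) }
    }

    /// Return the cached table for this key, or run `open` once and cache it.
    /// The lock is held while `open` runs, so concurrent callers wait
    /// instead of opening the same table twice.
    pub fn get_or_open<E>(
        &self,
        db_root: &Path,
        table: &str,
        open: impl FnOnce() -> Result<T, E>,
    ) -> Result<Arc<T>, E> {
        let key = (db_root.to_path_buf(), table.to_string());
        let mut entries = self.entries.lock().unwrap();
        if let Some(hit) = entries.get(&key) {
            return Ok(Arc::clone(hit));
        }
        // Errors from `open` are returned to the caller and not cached.
        let opened = Arc::new(open()?);
        entries.insert(key, Arc::clone(&opened));
        Ok(opened)
    }
}
```

With this shape, a second call with the same key never invokes the loader; it only clones an Arc, which is why repeat opens are cheap.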

See Storage & Symbols for details on the partition discovery algorithm, MAPCOMMON virtual columns, and the sym file format.