Component Deep Dive: src/page.rs

Pages are the fundamental storage blocks manipulated by the engine. Each page groups a sequence of columnar Entry values and acts as the unit of caching, compression, and IO.

Source Snapshot

src/page.rs
#[derive(Serialize, Deserialize, Clone)]
pub struct Page { page_metadata: String, entries: Vec<Entry> }

impl Page {
    pub fn new() -> Self { … }
    pub fn add_entry(&mut self, entry: Entry) { … }
}

Conceptual Model

┌─────────────────────────────── Page ───────────────────────────────┐
│ page_metadata : String                                            │
│ entries       : Vec<Entry>                                        │
│                  ┌──────────────┬──────────────┬──────────────┐    │
│                  │ Entry[0]     │ Entry[1]     │ Entry[2]     │ …  │
│                  └──────────────┴──────────────┴──────────────┘    │
└────────────────────────────────────────────────────────────────────┘
  • page_metadata: Placeholder for page-level attributes (compression stats, min/max values, etc.). Currently an empty string; earmarked for future optimizations.
  • entries: Dense, ordered list of Entry instances. The index within the vector corresponds to the logical row offset within the page.

Responsibilities

  1. Serialization Unit
    Pages are Serde-serializable, enabling bincode to convert them into byte streams. These byte streams feed directly into lz4_flex compression via the Compressor.

  2. Cache Atom
    Both the Uncompressed Page Cache (UPC) and the Compressed Page Cache (CPC) store entire Pages: the UPC as structs, the CPC as compressed blobs. All cache operations address pages by ID, never individual entries.

  3. Mutation Surface
    Page::add_entry pushes new data onto the page. Updates take a clone of the Page (obtained through an Arc handle from the caches), mutate the clone's entries vector, and write the result back into the UPC.
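
The update path in item 3 can be sketched as follows. This is a minimal illustration, not the engine's code: it assumes the UPC hands out Arc<Page> handles keyed by a u64 page ID, and the Upc type, its get/put methods, and append_entry are hypothetical names. Entry is reduced to a single data field and the Serde derives are left out so the sketch builds with the standard library alone.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Simplified stand-ins for the real types (assumption: one field per Entry).
#[derive(Clone, Debug, PartialEq)]
pub struct Entry {
    pub data: String,
}

#[derive(Clone, Debug, Default, PartialEq)]
pub struct Page {
    pub page_metadata: String,
    pub entries: Vec<Entry>,
}

impl Page {
    pub fn add_entry(&mut self, entry: Entry) {
        self.entries.push(entry);
    }
}

// Hypothetical uncompressed page cache: whole pages keyed by page ID.
#[derive(Default)]
pub struct Upc {
    pages: HashMap<u64, Arc<Page>>,
}

impl Upc {
    // Returns a shared handle; cloning the Arc does not copy the page.
    pub fn get(&self, id: u64) -> Option<Arc<Page>> {
        self.pages.get(&id).cloned()
    }

    // Replaces whatever page was stored under `id`.
    pub fn put(&mut self, id: u64, page: Page) {
        self.pages.insert(id, Arc::new(page));
    }
}

// Copy-and-write-back: copy the cached page, mutate the private copy,
// then replace the cache slot. Returns false if the page is not cached.
pub fn append_entry(upc: &mut Upc, id: u64, data: &str) -> bool {
    let shared = match upc.get(id) {
        Some(page) => page,
        None => return false,
    };
    let mut page = (*shared).clone(); // deep copy of the Page itself
    page.add_entry(Entry { data: data.to_string() });
    upc.put(id, page);
    true
}

fn main() {
    let mut upc = Upc::default();
    upc.put(1, Page::default());

    let before = upc.get(1).unwrap(); // a reader's handle to the old version
    append_entry(&mut upc, 1, "21.5");

    println!("reader snapshot: {} entries", before.entries.len());
    println!("cached page:     {} entries", upc.get(1).unwrap().entries.len());
}
```

Because the write-back replaces the cache slot instead of mutating shared state, a reader that still holds the old Arc keeps a consistent snapshot. The cost is a full copy of the entries vector on every update.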

Lifecycle

Page::new()
  │
  ├─ Entry appended via add_entry()
  │
  ├─ stored in UPC (hot, uncompressed)
  │
  ├─ on eviction → Compressor::compress(Page) → CPC (cold, compressed)
  │
  └─ flushed to disk via PageIO::write_to_path (64B metadata + blob)
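
The first stages of this lifecycle, up to eviction into the CPC, might look like the sketch below. The Tiers struct, the u64 page ID, and placeholder_compress are assumptions made for illustration. In particular, placeholder_compress only joins entry data with newlines; it stands in for the real Compressor (bincode plus lz4_flex) and performs no compression. The disk flush is left out because the layout of the 64B metadata block is not described here.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the real types (assumption: one field per Entry).
#[derive(Clone, Debug, Default)]
pub struct Entry {
    pub data: String,
}

#[derive(Clone, Debug, Default)]
pub struct Page {
    pub page_metadata: String,
    pub entries: Vec<Entry>,
}

impl Page {
    // Assumed behaviour: an empty page with empty metadata.
    pub fn new() -> Self {
        Page::default()
    }

    pub fn add_entry(&mut self, entry: Entry) {
        self.entries.push(entry);
    }
}

// Hypothetical two-tier cache: hot structs in `upc`, cold blobs in `cpc`.
#[derive(Default)]
pub struct Tiers {
    pub upc: HashMap<u64, Page>,
    pub cpc: HashMap<u64, Vec<u8>>,
}

impl Tiers {
    // Moves a page from the hot tier to the cold tier.
    // `compress` stands in for Compressor::compress.
    // Returns false if the page was not in the hot tier.
    pub fn evict<F>(&mut self, id: u64, compress: F) -> bool
    where
        F: Fn(&Page) -> Vec<u8>,
    {
        match self.upc.remove(&id) {
            Some(page) => {
                self.cpc.insert(id, compress(&page));
                true
            }
            None => false,
        }
    }
}

// Placeholder encoder: newline-joined entry data, no real compression.
pub fn placeholder_compress(page: &Page) -> Vec<u8> {
    let rows: Vec<&str> = page.entries.iter().map(|e| e.data.as_str()).collect();
    rows.join("\n").into_bytes()
}

fn main() {
    let mut page = Page::new();
    page.add_entry(Entry { data: "21.5".to_string() });
    page.add_entry(Entry { data: "22.0".to_string() });

    let mut tiers = Tiers::default();
    tiers.upc.insert(7, page); // stored hot
    tiers.evict(7, placeholder_compress); // moved cold

    println!("hot pages: {}, cold blobs: {}", tiers.upc.len(), tiers.cpc.len());
}
```

Passing the encoder as a parameter keeps the tier logic independent of the serialization format, so the placeholder can be swapped for a real compressor without touching evict.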

ASCII Example

Logical rows (column "temperature")

Index   Entry.data   Stored inside Page.entries
-----   ----------   ---------------------------------------
0       "21.5"       entries[0] = Entry { data: "21.5", … }
1       "22.0"       entries[1] = Entry { data: "22.0", … }
2       "22.6"       entries[2] = Entry { data: "22.6", … }
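
The table above can be reproduced with a few lines of Rust. In this sketch Entry carries only the data field (the real struct has more, elided as … above), Page::new is assumed to return an empty page, and Page::row is a hypothetical helper added to make the offset lookup explicit.

```rust
// Simplified stand-ins for the real types (assumption: one field per Entry).
#[derive(Clone, Debug, PartialEq)]
pub struct Entry {
    pub data: String,
}

#[derive(Clone, Debug, Default)]
pub struct Page {
    pub page_metadata: String,
    pub entries: Vec<Entry>,
}

impl Page {
    // Assumed behaviour: an empty page with empty metadata.
    pub fn new() -> Self {
        Page::default()
    }

    pub fn add_entry(&mut self, entry: Entry) {
        self.entries.push(entry);
    }

    // Hypothetical helper: the vector index is the logical row offset.
    pub fn row(&self, offset: usize) -> Option<&Entry> {
        self.entries.get(offset)
    }
}

// Builds the "temperature" page from the table, one entry per row.
pub fn temperature_page() -> Page {
    let mut page = Page::new();
    for reading in ["21.5", "22.0", "22.6"] {
        page.add_entry(Entry { data: reading.to_string() });
    }
    page
}

fn main() {
    let page = temperature_page();
    for (offset, entry) in page.entries.iter().enumerate() {
        println!("row {} -> {}", offset, entry.data);
    }
}
```

Since entries are only ever appended, insertion order and row offset coincide, and a lookup past the end of the page returns None instead of panicking.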

Interaction Matrix

┌─────────────────┬─────────────────────────────────────────────────┐
│ Component       │ Interaction                                     │
├─────────────────┼─────────────────────────────────────────────────┤
│ ops_handler     │ Clones & mutates Pages for upsert/update ops    │
│ PageCache       │ Stores Pages (UPC) & compressed blobs (CPC)     │
│ Compressor      │ (De)serializes Page -> Vec<u8> for LZ4          │
│ PageIO          │ Reads/writes compressed Page blobs on disk      │
│ TableMetaStore  │ Tracks where a Page lives (via PageMetadata)    │
└─────────────────┴─────────────────────────────────────────────────┘

Future Work Hooks

  • Implement in-place split/merge logic once page capacity is defined.
  • Populate page_metadata with min/max stats to accelerate range pruning.
  • Introduce dirty tracking to avoid blind re-serialization when no changes occurred.