Component Deep Dive: src/page_handler.rs
PageHandler orchestrates page retrieval across caches and disk. It is the central service that operations use to obtain Pages in uncompressed form, optimizing for cache hits and minimizing lock contention.
Source Layout
src/page_handler.rs
pub struct PageHandler {
    page_io: Arc<PageIO>,
    uncompressed_page_cache: Arc<RwLock<PageCache<PageCacheEntryUncompressed>>>,
    compressed_page_cache: Arc<RwLock<PageCache<PageCacheEntryCompressed>>>,
    compressor: Arc<Compressor>,
}
impl PageHandler {
    fn fetch_from_upc(&self, id) -> Result<Arc<PageCacheEntryUncompressed>, &str>
    fn decompress_from_cpc(&self, id) -> Result<(), &str>
    fn fetch_from_fs(&self, id, path, offset) -> Result<(), &str>
    pub fn get_page(&self, page_meta: PageMetadata) -> Option<Arc<PageCacheEntryUncompressed>>
    pub fn get_pages(&self, page_metas: Vec<PageMetadata>) -> Vec<Arc<PageCacheEntryUncompressed>>
}
Component Diagram
┌─────────────────────────┐
│       PageHandler       │
│ ┌─────────────────────┐ │
│ │ Uncompressed Cache  │◄┼───┐
│ └─────────────────────┘ │   │
│ ┌─────────────────────┐ │   │
│ │ Compressed Cache    │◄┼───┤
│ └─────────────────────┘ │   │
│ ┌─────────────────────┐ │   │
│ │ Compressor          │◄┼───┤
│ └─────────────────────┘ │   │
│ ┌─────────────────────┐ │   │
│ │ PageIO              │◄┼───┘
│ └─────────────────────┘ │
└─────────────────────────┘
Inputs are PageMetadata (from TableMetaStore); outputs are Arc<PageCacheEntryUncompressed> handles suitable for direct reading/mutation.
Single Page Retrieval (get_page)
Input: PageMetadata { id, disk_path, offset }
1) UPC attempt:
- read-lock UPC
- if entry exists -> clone Arc & return
2) CPC attempt:
- read-lock CPC
- if entry exists -> clone Arc(blob)
- drop lock
- decompress(blob) via Compressor
- write-lock UPC, add page
- read-lock UPC again to return Arc
3) Disk fallback:
- PageIO::read_from_path(disk_path, offset) -> compressed entry
- write-lock CPC, insert entry
- decompress_from_cpc(id)
- final UPC read-lock to retrieve Arc
4) If no layer can supply the page, return None.
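The steps above can be sketched in Rust as follows. This is a minimal illustration, not the actual implementation: the real PageCache, PageIO and Compressor APIs are not shown in this document, so Handler, Compressed, Uncompressed, the HashMap-backed caches, the in-memory disk map and the pass-through decompress function are all hypothetical stand-ins.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Hypothetical stand-ins: the real PageCache, PageIO and Compressor APIs
// are not shown in this document.
pub struct Compressed(pub Vec<u8>);
pub struct Uncompressed(pub Vec<u8>);

pub struct PageMetadata {
    pub id: String,
    pub disk_path: String,
    pub offset: u64,
}

pub struct Handler {
    upc: RwLock<HashMap<String, Arc<Uncompressed>>>,
    cpc: RwLock<HashMap<String, Arc<Compressed>>>,
    // Stands in for PageIO: (path, offset) -> compressed bytes on "disk".
    disk: HashMap<(String, u64), Vec<u8>>,
}

// Placeholder codec: a real Compressor would inflate the blob here.
fn decompress(blob: &Compressed) -> Uncompressed {
    Uncompressed(blob.0.clone())
}

impl Handler {
    pub fn with_disk(disk: HashMap<(String, u64), Vec<u8>>) -> Self {
        Handler {
            upc: RwLock::new(HashMap::new()),
            cpc: RwLock::new(HashMap::new()),
            disk,
        }
    }

    fn fetch_from_upc(&self, id: &str) -> Result<Arc<Uncompressed>, &'static str> {
        // The read guard is dropped when this function returns.
        let upc = self.upc.read().map_err(|_| "UPC lock poisoned")?;
        upc.get(id).cloned().ok_or("page not in UPC")
    }

    fn decompress_from_cpc(&self, id: &str) -> Result<(), &'static str> {
        // The read guard is a temporary, released at the end of this
        // statement, so decompression below runs without any lock held.
        let blob = self
            .cpc
            .read()
            .map_err(|_| "CPC lock poisoned")?
            .get(id)
            .cloned()
            .ok_or("page not in CPC")?;
        let page = Arc::new(decompress(&blob));
        self.upc
            .write()
            .map_err(|_| "UPC lock poisoned")?
            .insert(id.to_string(), page);
        Ok(())
    }

    fn fetch_from_fs(&self, id: &str, path: &str, offset: u64) -> Result<(), &'static str> {
        // Read first, then take the CPC write lock only for the insert.
        let bytes = self
            .disk
            .get(&(path.to_string(), offset))
            .cloned()
            .ok_or("read failed")?;
        self.cpc
            .write()
            .map_err(|_| "CPC lock poisoned")?
            .insert(id.to_string(), Arc::new(Compressed(bytes)));
        Ok(())
    }

    pub fn get_page(&self, meta: PageMetadata) -> Option<Arc<Uncompressed>> {
        // 1) UPC attempt.
        if let Ok(page) = self.fetch_from_upc(&meta.id) {
            return Some(page);
        }
        // 2) CPC attempt; 3) on a miss, fall back to disk and retry.
        if self.decompress_from_cpc(&meta.id).is_err() {
            self.fetch_from_fs(&meta.id, &meta.disk_path, meta.offset).ok()?;
            self.decompress_from_cpc(&meta.id).ok()?;
        }
        // Final UPC read; 4) None if no layer could supply the page.
        self.fetch_from_upc(&meta.id).ok()
    }
}
```

In this sketch a second get_page call for the same id returns a clone of the same Arc, because the first call leaves the page in the UPC.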
ASCII Sequence Diagram
PageHandler          UPC              CPC          Compressor         PageIO
     │                │                │                │                │
     │──read lock────►│                │                │                │
     │◄──miss─────────│                │                │                │
     │──read lock─────────────────────►│                │                │
     │◄──miss──────────────────────────│                │                │
     │──read_from_path(path, off)───────────────────────────────────────►│
     │◄──compressed blob─────────────────────────────────────────────────│
     │──write lock, insert blob───────►│                │                │
     │──drop lock─────────────────────►│                │                │
     │──read lock─────────────────────►│                │                │
     │◄──hit, clone Arc(blob)──────────│                │                │
     │──drop lock─────────────────────►│                │                │
     │──decompress(blob)───────────────────────────────►│                │
     │◄──uncompressed page──────────────────────────────│                │
     │──write lock───►│                │                │                │
     │──add page─────►│                │                │                │
     │──drop lock────►│                │                │                │
     │──read lock────►│                │                │                │
     │◄──hit──────────│                │                │                │
     ▼
return Arc to caller
Batch Retrieval (get_pages)
The batch method reduces lock contention by taking each cache's read lock once per sweep rather than once per page, and it returns pages in the order they were requested.
Inputs: Vec<PageMetadata> (order preserved)
order := Vec<PageId>
meta_map := HashMap<PageId, PageMetadata>
result := Vec<Arc<PageCacheEntryUncompressed>>
already_pushed := HashSet<PageId>
1) UPC Sweep (read lock):
For each id in order:
if UPC has id:
push Arc into result
mark id in already_pushed
remove id from meta_map
2) CPC Sweep (read lock):
Collect (id, Arc<PageCacheEntryCompressed>) for remaining ids.
Remove hits from meta_map.
3) Decompress outside of locks:
For each (id, blob):
decompress blob -> UPC add (write lock per page)
4) Disk Fetch for leftovers:
For each meta in meta_map:
fetch_from_fs(id, path, offset) // populates CPC
decompress_from_cpc(id) // populates UPC
5) Final UPC read lock:
For each id in order not already pushed, read UPC and append to result.
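The five steps above can be sketched as follows. As before, this is an illustration under stated assumptions: BatchHandler, the HashMap-backed caches, the in-memory disk map and the pass-through decompress function are hypothetical stand-ins for types this document does not show. One deliberate difference from the outline: the sketch keeps UPC hits in a side map and assembles the result in a single final pass over order, so the output stays in input order even when a miss precedes a hit.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Hypothetical stand-ins for the real page and cache types.
type PageId = String;
type Blob = Vec<u8>; // compressed bytes
type Page = Vec<u8>; // uncompressed bytes

pub struct PageMetadata {
    pub id: PageId,
    pub disk_path: String,
    pub offset: u64,
}

pub struct BatchHandler {
    pub upc: RwLock<HashMap<PageId, Arc<Page>>>,
    pub cpc: RwLock<HashMap<PageId, Arc<Blob>>>,
    // Stands in for PageIO: (path, offset) -> compressed bytes.
    pub disk: HashMap<(String, u64), Blob>,
}

// Placeholder codec: a real Compressor would inflate the blob here.
fn decompress(blob: &Blob) -> Page {
    blob.clone()
}

impl BatchHandler {
    pub fn get_pages(&self, metas: Vec<PageMetadata>) -> Vec<Arc<Page>> {
        let order: Vec<PageId> = metas.iter().map(|m| m.id.clone()).collect();
        let mut meta_map: HashMap<PageId, PageMetadata> =
            metas.into_iter().map(|m| (m.id.clone(), m)).collect();
        let mut found: HashMap<PageId, Arc<Page>> = HashMap::new();

        // 1) UPC sweep: one read lock for the whole batch.
        {
            let upc = self.upc.read().unwrap();
            for id in &order {
                if let Some(page) = upc.get(id) {
                    found.insert(id.clone(), Arc::clone(page));
                    meta_map.remove(id);
                }
            }
        }

        // 2) CPC sweep: collect blobs for the ids that are still missing.
        let mut blobs: Vec<(PageId, Arc<Blob>)> = Vec::new();
        {
            let cpc = self.cpc.read().unwrap();
            for id in meta_map.keys() {
                if let Some(blob) = cpc.get(id) {
                    blobs.push((id.clone(), Arc::clone(blob)));
                }
            }
        }
        for (id, _) in &blobs {
            meta_map.remove(id);
        }

        // 3) Decompress outside any lock; one short UPC write lock per page.
        for (id, blob) in blobs {
            let page = Arc::new(decompress(&blob));
            self.upc.write().unwrap().insert(id, page);
        }

        // 4) Disk fetch for leftovers: populate the CPC, then the UPC.
        for (id, meta) in meta_map {
            if let Some(bytes) = self.disk.get(&(meta.disk_path, meta.offset)) {
                let blob = Arc::new(bytes.clone());
                self.cpc.write().unwrap().insert(id.clone(), Arc::clone(&blob));
                let page = Arc::new(decompress(&blob));
                self.upc.write().unwrap().insert(id, page);
            }
        }

        // 5) Final UPC read lock: assemble the result in input order.
        //    Pages that no layer could supply are skipped.
        let upc = self.upc.read().unwrap();
        let result: Vec<Arc<Page>> = order
            .iter()
            .filter_map(|id| found.get(id).or_else(|| upc.get(id)).cloned())
            .collect();
        result
    }
}
```

Because failed pages are skipped, the returned vector can be shorter than the input; a caller that needs positional correspondence would have to check ids or lengths.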
ASCII Flow
order: ["p1","p2","p3"]
┌─────────────┐ read-lock ┌─────────────────────┐
│  UPC store  │──────────►│ hits = {"p1"}       │
└─────────────┘           └─────────────────────┘
┌─────────────┐ read-lock ┌─────────────────────┐
│  CPC store  │──────────►│ hits = {"p2": blob} │
└─────────────┘           └─────────────────────┘
decompress blob("p2") → UPC write-lock add
remaining meta_map = {"p3": PageMetadata}
fetch_from_fs("p3") → CPC add
decompress_from_cpc("p3") → UPC add
Final UPC read-lock → collect ["p1","p2","p3"] in order
Concurrency Strategy
- Locks are scoped to the minimal region accessing shared state:
- UPC and CPC read locks are held only during lookups.
- Decompression and CPU-heavy work occurs outside locking regions.
- UPC writes happen one page at a time, keeping lock durations short.
- Arc clones allow the caller to hold onto pages without keeping caches locked.
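The lock-scoping pattern described above can be shown in isolation. The sketch below is a simplified illustration: Cache, lookup and checksum_without_lock are hypothetical names, and a plain HashMap stands in for the real page cache.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Hypothetical simplified cache: page id -> shared page bytes.
type Cache = RwLock<HashMap<String, Arc<Vec<u8>>>>;

/// Clone the Arc while the read lock is held. The guard is dropped when
/// this function returns, so the caller never holds the lock.
fn lookup(cache: &Cache, id: &str) -> Option<Arc<Vec<u8>>> {
    let guard = cache.read().ok()?;
    guard.get(id).cloned()
}

/// CPU-heavy work (a byte sum as a placeholder) runs with no lock held,
/// so writers are free to update the cache in the meantime.
fn checksum_without_lock(cache: &Cache, id: &str) -> Option<u64> {
    let page = lookup(cache, id)?;
    Some(page.iter().map(|b| u64::from(*b)).sum())
}
```

A page obtained through lookup stays readable even if the cache entry is removed afterwards, because the Arc keeps the underlying bytes alive until the last clone is dropped.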
Error Handling & TODOs
- fetch_from_fs and decompress_from_cpc return Result, but get_page treats errors as cache misses and tries subsequent layers. Fatal errors propagate as None.
- Future enhancements:
- Distinguish between “not found” and “I/O error” to inform callers.
- Batch decompression writes by staging multiple additions inside a single lock guard when contention becomes an issue.
- Integrate background prefetch and writeback once the scheduler is wired in.
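One possible shape for the first enhancement is an error enum that separates a miss from a real I/O failure. This is only a sketch: PageError, is_miss and classify are hypothetical names that do not exist in the current code, which returns &str errors, and treating a missing file as "not found" is an assumption made here for illustration.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical error type distinguishing a miss from an I/O failure.
#[derive(Debug)]
pub enum PageError {
    NotFound { id: String },
    Io { id: String, source: io::Error },
}

impl fmt::Display for PageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PageError::NotFound { id } => write!(f, "page {} not found", id),
            PageError::Io { id, source } => {
                write!(f, "I/O error reading page {}: {}", id, source)
            }
        }
    }
}

impl Error for PageError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            PageError::NotFound { .. } => None,
            PageError::Io { source, .. } => Some(source),
        }
    }
}

impl PageError {
    /// Only a miss should fall through to the next layer; other errors
    /// would be reported to the caller.
    pub fn is_miss(&self) -> bool {
        matches!(self, PageError::NotFound { .. })
    }
}

/// Classify a raw I/O error for a given page id.
/// Assumption: a missing file counts as "not found"; everything else is fatal.
pub fn classify(id: &str, err: io::Error) -> PageError {
    if err.kind() == io::ErrorKind::NotFound {
        PageError::NotFound { id: id.to_string() }
    } else {
        PageError::Io { id: id.to_string(), source: err }
    }
}
```

With a type like this, get_page could return Result<Arc<PageCacheEntryUncompressed>, PageError> instead of Option, falling through to the next layer only when is_miss() is true.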