QiCapture
Overview
QiCapture is the data intake and ingestion pipeline of QiOS. It handles messy life data (exports, receipts, documents) and turns it into clean, structured, and searchable materials.
Responsibilities
- Clean and parse raw intake.
- Index and extract metadata from documents (via OCR or scripts).
- Chunk and embed clean text for semantic search.
- Form entity relationships and graphs from text.
- Synchronize indices and manage backups.
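As an illustration of the chunking responsibility, the following is a minimal sketch and not QiCapture's actual implementation. It assumes fixed-size character windows with overlap. The `Chunk` record, the `chunk_text` helper, and the default sizes are hypothetical names and values chosen for this example. Each chunk keeps its source and character offsets so a search hit can be traced back to the original document.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical manifest record: one chunk plus its source mapping."""
    chunk_id: str
    source: str
    start: int  # inclusive character offset into the clean text
    end: int    # exclusive character offset into the clean text
    text: str


def chunk_text(text: str, source: str, size: int = 200, overlap: int = 40) -> list[Chunk]:
    """Split clean text into overlapping character windows."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: list[Chunk] = []
    for start in range(0, len(text), step):
        end = min(start + size, len(text))
        body = text[start:end]
        # Deterministic ID: re-running on unchanged input yields the same ID.
        digest = hashlib.sha256(f"{source}:{start}:{body}".encode("utf-8")).hexdigest()[:16]
        chunks.append(Chunk(digest, source, start, end, body))
        if end == len(text):
            break  # the last window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for c in chunk_text("x" * 500, source="receipts/example.txt"):
        print(c.chunk_id, c.source, c.start, c.end)
```

Deterministic IDs are one way to keep re-ingestion idempotent. A real pipeline would more likely split on sentence or token boundaries than on raw characters.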
Flows
Folder Structure & Table of Contents
Intake Pipeline Stages
- 10 Ingestion: Raw intake from various sources (Paperless, drive imports, manual uploads, email exports, web clips).
- 20 Extraction: OCR processing, text extraction, markdown exports, and transcripts.
- 30 Chunking: Processing clean text into chunks, manifest records, and source mapping.
- 40 Embeddings: Embedding models, Qdrant setup, collections, and vector quality tests.
- 50 Graphs: Entity extraction, relationship graphs, and entity maps.
- 60 Retrieval: Semantic search profiles, query logs, retrieval tests, and context packaging.
- 70 Memory: Fact storage, session summaries, and user preference maps.
- 80 Sync & Backups: Snapshot logs, restore points, and synchronization manifests.
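The numbered stages above could be scaffolded as folders along the following lines. This is a sketch under assumptions: the document does not specify the on-disk naming, so the `NN-slug` folder format and the `stage_dirname` and `scaffold` helpers are hypothetical.

```python
from pathlib import Path

# Stage numbers and names as listed above; gaps of ten leave room for new stages.
STAGES = {
    10: "Ingestion",
    20: "Extraction",
    30: "Chunking",
    40: "Embeddings",
    50: "Graphs",
    60: "Retrieval",
    70: "Memory",
    80: "Sync & Backups",
}


def stage_dirname(number: int) -> str:
    """Build a folder name such as '30-chunking' (naming scheme is an assumption)."""
    slug = STAGES[number].lower().replace("&", "and").replace(" ", "-")
    return f"{number}-{slug}"


def scaffold(root: Path) -> list[Path]:
    """Create one folder per stage under root, in pipeline order."""
    paths = []
    for number in sorted(STAGES):
        path = root / stage_dirname(number)
        path.mkdir(parents=True, exist_ok=True)  # safe to re-run
        paths.append(path)
    return paths


if __name__ == "__main__":
    for p in scaffold(Path("QiCapture")):
        print(p)
```

The numeric prefixes keep the folders sorted in pipeline order in any file browser.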
Tools & System References
- NocoDB: Intake tracking database interface.
- Obsidian QiDocs: Vault indexing rules and guidelines for local notes.
- WikiJS: Ingest documentation library.
- Paperless-ngx: Document management setup and indexing rules.

