QiCapture

Overview

QiCapture is the data intake and ingestion pipeline of QiOS. It takes messy life data (exports, receipts, documents) and turns it into clean, structured, searchable material.

Responsibilities

  • Clean and parse raw intake.
  • Index and extract metadata from documents (via OCR or scripts).
  • Chunk and embed clean text for semantic search.
  • Form entity relationships and graphs from text.
  • Synchronize indices and manage backups.

Flows

Intake raw file 
  -> profile & inspect 
  -> clean & normalize 
  -> produce staging import 
  -> Supabase import
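
As a rough illustration of the flow above, the sketch below walks one raw file through profile, clean, and staging. It is a minimal sketch, not QiCapture's implementation: `StagingRecord`, its field names, and the helper functions are hypothetical, and the final Supabase import is left out because it depends on the project's own tables.

```python
import hashlib
import json
import unicodedata
from dataclasses import dataclass, asdict


@dataclass
class StagingRecord:
    """Hypothetical staging row; field names are illustrative only."""
    source_name: str
    sha256: str
    byte_size: int
    line_count: int
    clean_text: str


def profile(raw: bytes) -> dict:
    """Profile & inspect: record basic facts about the raw bytes before changing them."""
    return {
        "sha256": hashlib.sha256(raw).hexdigest(),
        "byte_size": len(raw),
    }


def clean(raw: bytes) -> str:
    """Clean & normalize: decode, apply Unicode NFC, collapse whitespace, drop blank lines."""
    text = raw.decode("utf-8", errors="replace")
    text = unicodedata.normalize("NFC", text)
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


def to_staging(source_name: str, raw: bytes) -> StagingRecord:
    """Produce staging import: combine the profile and the cleaned text into one record."""
    info = profile(raw)
    text = clean(raw)
    return StagingRecord(
        source_name=source_name,
        sha256=info["sha256"],
        byte_size=info["byte_size"],
        line_count=len(text.splitlines()),
        clean_text=text,
    )


raw_bytes = b"  Coffee   3.50 \r\n\r\n Total  3.50\n"
record = to_staging("receipt.txt", raw_bytes)
staging_json = json.dumps(asdict(record), indent=2)  # payload a later import step could consume
```

Hashing the raw bytes before cleaning gives each intake file a stable fingerprint, which makes it possible to detect re-imports of the same file even if the cleaning rules change later.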

Folder Structure & Table of Contents

Intake Pipeline Stages

  • 10 Ingestion: Raw intake from various sources (Paperless, drive imports, manual uploads, email exports, web clips).
  • 20 Extraction: OCR processing, text extraction, Markdown exports, and transcripts.
  • 30 Chunking: Processing clean text into chunks, manifest records, and source mapping.
  • 40 Embeddings: Embedding models, Qdrant setup, collections, and vector quality tests.
  • 50 Graphs: Entity extraction, relationship graphs, and entity maps.
  • 60 Retrieval: Semantic search profiles, query logs, retrieval tests, and context packaging.
  • 70 Memory: Fact storage, session summaries, and user preference maps.
  • 80 Sync & Backups: Snapshot logs, restore points, and synchronization manifests.
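
To make the Chunking stage's "manifest records and source mapping" more concrete, here is one possible shape for it, assuming fixed-size character windows with overlap. The `ChunkRecord` type, its fields, and the `size` and `overlap` defaults are assumptions made for illustration; the actual chunking strategy used by QiCapture may differ.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ChunkRecord:
    """Hypothetical manifest row that maps a chunk back to its source text."""
    chunk_id: str
    source_id: str
    start: int  # character offset into the source text (inclusive)
    end: int    # character offset into the source text (exclusive)
    text: str


def chunk_text(source_id: str, text: str, size: int = 200, overlap: int = 40) -> list[ChunkRecord]:
    """Split text into overlapping windows and record where each one came from."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    records = []
    step = size - overlap
    for start in range(0, len(text), step):
        end = min(start + size, len(text))
        # Deterministic ID derived from the source and offsets, so re-runs yield the same IDs.
        digest = hashlib.sha256(f"{source_id}:{start}:{end}".encode()).hexdigest()[:12]
        records.append(ChunkRecord(digest, source_id, start, end, text[start:end]))
        if end == len(text):
            break  # the last window already reaches the end of the text
    return records


source_text = "abcdefghij" * 5  # 50 characters of sample text
manifest = chunk_text("doc-001", source_text, size=20, overlap=5)
```

Storing the offsets alongside each chunk means a retrieval hit can be traced back to the exact span of the source document, without keeping a second copy of the surrounding text.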

Tools & System References

  • NocoDB: Intake tracking database interface.
  • Obsidian QiDocs: Vault indexing rules and guidelines for local notes.
  • WikiJS: Ingest documentation library.
  • Paperless-ngx: Document management setup and indexing rules.
Last modified on June 16, 2026