Architecture

How a folder becomes a searchable index — and stays fresh

Transcription, embeddings, and faces are the glamorous parts. The unglamorous part — a job queue, a local store, and a size-and-modified-time fingerprint check — is what lets you point MediaFind at a folder of 3,000 files and trust the results an hour later.

The individual models in MediaFind are the easy headline. The hard, boring engineering is everything around them: how thousands of files get processed without melting your laptop, where the results live, and — the part most tools get wrong — how the index stays correct when you add, edit, or delete files. This post is about that machinery.

The shape of the system

At a high level MediaFind is a pipeline with three layers: an intake that discovers work, a per-file pipeline that does the expensive ML, and a local index that everything queries. A small web UI and the CLI sit on top, talking to the same index.

[Figure: watched folder (roots you add) → scan + walk → job queue (one job per file, N workers, async) → per-file pipeline, parallel across cores (decode and transcribe; embed segments; keyframes, CLIP, OCR; diarize speakers; faces (Pro)) → local index (SQLite metadata + vector blobs + keyframe cache, in ~/…/MediaFind), read by the web UI and the CLI.]
Three layers, with the job queue sitting between the first two: intake → job queue → per-file pipeline → local index. The UI and CLI are thin readers over the same on-disk store. Every box runs on your Mac; nothing in this diagram makes a network call.

The job queue: parallelism without the meltdown

Indexing is embarrassingly parallel across files but expensive per file. So intake turns each discovered file into a job, and a pool of workers drains the queue — bounded so we saturate your cores without swapping the machine to its knees. Jobs run asynchronously: the UI stays responsive and shows live progress while a backlog churns in the background.
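As a rough illustration of the bounded-pool idea, here is a minimal Python sketch. It is not MediaFind's actual code: the names `index_files`, `process_file`, and `on_progress` are hypothetical, and a thread pool stands in for whatever worker mechanism the app really uses (CPU-bound Python work would more likely need processes or native code that releases the GIL).

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def index_files(paths, process_file, max_workers=None, on_progress=None):
    """Run process_file(path) for every path on a bounded worker pool.

    Returns (results, errors), both keyed by path. All names here are
    illustrative assumptions, not MediaFind's API.
    """
    # Bound the pool at the core count so a large backlog cannot
    # oversubscribe the machine.
    workers = max_workers or os.cpu_count() or 1
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One job per file: each submit() call queues a single unit of work.
        futures = {pool.submit(process_file, p): p for p in paths}
        total = len(futures)
        for done, future in enumerate(as_completed(futures), start=1):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                # One bad file is recorded and does not sink the batch.
                errors[path] = exc
            if on_progress is not None:
                on_progress(done, total)  # hook for live progress in a UI
    return results, errors
```

Calling `index_files(paths, transcribe, max_workers=4)` would process at most four files at once no matter how long the backlog is, and the progress hook fires as each job completes rather than in submission order.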

A subtlety we learned the hard way: the background workers write to the real on-disk index, not to whatever object a request is holding. Conflating the two produced tests that passed against a mocked index while the actual job wrote elsewhere. The fix — and the rule — is that the index path is the single source of truth, and jobs always target it directly.
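One way to encode that rule, sketched in Python under stated assumptions: the job receives the index path, never a live index object, and opens its own connection. The `transcripts` table and both function names are hypothetical and exist only for this example.

```python
import sqlite3


def record_transcript(index_path, file_path, text):
    """Background-job step: write straight to the on-disk index.

    The job is handed a path and opens its own connection, so it cannot
    end up writing to an object that only a request or a test holds.
    """
    conn = sqlite3.connect(index_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS transcripts "
                "(path TEXT PRIMARY KEY, text TEXT)"
            )
            conn.execute(
                "INSERT OR REPLACE INTO transcripts (path, text) VALUES (?, ?)",
                (file_path, text),
            )
    finally:
        conn.close()


def read_transcript(index_path, file_path):
    """Reader side: resolves the same path, so it sees what the job wrote."""
    conn = sqlite3.connect(index_path)
    try:
        row = conn.execute(
            "SELECT text FROM transcripts WHERE path = ?", (file_path,)
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()
```

A test written against this shape points both functions at a temporary index file instead of mocking the index, so it exercises the same write path as the real job.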

The storage layer: metadata, vectors, and frames

The index isn't one thing; it's a few stores that play to their strengths: SQLite for metadata, vector blobs for embeddings, and a keyframe cache for extracted frames.

It all lives in one application-data directory you own. There is no cloud database and no embeddings API — a property you can verify, not just take on faith.
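To make the metadata-plus-vector-blob layout concrete, here is a minimal Python sketch. The schema, column names, and float32 encoding are assumptions made for illustration; the post does not describe MediaFind's real tables. The keyframe cache is left out and is assumed to be ordinary image files in the same directory.

```python
import sqlite3
from array import array

# Hypothetical schema: one row per file, one row per transcript segment,
# with each segment's embedding stored as a raw float32 blob.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    id       INTEGER PRIMARY KEY,
    path     TEXT UNIQUE NOT NULL,
    size     INTEGER NOT NULL,
    mtime_ns INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS segments (
    id        INTEGER PRIMARY KEY,
    file_id   INTEGER NOT NULL REFERENCES files(id) ON DELETE CASCADE,
    start_s   REAL NOT NULL,
    text      TEXT NOT NULL,
    embedding BLOB NOT NULL
);
"""


def open_index(path):
    """Open (or create) the index database at the given path."""
    conn = sqlite3.connect(path)
    # Foreign keys are off by default in SQLite; enable them so deleting
    # a file row also deletes its segments.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def pack_vector(values):
    """Encode a sequence of floats as float32 bytes for a BLOB column."""
    return array("f", values).tobytes()


def unpack_vector(blob):
    """Decode a float32 BLOB back into a list of Python floats."""
    out = array("f")
    out.frombytes(blob)
    return out.tolist()
```

Keeping the vectors as blobs next to their metadata means the whole index stays a file on disk that can be copied or inspected with standard SQLite tools.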

Staying fresh: the part most tools skip

A library is never static. You drop in new footage, re-export an edit, rename a folder. A naïve indexer either re-processes everything (wasteful) or trusts file paths (wrong the moment you edit a file in place). MediaFind keys on a fast fingerprint of size plus modified-time: if both are unchanged, the work is already done; if either has changed, only that file is reprocessed.

[Figure: files on disk (a.mp4 12M·t0, b.mov 31M·t9, c.mkv 88M·t0, d.m4a new) are compared with the index by size + mtime fingerprint, not by path. Match → skip (a, c: done); changed → re-queue (b); new → queue (d); missing → prune. Result: 2 reindexed · 2 unchanged · consistent.]
Reindex is a diff, not a rebuild. Fingerprint each file by its size and modified-time, compare to the index, and act only on the delta: skip unchanged, reprocess changed, ingest new, prune deleted. A 3,000-file library with three new clips costs three jobs, not three thousand.
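The diff can be sketched in a few lines of Python. This is an illustrative sketch, not MediaFind's implementation: `fingerprint`, `ReindexPlan`, and `plan_reindex` are hypothetical names, and the sketch assumes entries are looked up by path with the fingerprint used only to detect changes. How the real tool treats renames is not stated in the post.

```python
import os
from dataclasses import dataclass, field


def fingerprint(path):
    """Cheap fingerprint: (size in bytes, modified-time in nanoseconds)."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


@dataclass
class ReindexPlan:
    skip: list = field(default_factory=list)     # unchanged, nothing to do
    requeue: list = field(default_factory=list)  # fingerprint differs
    ingest: list = field(default_factory=list)   # on disk, not in the index
    prune: list = field(default_factory=list)    # in the index, gone from disk


def plan_reindex(on_disk, indexed):
    """Diff two {path: fingerprint} mappings into a plan of work."""
    plan = ReindexPlan()
    for path, fp in sorted(on_disk.items()):
        if path not in indexed:
            plan.ingest.append(path)
        elif indexed[path] == fp:
            plan.skip.append(path)
        else:
            plan.requeue.append(path)
    plan.prune = sorted(set(indexed) - set(on_disk))
    return plan


# Mirrors the figure: a and c unchanged, b edited, d new, e deleted.
on_disk = {"a.mp4": (12, 0), "b.mov": (31, 9), "c.mkv": (88, 0), "d.m4a": (5, 1)}
indexed = {"a.mp4": (12, 0), "b.mov": (31, 0), "c.mkv": (88, 0), "e.wav": (3, 0)}
plan = plan_reindex(on_disk, indexed)
```

Only the paths in `plan.requeue` and `plan.ingest` become jobs, so the cost of a reindex scales with the size of the change rather than the size of the library.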

Freshness also means admitting when an entry is stale. If embeddings were built by an older model, or a file changed while a worker was busy, the index flags those entries and offers a one-click refresh rather than silently serving outdated results. Honest about what it knows, explicit about what needs redoing.
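A minimal Python sketch of both stale conditions, under loudly stated assumptions: the model tag `embed-v2`, the `Entry` record, and the function names are all hypothetical, and the mid-job change is detected here by taking the fingerprint before and after the pipeline runs, which is one plausible approach rather than the documented one.

```python
from dataclasses import dataclass

CURRENT_MODEL = "embed-v2"  # hypothetical model tag, for illustration only


@dataclass
class Entry:
    path: str
    fingerprint: tuple   # (size, mtime) recorded when the job started
    model: str           # model that produced the stored embeddings
    stale: bool = False  # set when the file changed during processing


def process_with_recheck(path, get_fingerprint, run_pipeline, model=CURRENT_MODEL):
    """Run the pipeline and flag the entry if the file changed meanwhile."""
    before = get_fingerprint(path)
    run_pipeline(path)
    after = get_fingerprint(path)
    # If the fingerprint moved while the worker was busy, the results
    # describe an older version of the file.
    return Entry(path, before, model, stale=(before != after))


def needs_refresh(entry, current_model=CURRENT_MODEL):
    """True if the entry should be offered for a refresh."""
    return entry.stale or entry.model != current_model
```

A UI could list every entry for which `needs_refresh` is true and queue them all with a single action, instead of serving their results as if they were current.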

Why build it this way

Every architectural choice here bends toward the same two goals: scale on a laptop and never phone home. A bounded job queue keeps a big library tractable on consumer hardware. A file-based index keeps the whole thing portable and serverless. A cheap size-and-modified-time fingerprint keeps it correct over months of edits. None of it requires — or permits — a backend.


With a fresh, queryable index in place, MediaFind can do more than find things — it can organize them. Next up: zero-shot categories and a knowledge map that turns your library into a browsable graph.
