How we turn a folder of audio and video into a searchable library — using best-in-class open models that run entirely on your Mac. No cloud, no API keys, no telemetry.
No articles match — try a different topic or clear your search.
The engineering of a local-first app — how media gets in, how a folder becomes a searchable index, how it stays fast at scale, and how it ships as a Mac app where everything actually works.
The job queue, the local store, and the content-hash check that keep a 3,000-file library correct without a backend.
ANN search with atomic index swaps, killing N+1 queries, cached facets, and incremental — not quadratic — face assignment.
Bounded workers, SQLite-backed job state, cooperative cancellation, stuck-job detection, and a self-throttling activity UI.
Three OS-specific bugs, a Tauri native shell, and the release pipeline that delivers real installers on all three platforms.
When lazy imports meet PyInstaller's tree-shaking, features quietly return empty. The bug class — and the CI test that ends it.
IR metrics, bootstrap confidence intervals and a permutation test, latency percentiles — and a CI gate that fails the build on a regression.
Parallel inspector roles, a risk-triggered escalation table, and a tight discovery-to-fix loop that surfaces this commit's bugs in under 30 minutes.
The on-device models that turn raw audio and video into something you can search — speech, people, and objects.
From ffmpeg decode to word-level timestamps — the speech-to-text pipeline that never sends a byte to the cloud.
Two dials most apps hide — model size and the engine that runs it — surfaced as a one-time picker, with a Metal + Neural-Engine path.
tiny…large-v3 as a plain-language decision: the accuracy, speed, RAM and disk trade-offs, and how to match one to your audio, language and Mac.
Speaker diarization and an opt-in face library that label your media without anything ever leaving the machine.
One zero-shot trick — CLIP plus a confidence gate and a bundled face gallery — finds brands, activities, and public figures you never tagged.
Keyless DSP that labels the musical stretches of your clips and clusters the ones sharing a track — no fingerprint API, nothing uploaded.
Finding the exact moment you need — by meaning, by question, or by browsing an auto-organized library.
Lower-thirds, slide titles, product labels, street signs — MediaFind reads every frame at index time so on-screen text is searchable in seconds, on your Mac.
Why “a rocket blasting off” finds the right clip even when nobody said those words — semantic text, visual, and OCR search combined.
Semantic search blurs exact names. A literal entity index nails them — keyless gazetteer first, then an optional NER model and Wikidata linking.
Said, shown, written, who's in it, what's happening, where, which brand, what sound — a tour of every search channel, which run by default, and how they fuse into one ranked list.
The same frame found by visual, OCR and logo shouldn't be three results. How a fusion key spots duplicates, why merging goes by rank not score, and how every channel's annotations survive.
Retrieve the few segments that matter, ground a local model on them, and answer with a link to the exact second — no upload.
An opt-in on-device model tier rephrases grounded answers into prose — GGUF weights, run with llama.cpp, with a keyless fallback if it's off.
What download size, RAM, speed and quality really cost — and a one-line rule for choosing the right model tier for your Mac.
Turn found moments into a frame-accurate FCPXML or EDL sequence — drop-frame timecode, markers, and a portable proxy bundle.
Tagging with no training data, a confidence gate that kills false positives, and a knowledge graph of your library.
The knowledge map links files two ways — by what they literally share, and by what their summaries mean. When each wins, and why both stay on-device.
Grouping segments into skimmable chapters and stitching a one-take reel — cut on word boundaries, never uploaded.
Scoring search & Ask on every commit with the standard IR metrics, confidence intervals, a quality-vs-latency dashboard, and a regression gate that fails the build.
The promises behind “on-device,” made concrete — what stays home, what you can verify, and how the one outbound path is hardened.
No accounts, no API keys, no telemetry — plus an audit that confirms the core path opens zero external sockets. A check, not a claim.
A tour of the local data folder — SQLite index, thumbnails, prefs — and the delete paths that let you wipe the sensitive parts yourself.
The one path that touches the internet, hardened: a connect-time IP guard, hostname-verified TLS, and a server that listens only on loopback.
How real editors and creators use MediaFind — cutting the time between shoot and cut.
Paste a link, get a searchable clip — yt-dlp for known sites, an HTML-scrape fallback for the rest, then the same private pipeline.
Scrubbing by eye, vague filenames, no face or voice search — the five habits that quietly eat your day, and a faster approach for each one.