Private by default — and a command that proves it
“We respect your privacy” is the most over-claimed sentence in software. MediaFind takes the opposite approach: the private path is the only path, and you don't have to take our word for it — there's a command that checks.
Most “private” apps are private in the sense that they promise not to misuse the data they collect. MediaFind is private in a stronger sense: for the core experience there's nothing to collect, nowhere to send it, and a way to confirm that for yourself. This post is about the boundary we drew, and the tooling that keeps it honest.
The threat model: what would leak, if anything could
Privacy engineering starts with naming the sensitive things. For a media library, the crown jewels are obvious:
- Your media — the audio and video files themselves.
- Derived content — transcripts, on-screen text (OCR), and frame thumbnails. A transcript can be more revealing than the recording.
- Biometrics — voice prints and face embeddings (see the diarization & faces deep dive).
- Your intent — the queries you type. Search history is a profile of what you're looking for.
Every one of these is computed and stored locally. None of them is the input to a network request on the core path. The design rule is blunt: media-derived data never crosses the process boundary to a server we run.
No accounts, no API keys, no telemetry
The cheapest way to keep data on-device is to build nothing that would move it off:
- No sign-in. There's no account, so there's no server-side profile to populate.
- No API keys. Transcription (Whisper), search (CLIP & text embeddings), and the optional answer model all run from bundled open weights. There is no cloud inference endpoint to call.
- No telemetry. No analytics SDK, no crash pings phoned home by default, no “usage events.” The app does not open a background channel to report on you.
The only network activity in normal use is the stuff you explicitly ask for — checking for an app update, or pasting a link to download a video — and both are visible, opt-in actions, not ambient background traffic.
The part that makes it checkable: mediafind audit
A promise you can't inspect is just marketing. So MediaFind ships a self-check. The audit exercises the core path (index, search, ask, export) and reports how many external connections it opened:
$ mediafind audit
MediaFind privacy self-audit — core path network egress
========================================================
core steps exercised (4): index → search → ask → export
index backend: keyless (keyless, on-device)
embed backend: sentence-transformers (offline; falls back to keyless hashing without a cached model)
✓ PASS — core path opened 0 external sockets across 4 steps.
This isn't a label we print; it's an observation of what the process actually did. If a future change accidentally introduced a network call on the indexing or search path — a sneaky “helpful” lookup, a mis-scoped analytics line — the socket count would stop being zero and the check would say so. It's a regression test for the privacy promise itself.
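To make the idea concrete, here is one way a check like this could be built. It is a minimal Python sketch, not MediaFind's actual implementation: it assumes the core steps are plain callables and that wrapping `socket.socket.connect` is enough to observe egress. The names `is_external`, `count_external_connects`, `audit`, and `EgressBlocked` are hypothetical. Connections made by child processes, or by native code that calls the OS directly, would not be seen this way.

```python
import ipaddress
import socket
from contextlib import contextmanager


class EgressBlocked(ConnectionError):
    """Raised (in this sketch) when an audited step tries to leave the machine."""


def is_external(address) -> bool:
    """True unless the target is a Unix socket path or a loopback IP."""
    if not isinstance(address, tuple):
        return False  # AF_UNIX paths stay on the machine
    host = address[0]
    try:
        return not ipaddress.ip_address(host).is_loopback
    except ValueError:
        return host != "localhost"  # any other hostname counts as external


@contextmanager
def count_external_connects():
    """Record and refuse every non-loopback connect attempt while active."""
    attempts = []
    names = ("connect", "connect_ex")
    originals = {name: getattr(socket.socket, name) for name in names}

    def guard(name):
        def guarded(self, address):
            if is_external(address):
                attempts.append(address)
                # Note: this also raises from connect_ex, which normally
                # returns an error code instead of raising.
                raise EgressBlocked(f"audit: blocked egress to {address!r}")
            return originals[name](self, address)
        return guarded

    for name in names:
        setattr(socket.socket, name, guard(name))
    try:
        yield attempts
    finally:
        for name, original in originals.items():
            setattr(socket.socket, name, original)


def audit(steps) -> bool:
    """Run each step; pass only if none tried to reach a non-loopback host."""
    with count_external_connects() as attempts:
        for step in steps:
            try:
                step()
            except EgressBlocked:
                pass  # already recorded in attempts
    verdict = "PASS" if not attempts else "FAIL"
    print(f"{verdict}: {len(attempts)} external connection attempts "
          f"across {len(steps)} steps")
    return not attempts
```

Run in a test suite, a function like `audit` turns "no network on the core path" into an assertion that fails the build as soon as a step attempts an outbound connection.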
You can also cross-check from outside the app with a system tool such as lsof. The honest answer holds up under inspection.
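An outside cross-check with lsof could be scripted roughly as follows. This is a sketch under assumptions: it takes a process ID you supply, relies on lsof's machine-readable field output (`-Fn`, where name lines start with `n`), and treats any connected socket whose remote end is not loopback as external. The helper names `external_peers` and `check_process` are hypothetical. It captures a snapshot of one moment, so it complements a built-in audit rather than replacing it.

```python
import ipaddress
import subprocess


def external_peers(lsof_field_output: str) -> list[str]:
    """Extract non-loopback remote endpoints from `lsof -Fn` output."""
    peers = []
    for line in lsof_field_output.splitlines():
        # Name fields start with "n"; connected sockets look like
        # "local->remote". Listening sockets have no "->".
        if not line.startswith("n") or "->" not in line:
            continue
        remote = line.split("->", 1)[1]
        host = remote.rsplit(":", 1)[0].strip("[]")
        try:
            if ipaddress.ip_address(host).is_loopback:
                continue
        except ValueError:
            pass  # unparsable host: keep it so a human can look
        peers.append(remote)
    return peers


def check_process(pid: int) -> list[str]:
    """Snapshot the network sockets of one process via lsof."""
    result = subprocess.run(
        # -a: AND the filters, -i: network files only,
        # -n / -P: numeric hosts and ports, -Fn: field output with names
        ["lsof", "-a", "-p", str(pid), "-i", "-n", "-P", "-Fn"],
        capture_output=True,
        text=True,
        check=False,  # lsof exits non-zero when nothing matches
    )
    return external_peers(result.stdout)
```

An empty list from `check_process` means that, at that instant, the process held no connected sockets to a non-loopback address.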
The guarantees, in one table
| Decision | What it removes |
|---|---|
| No account / sign-in | No server-side identity or history |
| Bundled open models, no API keys | No cloud inference endpoint to leak to |
| No analytics or telemetry SDK | No ambient “usage” phone-home |
| Queries handled in-process | Your intent isn't logged off-device |
| mediafind audit | Turns all of the above into a check, not a claim |
The one honest exception
Apart from an update check you trigger yourself, there is exactly one way bytes leave on purpose: when you paste a web link to pull a video into your library. That's a deliberate outbound request to a site you chose, and because it accepts untrusted input, it gets its own hardening (SSRF guards, a loopback-only server). That story is its own post. Everything else stays home.
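For readers unfamiliar with SSRF guards, the general shape is: accept only web schemes, resolve the host, and refuse any destination that is not a public address. The sketch below is a generic Python illustration of that pattern, not the guard MediaFind ships. The function name `resolve_public_url` and the injectable `resolve` parameter are assumptions made for the example.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def resolve_public_url(url: str, resolve=socket.getaddrinfo) -> list[str]:
    """Return the public IPs a URL resolves to, or raise ValueError.

    Rejects non-web schemes and any host that resolves to a loopback,
    private, link-local, or otherwise non-global address.
    """
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    port = parts.port or (443 if parts.scheme == "https" else 80)

    infos = resolve(host, port, type=socket.SOCK_STREAM)
    ips = sorted({info[4][0] for info in infos})
    if not ips:
        raise ValueError(f"{host!r} did not resolve")

    # One bad answer rejects the whole URL: a host that resolves to both
    # a public and an internal address is not trusted.
    blocked = [ip for ip in ips if not ipaddress.ip_address(ip).is_global]
    if blocked:
        raise ValueError(f"{host!r} resolves to non-public address {blocked[0]}")
    return ips
```

A caller would then connect to one of the returned addresses rather than resolving the hostname a second time, so the address that was checked is the address that gets used.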
Don't trust the promise — run the check
Index a folder, search it, and audit it. The private path is the only path, and it's verifiable.
Download for macOS