Private by default — and a command that proves it

Most “private” apps are private in the sense that they promise not to misuse the data they collect. MediaFind is private in a stronger sense: for the core experience there's nothing to collect, nowhere to send it, and a way to confirm that for yourself. This post is about the boundary we drew, and the tooling that keeps it honest.

The threat model: what would leak, if anything could

Privacy engineering starts with naming the sensitive things. For a media library, the crown jewels are obvious:

Your media — the audio and video files themselves.
Derived content — transcripts, on-screen text (OCR), and frame thumbnails. A transcript can be more revealing than the recording.
Biometrics — voice prints and face embeddings (see the diarization & faces deep dive).
Your intent — the queries you type. Search history is a profile of what you're looking for.

Every one of these is computed and stored locally. None of them is the input to a network request on the core path. The design rule is blunt: media-derived data never crosses the process boundary to a server we run.

No accounts, no API keys, no telemetry

The cheapest way to keep data on-device is to build nothing that would move it off:

No sign-in. There's no account, so there's no server-side profile to populate.
No API keys. Transcription (Whisper), search (CLIP & text embeddings), and the optional answer model all run from bundled open weights. There is no cloud inference endpoint to call.
No telemetry. No analytics SDK, no crash pings phoned home by default, no “usage events.” The app does not open a background channel to report on you.

The only network activity in normal use is the stuff you explicitly ask for — checking for an app update, or pasting a link to download a video — and both are visible, opt-in actions, not ambient background traffic.

Everything that's derived from your media stays inside the boundary. The only things that cross are actions you take by hand — and the audit command sits on the socket layer to keep that true.

The part that makes it checkable: `mediafind audit`

A promise you can't inspect is just marketing, so MediaFind ships a self-check. Run `mediafind audit` and it exercises the core path (index, search, ask, export), then confirms that the path opened no connections to the outside world:

$ mediafind audit
MediaFind privacy self-audit — core path network egress
========================================================
core steps exercised (4): index → search → ask → export
index backend: keyless  (keyless, on-device)
embed backend: sentence-transformers  (offline; falls back to keyless hashing without a cached model)

✓ PASS — core path opened 0 external sockets across 4 steps.

This isn't a label we print; it's an observation of what the process actually did. If a future change accidentally introduced a network call on the indexing or search path — a sneaky “helpful” lookup, a mis-scoped analytics line — the socket count would stop being zero and the check would say so. It's a regression test for the privacy promise itself.
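
To make the idea concrete, here is a minimal Python sketch of how a socket-level egress check could work. It is an illustration, not MediaFind's actual implementation: the names `is_external` and `count_external_connects` are hypothetical, and wrapping `socket.socket.connect` for the duration of a test is only one assumed way to count connection attempts from inside a process.

```python
import ipaddress
import socket
from contextlib import contextmanager


def is_external(host: str) -> bool:
    """Treat anything that is not a loopback address as external."""
    try:
        return not ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; "localhost" is the only hostname trusted here.
        return host != "localhost"


@contextmanager
def count_external_connects():
    """Record every connect() attempt to a non-loopback address (hypothetical helper)."""
    attempts = []
    original_connect = socket.socket.connect

    def recording_connect(sock, address):
        # AF_INET / AF_INET6 addresses are tuples whose first item is the host.
        # Unix domain socket addresses are plain strings and are skipped.
        if isinstance(address, tuple) and is_external(str(address[0])):
            attempts.append(address)
        return original_connect(sock, address)

    socket.socket.connect = recording_connect
    try:
        yield attempts
    finally:
        # Always restore the real method, even if the wrapped code raised.
        socket.socket.connect = original_connect
```

A test harness could run the index and search steps inside `with count_external_connects() as attempts:` and assert `attempts == []` afterwards. This sketch only sees Python-level `connect` calls; `connect_ex`, unconnected UDP `sendto`, and sockets opened by native code would each need their own hook.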

Why a count, not a claim: the number that matters is sockets opened, because that's the thing an attacker, a curious user, or a packet sniffer can independently observe. Don't trust the copy on this page — run the audit, or watch the app at the network layer with Little Snitch or lsof. The honest answer holds up under inspection.
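
Because the verdict is a number, it is easy to check from a script. The Python sketch below parses the summary line from the sample run above and treats anything other than zero external sockets as a failure. The helper names are hypothetical. It also assumes a failing run keeps the same "opened N external sockets across M steps" wording, which this post does not show. The command's exit code is not documented here either, so the sketch does not rely on it.

```python
import re
import subprocess
from typing import NamedTuple, Optional

# Matches the summary wording from the sample run, e.g.
# "core path opened 0 external sockets across 4 steps".
SUMMARY = re.compile(r"core path opened (\d+) external sockets? across (\d+) steps?")


class AuditSummary(NamedTuple):
    external_sockets: int
    steps: int

    @property
    def clean(self) -> bool:
        # The promise holds only when the count is exactly zero.
        return self.external_sockets == 0


def parse_audit_summary(output: str) -> Optional[AuditSummary]:
    """Extract the socket and step counts, or None if no summary line is present."""
    match = SUMMARY.search(output)
    if match is None:
        return None
    return AuditSummary(int(match.group(1)), int(match.group(2)))


def run_audit() -> AuditSummary:
    # Assumes `mediafind` is on PATH and prints its report to stdout.
    proc = subprocess.run(["mediafind", "audit"], capture_output=True, text=True)
    summary = parse_audit_summary(proc.stdout)
    if summary is None:
        raise RuntimeError("no summary line found in audit output")
    return summary
```

A scheduled job or pre-release script could call `run_audit()` and fail the build whenever `summary.clean` is false.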

The guarantees, in one table

Decision	What it removes
No account / sign-in	No server-side identity or history
Bundled open models, no API keys	No cloud inference endpoint to leak to
No analytics or telemetry SDK	No ambient “usage” phone-home
Queries handled in-process	Your intent isn't logged off-device
`mediafind audit`	Turns all of the above into a check, not a claim

The one honest exception

Apart from the opt-in update check, there is exactly one way bytes leave on purpose: when you paste a web link to pull a video into your library. That's a deliberate outbound request to a site you chose, and because it accepts untrusted input, it gets its own hardening (SSRF guards, a loopback-only server). That story is its own post. Everything else stays home.
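
For a flavor of what an SSRF guard involves, here is a minimal sketch in Python. It is not MediaFind's implementation, which has its own post. The function name `is_safe_fetch_target` and the policy (allow only http/https URLs whose host resolves exclusively to globally routable addresses) are assumptions chosen for illustration.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_safe_fetch_target(url: str) -> bool:
    """Return True only for http(s) URLs whose host resolves to public addresses."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    default_port = 443 if parts.scheme == "https" else 80
    try:
        infos = socket.getaddrinfo(
            parts.hostname, port or default_port, type=socket.SOCK_STREAM
        )
    except (socket.gaierror, UnicodeError):
        return False
    if not infos:
        return False
    for info in infos:
        # sockaddr is (host, port, ...); drop any IPv6 zone suffix like "%en0".
        host = info[4][0].split("%", 1)[0]
        # Reject loopback, private, link-local, and other non-public ranges.
        if not ipaddress.ip_address(host).is_global:
            return False
    return True
```

Checking the resolved addresses rather than the hostname string means alternate spellings of a loopback or private address are caught too. A real guard would also need to connect to the exact address it validated, since a second DNS lookup could return something different.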

Don't trust the promise — run the check

Index a folder, search it, and audit it. The private path is the only path, and it's verifiable.

Download for macOS

Keep reading

Your library lives on your disk — and erases on your command · Data control
Pulling video off the web, safely: SSRF guards & a loopback-only server · Security
Who said it, who's in it — diarization & face recognition, privately · People & privacy