Two ways to relate files: shared signals vs. semantic relatedness

Open the knowledge map and you get a graph of your library: each recording is a node, with lines between the ones that go together. But “go together” isn't one thing. A toggle in the corner of the map switches between two definitions of related (shared signals and semantic relatedness), and they're built from completely different machinery. Knowing which is which turns the map from a pretty picture into a tool.

Shared signals: “these files literally have something in common”

The default mode is lexical. Two files are linked when they share a concrete, already-extracted attribute — and the edge can tell you exactly which one. MediaFind looks at five channels:

Category — both files fall under the same high-level category.
Keyword — both surface the same salient word from their summaries (with generic, everywhere-words pruned so “the” doesn't link your whole library).
Speaker — the same diarized voice appears in both.
Person — the same clustered face appears in both (opt-in).
Entity — both mention the same named person, place, or brand in the transcript or on-screen text.
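
The pruning mentioned for the keyword channel can be sketched as a document-frequency cutoff. This is a minimal illustration, not MediaFind's actual rule: the function name `prune_generic_keywords` and the 50% cutoff (`max_share`) are assumptions made for the example.

```python
from collections import Counter

def prune_generic_keywords(file_keywords, max_share=0.5):
    """Drop keywords that appear in more than max_share of all files.

    file_keywords maps a file name to its set of keywords.
    max_share is a hypothetical cutoff, not a documented MediaFind setting.
    """
    n_files = len(file_keywords)
    # Count how many files each keyword appears in (once per file).
    doc_freq = Counter(kw for kws in file_keywords.values() for kw in set(kws))
    too_common = {kw for kw, n in doc_freq.items() if n / n_files > max_share}
    return {name: set(kws) - too_common for name, kws in file_keywords.items()}

library = {
    "a.mp4": {"the", "earnings", "berlin"},
    "b.mp4": {"the", "berlin"},
    "c.mp4": {"the", "hiking"},
    "d.mp4": {"the", "recipe"},
}
pruned = prune_generic_keywords(library)
print(pruned["a.mp4"])  # "the" is gone; "earnings" and "berlin" remain
```

Here “the” shows up in every file, so it is dropped before any edges are drawn, while “berlin” (in half the files) still counts as a link.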

Each shared item adds to the edge weight, and the channels aren't equal: a shared entity counts most, a shared speaker or person next, a shared category less, and a shared keyword least. Two files that share a named brand and a speaker draw a heavier line than two that merely land in the same category. Because every edge carries the items that created it, the map can say why: “linked by Acme, Berlin, and Alice.” It's set overlap — exact, auditable, and impossible to hallucinate.
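
As a rough sketch, weighted set overlap might look like the following. The article only gives the ordering of the channels (entity, then speaker and person, then category, then keyword), so the numeric weights in `CHANNEL_WEIGHTS` are invented for illustration, as is the function name `shared_signal_edge`.

```python
# Hypothetical weights: only the ordering comes from the article,
# the numbers themselves are assumptions.
CHANNEL_WEIGHTS = {
    "entity": 4.0,
    "speaker": 3.0,
    "person": 3.0,
    "category": 2.0,
    "keyword": 1.0,
}

def shared_signal_edge(a, b):
    """Return (weight, reasons) for two files.

    Each file is a dict mapping a channel name to a set of extracted items.
    reasons lists every (channel, item) pair that contributed to the weight.
    """
    weight = 0.0
    reasons = []
    for channel, channel_weight in CHANNEL_WEIGHTS.items():
        shared = a.get(channel, set()) & b.get(channel, set())
        weight += channel_weight * len(shared)
        reasons.extend((channel, item) for item in sorted(shared))
    return weight, reasons

interview = {"entity": {"Acme", "Berlin"}, "speaker": {"Alice"}, "category": {"Business"}}
keynote = {"entity": {"Acme", "Berlin"}, "speaker": {"Alice"}, "category": {"Events"}}

weight, reasons = shared_signal_edge(interview, keynote)
print(weight)   # 11.0 (two entities at 4.0 each, one speaker at 3.0)
print(reasons)  # [('entity', 'Acme'), ('entity', 'Berlin'), ('speaker', 'Alice')]
```

Because the reasons travel with the weight, the map can render “linked by Acme, Berlin, and Alice” straight from the edge. A weight of zero means no edge at all.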

Semantic relatedness: “these files mean the same thing”

Flip the toggle and the rules change entirely. Now an edge appears when two files' summary embeddings are close in vector space — cosine similarity above a threshold. This reuses the per-file summary vectors MediaFind already stored during indexing; it embeds nothing new and downloads nothing at graph-build time. Run connected-components over the resulting graph and you get rough topic clusters for free.
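
A minimal sketch of that pipeline, assuming toy three-dimensional vectors and a made-up threshold of 0.8 (real summary embeddings have many more dimensions, and the article does not state MediaFind's threshold). The names `semantic_edges` and `topic_clusters` are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def semantic_edges(vectors, threshold=0.8):
    """Link every pair of files whose cosine similarity is above the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(vectors), 2)
        if cosine(vectors[a], vectors[b]) > threshold
    ]

def topic_clusters(names, edges):
    """Connected components via union-find: each component is a rough topic."""
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in edges:
        parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=sorted)

# Toy stand-ins for stored summary embeddings.
vectors = {
    "earnings.mp4": [0.9, 0.1, 0.0],
    "q3.mp4": [0.8, 0.2, 0.1],
    "hiking.mp4": [0.0, 0.1, 0.9],
}
edges = semantic_edges(vectors)
clusters = topic_clusters(vectors, edges)
print(edges)  # [('earnings.mp4', 'q3.mp4')]
```

The two finance clips end up in one cluster and the hiking clip in its own. Nothing is embedded here: the function only reads vectors that already exist.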

The payoff is the thing lexical overlap can't do: it links files that are about the same subject even when they share no words. A clip that talks about “quarterly earnings” and one about “the Q3 numbers” can have no keyword, category, speaker, or entity in common, in which case the lexical map leaves them unconnected. In embedding space they sit right next to each other, so the semantic map draws the line.

The same two files, two verdicts. Shared signals finds no overlapping category, keyword, speaker, or entity — so no line. Semantic relatedness sees their summary vectors sit close together and draws the edge. Different words, same meaning.

Which one is right?

Neither — they answer different questions, and the gap between them is the point.

Shared signals is precise and explainable. An edge means a fact you can name and verify. It will never invent a connection. But it's blind to paraphrase: say it a different way and the link disappears.
Semantic relatedness bridges vocabulary. It connects topics across different words and surfaces clusters you didn't know were there. The cost: it can only tell you “the summaries are similar,” not exactly why — and occasionally it calls two things close that you'd consider unrelated.

A useful rule of thumb: reach for shared signals when you want to trace concrete threads — every recording a person was in, everything that mentions a brand. Reach for semantic relatedness when you're exploring — “what else is about this?” — and don't yet know the right keyword to ask for.

The same cosine, reused. Semantic relatedness isn't a one-off. The exact normalization, dimension-grouping, and similarity threshold that draw the map's semantic edges also power the “more like this” button — seed it with one file and get its closest neighbors. They share code on purpose, so “related” on the map and “related” in search agree edge-for-edge.
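
One way to picture the shared code path is the sketch below. It reads “dimension-grouping” as “only compare vectors of the same length,” which is an assumption, and the function name `more_like_this`, the 0.8 threshold, and the result limit are all hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def more_like_this(seed, vectors, threshold=0.8, limit=5):
    """Return up to `limit` (name, score) neighbors of `seed`, closest first.

    Vectors whose dimension differs from the seed's are skipped, which is
    this sketch's assumed meaning of "dimension-grouping".
    """
    seed_vec = normalize(vectors[seed])
    scored = []
    for name, vec in vectors.items():
        if name == seed or len(vec) != len(seed_vec):
            continue
        # On unit vectors the dot product is the cosine similarity.
        score = sum(x * y for x, y in zip(seed_vec, normalize(vec)))
        if score > threshold:
            scored.append((name, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]

vectors = {
    "earnings.mp4": [0.9, 0.1, 0.0],
    "q3.mp4": [0.8, 0.2, 0.1],
    "hiking.mp4": [0.0, 0.1, 0.9],
    "old-index.mp4": [0.9, 0.1],  # different dimension, never compared
}
neighbors = more_like_this("earnings.mp4", vectors)
print([name for name, _ in neighbors])  # ['q3.mp4']
```

If the map's edge builder applies the same normalization, grouping, and threshold, then a file's neighbors here are the files it is linked to on the semantic map.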

Both, on your Mac, from data you already have

The thing the two modes do share is the part that matters most: neither leaves the device, and neither needs to run a model again. Lexical edges read the categories, keywords, speakers, faces, and entities MediaFind already extracted. Semantic edges read the summary embeddings it already stored. No model is downloaded, no new embedding is computed, and not one byte is uploaded when you build either map. It's two questions asked of data you already own, and the answers stay yours.

So next time the knowledge map looks a little sparse, or a little surprising, check the toggle. You might just be asking it the other question.

