From Index to Oracle



Every era of information technology has changed what it means to find something.

Not just how fast or how conveniently, but what “finding” is, what becomes findable, and who decides.

The index at the back of a book is author-curated access. Someone anticipated what you might want and gave you entry points. It assumes you might read non-linearly, jumping to what matters. But you’re limited to what the author thought worth indexing. If they didn’t think “loneliness” was a theme, there’s no entry for it, even if the book is saturated with it.

The card catalog introduced multiple addresses for the same object. One book, findable via author, title, or subject. This was quietly radical: a physical object no longer had one location in conceptual space. But it required standardization. Librarians assigned subject headings, which meant certain framings became official and others invisible. To find something, you needed to know at least one of its sanctioned names.
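As a data structure, the catalog idea is small enough to sketch. The toy below is only an illustration: the `CardCatalog` class, its drawer names, and the sample book are all invented for this example, and real cataloging rules are far richer. It shows one record reachable through several indexes, and a lookup that fails when the heading was never sanctioned.

```python
from collections import defaultdict


class CardCatalog:
    """Toy catalog (hypothetical): one book record, several ways in."""

    def __init__(self):
        # One drawer per kind of "address"; each maps a heading to records.
        self.drawers = {
            "author": defaultdict(list),
            "title": defaultdict(list),
            "subject": defaultdict(list),
        }

    def add(self, book, subjects):
        # The cataloger, not the reader, decides which subject headings exist.
        self.drawers["author"][book["author"].lower()].append(book)
        self.drawers["title"][book["title"].lower()].append(book)
        for heading in subjects:
            self.drawers["subject"][heading.lower()].append(book)

    def find(self, drawer, heading):
        # .get avoids creating an empty card as a side effect of a failed lookup.
        return self.drawers[drawer].get(heading.lower(), [])


# Made-up sample data, for illustration only.
catalog = CardCatalog()
book = {"author": "A. Writer", "title": "An Example Novel"}
catalog.add(book, subjects=["City life"])

same_book = catalog.find("subject", "city life")[0]   # found via a sanctioned heading
missing = catalog.find("subject", "loneliness")       # never assigned, so invisible
```

The same object comes back whether you enter through the author, title, or subject drawer. A reader looking under "loneliness" gets nothing, even if the book is about little else.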

The hyperlink made connections navigable. Finding became traversal: you enter a document and follow threads outward. The author still controls what links where, but now the structure between things is part of the territory. Vannevar Bush imagined trails of association; the web partially built them.

Search engines abstracted away the address entirely. You describe what you want; the system guesses where it lives. Finding became negotiation between your query and the algorithm’s relevance model. You no longer needed to know entry points. But the model’s notion of relevance became the gatekeeper, and gaming it became an industry.

Now, generative retrieval.

Every previous system assumed the thing you’re finding exists as a stored object: a book, a page, a document. You’re locating it. But a language model can synthesize on demand. It can show you a view that was never explicitly written: the tensions in your notes, the pattern across twelve documents, the answer to a question none of your sources directly addressed.

The question shifts from “where is it” to “what do I want to see.”

You’re no longer constrained to what was pre-indexed or pre-linked. The system can surface adjacencies you never noticed. It can respond to the question you’re actually asking rather than the keywords you happened to use.

But there’s a strange new problem. When you locate a document, you can verify it exists. When you synthesize an answer, the line between discovery and invention blurs.

Did you find that insight in your notes, or did the model construct it? Does it matter?

The authorship of “finding” has always been distributed. The index was authored by the writer. The catalog by librarians. The hyperlink by page creators. Search rankings by algorithms trained on collective behavior. Generative retrieval tangles you, the model, and your sources into something harder to separate.

Maybe this is fine. Maybe finding was always a creative act dressed up as retrieval. When you followed a chain of links down a rabbit hole at 2am, the path you took was yours, an improvisation on the material.

Generative systems just make the improvisation explicit.

The search engineer’s job used to be building indexes and ranking relevance. Now it might be something stranger: designing systems that synthesize without fabricating, that interpolate without overwriting, that help you find what you almost knew.
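One crude way to picture "synthesize without fabricating" is an audit pass over the generated answer. The sketch below is an assumption-heavy toy, not how any production system works: the function names, the 0.6 threshold, and the sample notes are all made up, and plain word overlap is a weak stand-in for real attribution methods. It asks, sentence by sentence, whether any source passage covers most of the words.

```python
import re


def tokens(text):
    # Lowercased word set; deliberately naive (no stemming, no stopwords).
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(sentence, sources, threshold=0.6):
    """Return (source_id, score); source_id is None when nothing clears the threshold."""
    words = tokens(sentence)
    if not words:
        return None, 0.0
    best_id, best_score = None, 0.0
    for source_id, text in sources.items():
        # Fraction of the sentence's words that also appear in this source.
        score = len(words & tokens(text)) / len(words)
        if score > best_score:
            best_id, best_score = source_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


def audit(answer, sources, threshold=0.6):
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [(s, *support(s, sources, threshold)) for s in sentences]


# Hypothetical notes and a hypothetical synthesized answer.
sources = {
    "note-1": "We moved the launch to March because the vendor contract slipped.",
    "note-2": "Budget review happens every quarter.",
}
answer = (
    "The launch moved to March because the vendor contract slipped. "
    "The CEO personally approved the delay."
)

for sentence, source_id, score in audit(answer, sources):
    label = source_id if source_id else "UNSUPPORTED"
    print(f"{label:12} {score:.2f}  {sentence}")
```

The first sentence traces back to a note; the second has no source behind it and gets flagged. Lexical overlap will miss paraphrases and can be fooled by shared vocabulary, so treat this as a way of framing the problem rather than a solution to it.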

In a recent episode of Cheeky Pint, Tobi talked about Shopify’s obsession with inventing more in the search space to power vibe-based search for agentic commerce. He envisions a future where search agents and LLMs understand “unquantifiables” like taste and can be proactive rather than just reactive:

“I just want the LLM to be proactive and just tell me, ‘Hey, this outfit would look good on you and here’s a preview.’”

“It’s hard to get the data and the embeddings and the sort of understanding about, you know, like, we have to figure out… The most important thing again in the AI space is also about unquantifiables. It’s the vibes, people call them.”

An index waits. You query it, it returns matches, you figure out what it means. An oracle is different: it reads context, guesses what you actually need, and just tells you.

Glean does this for me at work. I’ll ask about some decision made six months ago, or a process I’ve never touched, and it pulls from Slack threads, docs, tickets, stuff I’d never find manually, and gives me a coherent answer. It knows things I forgot we knew. Feels like cheating, honestly.

But oracles are weird. They’re confident even when they’re wrong. The system decides what’s relevant, what to surface, what to ignore. You trust it because it usually works, but you can’t really check its reasoning.

So now we’re in the business of building systems that synthesize without making stuff up. That personalize without being creepy. That anticipate what you want without overstepping.

The search box used to be a question you typed. Now it’s closer to a conversation you’re having. We’re still figuring out what that means.

It’s a remarkable time to be thinking about search. The problem isn’t solved; it’s been reopened.