Indexing and Sharing Organizational Context with qmd

Fri Apr 17 2026

Data engineering is mostly context gathering: countless moments of going through code, docs, specs, issues, and conversations to figure out things like how stakeholders want active users to be counted. qmd turns that pile of documents into something you can actually query without worrying about setup, which makes it useful for both me and my agents. I wanted to share that indexed knowledge with colleagues without forcing them to run CLI commands like qmd embed.

I ended up with a setup I like: the index is managed declaratively in git, and a daily CI job publishes a SQLite database with the embeddings.

Making It Declarative

I wrote a tiny ~10 line wrapper so that if the current folder has a .qmd/ directory, qmd points to that local config and index. Within a project, qmd only sees that project’s index and collections.

That also makes index.yml easy to version control. Instead of living on my machine and in my memory, the index definition lives in the repo.
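
A minimal sketch of what such a wrapper could look like, written as a shell function. It assumes qmd resolves its config and index through the standard XDG_CONFIG_HOME and XDG_CACHE_HOME variables; that is an assumption, not something stated above, so check how your qmd version locates its files before relying on it.

```shell
# Hypothetical wrapper: prefer a project-local .qmd/ directory when one exists.
# Assumption: qmd honors XDG_CONFIG_HOME and XDG_CACHE_HOME when locating
# its config (index.yml) and its SQLite index.
qmd() {
  if [ -d "$PWD/.qmd" ]; then
    # The overrides apply to this single invocation only.
    XDG_CONFIG_HOME="$PWD/.qmd" XDG_CACHE_HOME="$PWD/.qmd" command qmd "$@"
  else
    # No local config: fall through to the regular global setup.
    command qmd "$@"
  fi
}
```

Dropped into a shell profile, this leaves every qmd invocation unchanged outside a project and scoped to the project's own files inside one.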

Making It Shared

Once I wanted to share this with the team, I had two obvious options.

1. Share the repo and let everyone build the index

This works for technical folks but is not smooth for everyone else (even in the age of clankers, you need to know the right spells!). Consumers need the source code, need to run qmd update and qmd embed, and need to keep everything fresh over time. Not great.

2. Host an MCP server

The “correct and boring” option. qmd supports MCP already, and you can even host MCP servers for free with Hugging Face Spaces.

I decided not to go this route. I prefer my infrastructure local-friendly and I don’t like managing servers.

Making It Accessible

Instead, I distribute a prebuilt index, similar to how I distribute datasets. I put together the filecoin-docs-qmd repo to package a curated Filecoin docs index and publish it.

The flow is simple:

A curated list of sources lives on GitHub.
Every day, a GitHub Action runs qmd update and qmd embed, compresses the SQLite DB, and publishes it.
Users run a small installation script that downloads the latest database (<50MB) and sets up qmd.
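
The build and install steps above can be sketched as two shell functions. This is an illustration under stated assumptions, not the actual scripts from the repo: the QMD_DB variable, the default database path, the use of gzip, and the function names are all hypothetical, and the publishing step is left out because it depends on where the artifact is hosted.

```shell
# Build step (what the daily job might run).
# Assumptions: QMD_DB is a hypothetical variable pointing at the SQLite index,
# with a guessed default path; gzip stands in for whatever compressor is used.
build_index_artifact() {
  local db="${QMD_DB:-${XDG_CACHE_HOME:-$HOME/.cache}/qmd/index.sqlite}"
  local out="${1:-qmd.sqlite.gz}"
  qmd update || return 1    # re-scan the curated sources
  qmd embed || return 1     # compute embeddings for new or changed documents
  gzip -c "$db" > "$out"    # compress the database for publishing
}

# Install step (what the user-facing script might do).
# The URL and index name are supplied by the caller; the target directory
# is an assumption about where qmd looks for named indexes.
install_index() {
  local url="$1" name="$2"
  local dir="${XDG_CACHE_HOME:-$HOME/.cache}/qmd"
  local tmp rc
  mkdir -p "$dir"
  tmp="$(mktemp)"
  # Download fully before unpacking so a failed transfer leaves no partial index.
  curl -fsSL "$url" -o "$tmp" && gunzip -c "$tmp" > "$dir/$name.sqlite"
  rc=$?
  rm -f "$tmp"
  return $rc
}
```

Keeping the two halves this small is what makes the approach cheap to run: the CI side only needs qmd and a compressor, and the user side only needs curl.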

After that, anyone can do:

qmd --index filecoin query "how does lotus handle deal onboarding"

A good middle ground:

Source of truth is declarative and version controlled.
The artifact is self-contained and easy to install.
UX stays local and CLI-friendly.
No MCP forced on anyone.

Making It Smooth

Something I have not gotten to yet: forking qmd to support remote indexes. The entire process could be simplified to:

qmd --index https://awesomedomain.io/qmd.sqlite query "how do we count active users"
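
Until something like that exists, a similar experience can be approximated without forking. The sketch below is a hypothetical helper, not a qmd feature: it downloads the remote database once, caches it under a name derived from the URL, and then calls qmd with that name. It assumes that qmd --index NAME resolves to NAME.sqlite inside a qmd cache directory, which should be verified against your qmd version.

```shell
# Hypothetical helper: fetch a remote SQLite index once, then query it locally.
# Assumption: `qmd --index NAME` reads <cache dir>/qmd/NAME.sqlite.
qmd_remote() {
  local url="$1"; shift
  local dir="${XDG_CACHE_HOME:-$HOME/.cache}/qmd"
  local name
  # Derive a stable index name from the URL so each remote gets its own file.
  name="remote-$(printf '%s' "$url" | cksum | cut -d' ' -f1)"
  mkdir -p "$dir"
  if [ ! -f "$dir/$name.sqlite" ]; then
    # Download to a temporary name first so an interrupted transfer is never used.
    curl -fsSL "$url" -o "$dir/$name.sqlite.part" || return 1
    mv "$dir/$name.sqlite.part" "$dir/$name.sqlite"
  fi
  qmd --index "$name" "$@"
}
```

Usage would look like qmd_remote https://awesomedomain.io/qmd.sqlite query "how do we count active users". The cached copy is never refreshed in this sketch, so picking up a newer daily build means deleting the cached file first.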