ColBERT

Late-interaction retrieval that goes beyond single-vector similarity. Instead of comparing one query vector to one document vector, ColBERT independently encodes every query token and every document token, then computes fine-grained relevance by matching each query token against all document tokens — delivering precision that coarse embeddings cannot match.

How It Works

Encode

Query and document tokens are independently encoded into dense vectors. Each token gets its own representation — no information is collapsed into a single embedding.

Interact

For each query token, MaxSim computes the maximum cosine similarity against every document token. The most relevant document token wins — capturing the best possible alignment per query term.

Score

All per-token MaxSim scores are summed into a final document relevance score. Documents ranked by this sum produce results that are measurably more precise than single-vector methods.

Advantages

Token Precision

Matching happens at the word level, not just the document level. A query for MAX_RETRIES finds documents containing that exact identifier, not just broadly related content.

Pre-computation

Document token embeddings are computed offline and stored. At query time only the query tokens need encoding, keeping latency low even over large corpora.

Interpretability

Token-level scores reveal exactly which query tokens matched which document tokens. Useful for debugging retrieval quality and understanding why a result ranked where it did.

When to Use

Precise Queries

When exact term matching matters as much as semantic meaning. ColBERT respects both — surface form and contextual meaning are preserved at the token level.

Legal / Medical

Domain-specific terminology requires exact matching. proximate cause and contributing cause are semantically close but legally distinct — ColBERT keeps them separate.

Code Search

Variable names, function signatures, and exact patterns demand token-level precision. Searching for ingest_observation_to_brain should find that function, not approximate synonyms.

Access

API Documentation

ColBERT retrieval is available on paid tiers.