Featured
Retrieval OS
A modular retrieval-augmented generation (RAG) platform built to handle 10M+ tokens/day across heterogeneous document types. It is designed around pluggable embedding backends, hybrid BM25 + vector search, and a streaming inference layer, and is deployed in 8 Fortune 500 accounts with P99 latency under 200 ms.
# Hybrid retrieval with re-ranking
retriever = HybridRetriever(
    dense=EmbeddingIndex(model="text-embedding-3-large"),
    sparse=BM25Index(corpus=documents),
    reranker=CrossEncoderReranker(top_k=5),
)
results = await retriever.query(
    text=query,
    filters={"tenant_id": tenant},
    alpha=0.7,  # dense weight
)
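
The snippet above only labels `alpha` as the dense weight. One common way such a weight is applied is a linear blend of normalized dense and sparse scores. The sketch below illustrates that idea under stated assumptions: min-max normalization, a missing score treated as 0, and the helper names `min_max_normalize`, `fuse_scores`, and `top_k` are all hypothetical and not part of Retrieval OS.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse_scores(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    alpha: float = 0.7,
) -> Dict[str, float]:
    """Blend scores as alpha * dense + (1 - alpha) * sparse (assumed scheme)."""
    dense_n = min_max_normalize(dense)
    sparse_n = min_max_normalize(sparse)
    doc_ids = set(dense_n) | set(sparse_n)
    # A document missing from one retriever contributes 0 from that side.
    return {
        doc_id: alpha * dense_n.get(doc_id, 0.0)
        + (1.0 - alpha) * sparse_n.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


def top_k(fused: Dict[str, float], k: int) -> List[str]:
    """Return the k best document ids, breaking ties by id for determinism."""
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))[:k]


if __name__ == "__main__":
    dense_scores = {"a": 0.9, "b": 0.5, "c": 0.1}     # e.g. cosine similarities
    sparse_scores = {"a": 2.0, "b": 10.0, "d": 6.0}   # e.g. raw BM25 scores
    fused = fuse_scores(dense_scores, sparse_scores, alpha=0.7)
    print(top_k(fused, k=2))
```

Normalizing first matters because BM25 scores are unbounded while similarity scores usually are not, so blending raw values would let one side dominate regardless of `alpha`. Under this scheme, `alpha=1.0` reduces to pure dense ranking and `alpha=0.0` to pure BM25.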