One binary instead of eight systems

Building an AI backend that combines temporal, graph, vector and keyword search usually means assembling many independently operated systems: a vector database, a graph database, a keyword index, an embedding server, a reranker, an orchestration layer, a cache and an API/MCP server. Each is a separate cluster to run, secure, version and keep in sync. GreyCat unifies all of them into a single self-contained binary, so you ship faster, secure one surface and stop paying the keep-everything-in-sync tax.

Count the moving parts

Here is the typical reference stack for retrieval-augmented AI, mapped to its GreyCat equivalent.

The polyglot stack (~8 services to run) | GreyCat (1 binary)
Vector database (e.g. Qdrant / Pinecone / pgvector) | Built-in VectorIndex
Graph database (e.g. Neo4j) | Native graph (typed nodes + dot-notation traversal)
Keyword / BM25 index (e.g. Elasticsearch) | text_search (BM25 / hybrid)
Embedding server | On-device embeddings (llama.cpp, in-process)
Reranker service | Hybrid RRF fusion, built in
Orchestration (LangChain / LangGraph) | One search() endpoint / one query model
Cache + separate API server + separate MCP server | One greycat serve (REST + OpenAPI + MCP + static web)
8 → 1

Around 8 services — each a separate failure domain, a separate thing to secure and a separate sync point — collapse into 1 self-contained binary.

Capabilities, side by side

On every dimension that matters for an AI backend, one unified engine beats eight stitched-together systems.

Dimension | Polyglot stack | GreyCat
Data models | Separate stores, one per shape of data | Unified time-series + graph + geospatial + vector + full-text in one engine
Point-in-time / temporal queries | Usually bolted on per store, if available at all | Native: time is a first-class dimension across the whole graph
Hybrid keyword + vector search | Two engines plus a reranker to fuse them | One index: BM25, vector and hybrid (RRF) in a single call
AI / MCP | Bolt-on: separate embedding service, orchestration layer and MCP server | Built in: on-device embeddings and a native MCP server in the same binary
Deployment footprint | Multiple clusters and services to provision | A single ~4.6 MB binary
Operations | Keep N stores in sync; multi-hop queries fan out across systems | One store, one import, one endpoint; no cross-system sync
Data sovereignty | Varies: managed services and external embedding APIs may move data off-site | Fully self-hosted with on-device AI; built in the EU (Luxembourg)
Cost | Several licenses, multiple large clusters and managed-service / cloud bills | ≈8× cheaper to operate: one free binary on hardware you already run
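
For readers unfamiliar with the "hybrid (RRF)" entry above: Reciprocal Rank Fusion merges a keyword ranking and a vector ranking by summing 1 / (k + rank) for each document across the lists. The Python sketch below illustrates the general technique only; it is not GreyCat's implementation or API. The function name rrf_fuse, the document ids and the constant k = 60 (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores the sum of 1 / (k + rank) over every list it
    appears in, where rank starts at 1 for the best hit.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties fall back to the id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword engine and a vector engine.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = rrf_fuse([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists (here doc_a and doc_c) rise above documents that appear in only one, which is why RRF can stand in for a separate reranking step without needing comparable raw scores from the two engines.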

Coming from a specific tool?

vs Neo4j

GreyCat gives you a native graph with typed nodes and dot-notation traversal — and adds native time-series, vector and full-text search inside the same engine and transaction. No stitching Neo4j to a separate time-series, vector and keyword store, and no Cypher to learn.

vs InfluxDB / TimescaleDB

If you came for time-series, GreyCat handles temporal data as a first-class dimension — and then lets you model relationships as a graph, run vector search and run full-text search over the same data, without standing up extra systems.
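
To make "time as a first-class dimension" concrete, a point-in-time query asks for the value that was valid at a given timestamp, which is the most recent write at or before that time. The Python sketch below shows that lookup with a sorted list and binary search. It is a generic illustration under stated assumptions, not GreyCat's storage model or API; the class TimeSeries and its methods set_at and value_at are hypothetical names.

```python
from bisect import bisect_right


class TimeSeries:
    """Sorted (time, value) pairs supporting 'value as of time t' lookups."""

    def __init__(self):
        self._times = []
        self._values = []

    def set_at(self, t, value):
        """Record a value at time t, overwriting an existing entry at t."""
        i = bisect_right(self._times, t)
        if i > 0 and self._times[i - 1] == t:
            self._values[i - 1] = value
        else:
            self._times.insert(i, t)
            self._values.insert(i, value)

    def value_at(self, t):
        """Return the latest value written at or before t, or None."""
        i = bisect_right(self._times, t)
        return self._values[i - 1] if i else None


# Hypothetical sensor readings keyed by an integer timestamp.
temperature = TimeSeries()
temperature.set_at(10, 20.5)
temperature.set_at(20, 21.0)

print(temperature.value_at(15))  # reading that was valid at t = 15
```

Each lookup is a binary search, so resolving a value at a past time costs O(log n) in the number of recorded points.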

vs Pinecone / pgvector

GreyCat's VectorIndex keeps your vectors in the same store as the rest of your data, and adds graph relationships, temporal queries and on-device embeddings — so vectors stop being a separate service with its own sync job and external embedding API.

vs Elasticsearch

GreyCat's text_search brings BM25/BM25F, fuzzy, phrase, proximity and hybrid (keyword + vector) search into one binary, alongside the graph and temporal data it describes — one engine to run, secure and keep in sync instead of a separate search cluster.
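
As background on the BM25 ranking mentioned above, the Python sketch below scores a small set of documents against a query using the widely used Okapi BM25 formula. It is a teaching sketch, not GreyCat's text_search implementation: the function name bm25_scores, the whitespace tokenizer, the sample documents and the parameter values k1 = 1.2 and b = 0.75 are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency, saturated by k1 and normalized by length via b.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical mini-corpus.
docs = [
    "graph database with typed nodes",
    "vector search over embeddings",
    "keyword search and vector search combined",
]
scores = bm25_scores("vector search", docs)
print(scores)
```

The first document shares no terms with the query and scores zero; the other two score above zero, with term frequency and document length both influencing the result.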
