Manticore Search

Open-source search database that reports 2.83x faster performance than Elasticsearch in its benchmarks. It offers full-text search, vector search, and a SQL interface.

At a Glance:

Manticore Search is an open-source SQL-first search database designed as a fast alternative to Elasticsearch, offering full-text and vector search with row-wise and columnar storage options.

Overview:

Manticore Search is an open-source database built specifically for search workloads, forked from Sphinx in 2017. It positions itself as a high-performance alternative to Elasticsearch, with reproducible benchmarks showing speed advantages across a range of data sizes and workloads. SQL is its native syntax, with MySQL protocol compatibility, and it also provides an HTTP JSON interface with Elasticsearch-compatible write support. Manticore handles full-text search, vector search, and hybrid search that combines the two.

It supports several storage options: row-wise storage, which is faster but needs more RAM; columnar storage via the Manticore Columnar Library for datasets too large for RAM; and docstore for key-value retrieval without holding data in RAM. Data can be synced from MySQL, PostgreSQL, MS SQL, ODBC sources, XML, CSV, and Kafka. Built-in synchronous multi-master replication uses the Galera library and includes load balancing. Manticore is used in production by Craigslist, Socialgist, PubChem, and Rozetka, among others.
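Because Manticore speaks the MySQL protocol, a generic MySQL client library can in principle create a table, insert a document, and search it. The sketch below is an illustration under stated assumptions, not a reference: it assumes a local instance accepting MySQL-protocol connections on 127.0.0.1:9306, the third-party PyMySQL package, and a hypothetical `products` table. None of these come from the description above.

```python
"""Sketch: use a generic MySQL client against Manticore's SQL interface."""

# Hypothetical table and columns, chosen only for illustration.
CREATE_SQL = "CREATE TABLE IF NOT EXISTS products (title text, price float)"
INSERT_SQL = "INSERT INTO products (title, price) VALUES (%s, %s)"
SEARCH_SQL = (
    "SELECT id, title, price FROM products "
    "WHERE MATCH(%s) AND price < %s"
)


def connect(host="127.0.0.1", port=9306):
    """Open a connection; host and port are assumptions about the local setup."""
    import pymysql  # third-party; imported lazily so the rest loads without it

    return pymysql.connect(host=host, port=port, autocommit=True)


def index_and_search(conn, title, price, term, max_price):
    """Create the table, add one document, then run a filtered full-text query."""
    with conn.cursor() as cur:
        cur.execute(CREATE_SQL)
        cur.execute(INSERT_SQL, (title, price))
        cur.execute(SEARCH_SQL, (term, max_price))
        return cur.fetchall()
```

With a running instance, a call such as `index_and_search(connect(), "red running shoes", 59.9, "shoes", 100.0)` would be expected to return the rows matching the full-text term and the price filter. The exact statements supported should be checked against the Manticore manual for the installed version.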

Key Decision Points:

SQL-first with MySQL protocol: Uses SQL as its native syntax and works with existing MySQL clients, which may simplify adoption for teams already familiar with MySQL tooling.
Multiple storage engines available: Row-wise storage needs more RAM but is faster, columnar storage suits datasets that exceed RAM capacity, and docstore provides key-value access without using RAM.
Hybrid search in a single query: Combines full-text retrieval and vector search within one query rather than requiring separate systems or complex merging logic.
Real-time index updates: Newly added or updated documents become immediately searchable without waiting for index rebuilds or batch processing cycles.
Not fully ACID-compliant: Supports isolated transactions and binary logging for write safety, but stops short of full ACID compliance, which matters for workloads requiring strict transactional guarantees.
Built-in replication and load balancing: Uses synchronous multi-master replication through the Galera library with load balancing, supporting data distribution across servers and data centers.
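The storage-engine choice above is usually made when a table is created. As a minimal sketch, the helper below builds a `CREATE TABLE` statement with a table-level `engine` setting, which to my understanding accepts `rowwise` or `columnar` in Manticore. The helper name `create_table_sql`, the `logs` table, and its columns are hypothetical, and the columnar option assumes the Manticore Columnar Library is installed.

```python
"""Sketch: build CREATE TABLE statements that pick a storage engine."""

ENGINES = ("rowwise", "columnar")


def create_table_sql(table, columns, engine="rowwise"):
    """Return a CREATE TABLE statement with a table-level engine setting.

    `columns` is a list of (name, type) pairs. Identifiers are interpolated
    as-is, so they must come from trusted code, not from user input.
    """
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    if not columns:
        raise ValueError("at least one column is required")
    cols = ", ".join(f"{name} {col_type}" for name, col_type in columns)
    return f"CREATE TABLE {table} ({cols}) engine='{engine}'"


# Hypothetical log table stored in columnar form.
print(create_table_sql("logs", [("message", "text"), ("status", "int")], "columnar"))
# prints: CREATE TABLE logs (message text, status int) engine='columnar'
```

Keeping the engine as an explicit parameter makes the RAM-versus-speed trade-off a visible decision per table instead of an implicit default.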

Core Features:

Full-text search with over 20 operators: Supports more than 20 full-text query operators and more than 20 ranking factors, with configurable custom ranking.
Hybrid search: Combines full-text and vector retrieval in a single query for relevance tuning across both text and embedding-based signals.
Columnar storage via Manticore Columnar Library: Handles datasets too large to fit in RAM by storing data in columnar format with lower memory requirements.
Cost-based query optimizer: Uses statistical data about indexed values to automatically determine the most efficient query execution plan.
Synchronous multi-master replication: Built-in replication using the Galera library supports data distribution across servers with load balancing.
Data sync from external sources: Can pull data from MySQL, PostgreSQL, ODBC, XML, CSV, MS SQL, and Kafka for indexing.
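To make the vector side of these features more concrete, the sketch below builds the SQL for a vector-enabled table and a nearest-neighbour query. It follows the `float_vector` and `knn(...)` syntax as I understand it from the Manticore manual, so treat it as an assumption to verify against the installed version. The helper names, the `items` table, and the `embedding` field are hypothetical. It covers vector search only and does not show how a hybrid query is written.

```python
"""Sketch: build SQL for a vector field and a k-nearest-neighbour query."""

SIMILARITIES = ("l2", "ip", "cosine")


def knn_table_sql(table, field, dims, similarity="l2"):
    """CREATE TABLE with one text field and one HNSW-indexed vector field."""
    if similarity not in SIMILARITIES:
        raise ValueError(f"unknown similarity: {similarity!r}")
    return (
        f"CREATE TABLE {table} (title text, {field} float_vector "
        f"knn_type='hnsw' knn_dims='{int(dims)}' hnsw_similarity='{similarity}')"
    )


def knn_query_sql(table, field, k, vector):
    """SELECT the k nearest documents to `vector`, with their distances."""
    vec = ", ".join(repr(float(x)) for x in vector)
    return f"SELECT id, knn_dist() FROM {table} WHERE knn({field}, {int(k)}, ({vec}))"


print(knn_table_sql("items", "embedding", 4))
print(knn_query_sql("items", "embedding", 3, [0.1, 0.2, 0.3, 0.4]))
```

The query vector must have the same number of dimensions as the field was declared with; the helpers do not check this.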

Use Cases:

Full-text search for applications: Developers can use Manticore as the search backend for applications requiring faceted search, fuzzy search, geo-spatial search, autocomplete, and spelling correction.
Log analytics: Manticore's columnar storage and ingestion performance make it suitable for log search and analytics workloads, with documented benchmarks against Elasticsearch for this use case.
Stream filtering: Through percolate tables or Kafka integration, Manticore can filter streaming data in real time, matching incoming documents against stored queries.
Hybrid search applications: Combines full-text and vector search in one system for use cases like semantic product search or content discovery requiring both keyword and embedding-based relevance.
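The stream-filtering use case inverts ordinary search: queries are stored ahead of time and each incoming document is checked against them. The toy below illustrates only that idea in plain Python. It is not how Manticore implements percolate tables, and the class name, query IDs, and sample messages are invented for the example.

```python
"""Toy illustration of percolation: match incoming documents against stored queries."""

import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ToyPercolator:
    """Stores queries as sets of required terms and matches documents to them."""

    def __init__(self):
        self._queries = {}

    def store(self, query_id, terms):
        """Register a query that matches when all of its terms are present."""
        self._queries[query_id] = tokenize(terms)

    def match(self, document):
        """Return the sorted IDs of stored queries whose terms all occur in the document."""
        words = tokenize(document)
        return sorted(qid for qid, terms in self._queries.items() if terms <= words)


percolator = ToyPercolator()
percolator.store("disk-alerts", "disk full")
percolator.store("auth-alerts", "login failed")

stream = ["Login failed for user bob", "Disk /dev/sda1 is full", "Backup finished"]
for doc in stream:
    print(doc, "->", percolator.match(doc))
```

In this toy, the first message matches `auth-alerts`, the second matches `disk-alerts`, and the third matches nothing. A real percolate table applies full-text query semantics rather than simple term-set containment.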

Open-Source Alternative Value:

Manticore Search is an open-source search database that positions itself as an alternative to Elasticsearch, with reproducible benchmarks demonstrating advantages in throughput and query speed. It is built in C++ and starts with minimal resource usage, about 40 MB of resident memory (RSS) for an empty instance. Users can deploy and manage it without proprietary licensing constraints, and its SQL-first access and MySQL protocol compatibility reduce the learning curve for teams already using MySQL tooling. The row-wise, columnar, and docstore storage options let users pick the approach that best fits their data size and performance requirements.
