At a Glance:
TiDB is an open-source, cloud-native, distributed SQL database that combines horizontal scalability, strong consistency via distributed transactions, and hybrid transactional/analytical processing (HTAP) with MySQL 8.0 compatibility.
Overview:
TiDB is a distributed SQL database designed to serve as a highly available, scalable, and strongly consistent data platform. It addresses the challenges of scaling traditional MySQL-based applications by offering a compatible interface while separating compute and storage for independent scaling. The system supports both row-based online transaction processing (OLTP) through TiKV and columnar analytical processing (OLAP) through TiFlash, making it suitable for users who need converged transactional and real-time analytical workloads. It is deployable across public cloud, on-premises, and Kubernetes environments, and is accompanied by data migration tools for moving existing applications from MySQL.
Key Decision Points:
Workload suitability: Designed for hybrid transactional/analytical processing (HTAP), with dedicated row-based (TiKV) and columnar (TiFlash) storage engines, so OLTP and OLAP queries can each run on a suitable engine over the same consistent data.
Scalability model: Separates compute from storage, so each layer can be scaled independently and without downtime, either horizontally by adding nodes or vertically by increasing node resources.
MySQL ecosystem integration: Compatible with the MySQL 8.0 protocol and commonly used MySQL syntax, so many applications can migrate with minimal or no code changes and continue to use existing MySQL tools.
High availability design: Implements the Raft consensus protocol to manage multiple data replicas, providing automated failover and ensuring transactions are committed only after writing to a majority of replicas.
Deployment options: Can be deployed on public clouds, on-premises, or natively in Kubernetes, with TiDB Operator for cluster management and TiDB Cloud available as a fully managed service.
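
The majority-write rule described under high availability design comes down to simple arithmetic, which the following Python sketch illustrates. The function names are hypothetical and are not part of TiDB or any Raft library. The sketch covers only the quorum calculation, not leader election or log replication.

```python
def quorum_size(replica_count: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    return replica_count // 2 + 1


def is_committed(acks: int, replica_count: int) -> bool:
    """A write counts as committed once a majority has acknowledged it."""
    return acks >= quorum_size(replica_count)


def tolerated_failures(replica_count: int) -> int:
    """How many replicas can be lost while a majority remains reachable."""
    return replica_count - quorum_size(replica_count)


if __name__ == "__main__":
    # Compare a three-replica and a five-replica group.
    for n in (3, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

With three replicas, two acknowledgements are enough to commit and one replica can fail. With five, three are needed and two can fail. An even replica count raises the quorum without raising the number of tolerated failures, which is why odd counts are the usual choice.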
Core Features:
Distributed transaction support: Uses a two-phase commit protocol across nodes to enforce ACID compliance and strong consistency, with correctness guarantees during network partitions or node failures.
HTAP with real-time replication: Combines TiKV for row-based transactional storage and TiFlash for columnar analytical processing, with TiFlash replicating data from TiKV in real time via the Multi-Raft Learner protocol.
Raft-based high availability: Ensures data durability and automated failover by storing data in multiple replicas and requiring a majority of them to acknowledge a write before the transaction commits.
MySQL 8.0 compatibility: Allows use of familiar MySQL protocols, frameworks, and tools, with a suite of data migration tools provided for application data migration.
Kubernetes-native operation: Offers TiDB Operator for automated cluster operations on Kubernetes, alongside a fully managed cloud service for simplified provisioning.
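
To make the two-phase commit idea behind distributed transaction support more concrete, here is a minimal, generic sketch in Python. It is a textbook-style illustration, not TiDB's implementation. The Participant class and the two_phase_commit function are hypothetical names, and the sketch leaves out durability, timeouts, and coordinator recovery, all of which a real system must handle.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical storage node holding committed data and staged writes."""
    name: str
    healthy: bool = True
    data: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)

    def prepare(self, txn_id: int, writes: dict) -> bool:
        # Phase 1: stage the writes without making them visible.
        if not self.healthy:
            return False
        self.staged.setdefault(txn_id, {}).update(writes)
        return True

    def commit(self, txn_id: int) -> None:
        # Phase 2 (success): make the staged writes visible.
        self.data.update(self.staged.pop(txn_id, {}))

    def rollback(self, txn_id: int) -> None:
        # Phase 2 (failure): discard staged writes, if any.
        self.staged.pop(txn_id, None)


def two_phase_commit(txn_id: int, plan: list) -> bool:
    """Apply every (participant, writes) pair in the plan, or none of them."""
    prepared = []
    for participant, writes in plan:
        if participant.prepare(txn_id, writes):
            prepared.append(participant)
        else:
            # Any refusal aborts the transaction on every prepared node.
            for p in prepared:
                p.rollback(txn_id)
            return False
    for participant, _ in plan:
        participant.commit(txn_id)
    return True


if __name__ == "__main__":
    a, b = Participant("node-a"), Participant("node-b")
    print(two_phase_commit(1, [(a, {"x": 1}), (b, {"y": 2})]))
    b.healthy = False
    print(two_phase_commit(2, [(a, {"x": 10}), (b, {"y": 20})]))
    print(a.data, b.data)
```

The second transaction is rolled back on node-a because node-b cannot prepare, so neither node sees a partial update. That all-or-nothing outcome is the atomicity property the feature description refers to.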
Use Cases:
Developers migrating from MySQL who need to scale a current application horizontally without rewriting application code or changing familiar database protocols.
Data teams running hybrid workloads that require strong transactional consistency alongside real-time analytical queries on the same operational data set.
Site reliability engineers deploying high-availability databases across cloud or on-premises environments, using automated failover and configurable geographic replica placement for disaster tolerance.
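
For the disaster tolerance use case, a quick way to reason about geographic replica placement is to check whether a majority of replicas survives the loss of any one location. The Python sketch below does that for a list of zone labels. The function name and the zone labels are hypothetical, and the check is a simplification that ignores network latency and actual TiDB placement configuration.

```python
from collections import Counter


def survives_zone_loss(replica_zones: list) -> bool:
    """True if losing any single zone still leaves a strict majority of replicas."""
    total = len(replica_zones)
    if total == 0:
        return False
    majority = total // 2 + 1
    # The worst case is losing the zone that holds the most replicas.
    largest_zone = max(Counter(replica_zones).values())
    return total - largest_zone >= majority


if __name__ == "__main__":
    print(survives_zone_loss(["zone-1", "zone-2", "zone-3"]))
    print(survives_zone_loss(["zone-1", "zone-1", "zone-2"]))
```

Three replicas spread over three zones pass the check, while three replicas packed into two zones do not, because losing the zone with two replicas leaves only one of three.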
Open-Source Alternative Value:
TiDB is available under the Apache 2.0 license, with its source code published on GitHub, including what the project describes as enterprise-grade features. As an open-source project, it offers a database built on distributed systems principles (separating compute from storage and using the Raft protocol for consistency) without restricting these capabilities behind a proprietary license. The project’s commitment to open development means users can inspect the architecture, participate in its community on platforms like Discord and Slack, and contribute back. The value derived from this openness, however, depends on an organization’s ability to operate and manage a distributed database system.




