CockroachDB stores data in ranges. Each range is a Raft consensus group composed of a certain number of replicas, each replica on a different node. The default replica count is 5 for system ranges (ranges that contain data about the CockroachDB database cluster) and 3 for data ranges. A range is functional if and only if a strict majority of its replicas are available ("quorum is met"). If a majority of any range's replicas permanently fail, that range is permanently lost unless a forensic recovery technique is applied to restore data from one of the minority replicas.
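To make the quorum rule concrete, here is a minimal sketch of the "strict majority" check applied to the default replica counts mentioned above. The function name `has_quorum` is hypothetical and is not part of any CockroachDB API; it only illustrates the arithmetic.

```python
def has_quorum(total_replicas: int, available_replicas: int) -> bool:
    """Return True if a strict majority of a range's replicas is available."""
    return 2 * available_replicas > total_replicas


# Default data range (3 replicas): tolerates the loss of 1 replica, not 2.
print(has_quorum(3, 2))  # True
print(has_quorum(3, 1))  # False

# Default system range (5 replicas): tolerates the loss of 2 replicas, not 3.
print(has_quorum(5, 3))  # True
print(has_quorum(5, 2))  # False
```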
The default deployment configuration for a DSS pool is a minimum of 3 DSS instances, each with 3 CRDB nodes (though this may change). We want to guarantee survival of the pool and all data in the pool when a minority of DSS instances are lost. The default configuration of 3 replicas that may be freely assigned to any node in the cluster does not achieve this objective, as 2 or all 3 of those replicas may reside on the same DSS instance (thus causing a loss of quorum if that DSS instance goes down).
If we do not configure CRDB to ensure that every DSS instance receives a replica of every range, then we must increase the number of replicas for all ranges (system and data) to 7, so that a strict majority of 4 replicas remains even if the lost DSS instance held 3 of them. Or, more generally, to survive the loss of a minority of DSS instances with N DSS instances, the number of replicas for all ranges must be set to 2 * 3 * floor(N / 2) + 1, where 3 is the number of CRDB nodes per DSS instance. Alternatively, we could configure CRDB replication so that every DSS instance stores exactly one replica of each range.
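The formula above can be checked with a short sketch. It assumes the worst case described in this issue: replicas are placed freely, each on a different node, so every failed DSS instance may take as many replicas with it as it has nodes. The names `required_replicas` and `survives_worst_case` are hypothetical helpers for illustration only.

```python
NODES_PER_INSTANCE = 3  # CRDB nodes per DSS instance in the default deployment


def required_replicas(num_instances: int,
                      nodes_per_instance: int = NODES_PER_INSTANCE) -> int:
    """Replica count from the formula 2 * 3 * floor(N / 2) + 1."""
    return 2 * nodes_per_instance * (num_instances // 2) + 1


def survives_worst_case(replicas: int, failed_instances: int,
                        nodes_per_instance: int = NODES_PER_INSTANCE) -> bool:
    """Check quorum when every node of each failed instance held a replica."""
    lost = min(replicas, failed_instances * nodes_per_instance)
    return 2 * (replicas - lost) > replicas


for n in (3, 5):
    r = required_replicas(n)
    f = n // 2  # number of failed instances the formula budgets for
    # One replica fewer than the formula's result is no longer enough.
    print(n, r, survives_worst_case(r, f), survives_worst_case(r - 1, f))
# 3 7 True False
# 5 13 True False
```

With 3 DSS instances this reproduces the figure of 7 replicas; dropping to 6 would leave only 3 of 6 replicas after a worst-case instance loss, which is not a strict majority.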