Skip to content

Commit

Permalink
feat: address review comments
Browse files Browse the repository at this point in the history
Signed-off-by: Manan Gupta <[email protected]>
  • Loading branch information
GuptaManan100 committed Jan 16, 2025
1 parent d60ab78 commit 5b29b94
Show file tree
Hide file tree
Showing 17 changed files with 40 additions and 41 deletions.
4 changes: 2 additions & 2 deletions changelog/22.0/22.0.0/summary.md
Original file line number Diff line number Diff line change
Expand Up @@ -102,9 +102,9 @@ You can control idle connection retention for the query server’s query pool, s
This feature ensures that, during traffic spikes, idle connections are available for faster responses, while minimizing overhead in low-traffic periods by limiting the number of idle connections retained. It helps strike a balance between performance, efficiency, and cost.

### <a id="stall-disk-recovery"/>Stalled Disk Recovery in VTOrc</a>
VTOrc has been augmented to be able to identify and recover from stalled disk errors. This is done by polling that the disk is writable by the vttablets and they send this information in the full status output to VTOrc. If the disk is not writable on the primary tablet, VTOrc will attempt to recover the cluster by reparenting to a different primary. This is useful in scenarios where the disk is stalled and the primary vttablet is unable to accept writes because of it.
VTOrc can now identify and recover from stalled disk errors. VTTablets test whether the disk is writable and they send this information in the full status output to VTOrc. If the disk is not writable on the primary tablet, VTOrc will attempt to recover the cluster by promoting a new primary. This is useful in scenarios where the disk is stalled and the primary vttablet is unable to accept writes because of it.

To opt into this feature, `--enable-stalled-disk-primary-recovery` flag has to be specified on VTOrc, and `--stalled-disk-write-dir` flag has to be specified on the vttablets. `--stalled-disk-write-interval` and `--stalled-disk-write-timeout` flags can be used to configure the polling interval and timeout respectively.
To opt into this feature, `--enable-primary-disk-stalled-recovery` flag has to be specified on VTOrc, and `--disk-write-dir` flag has to be specified on the vttablets. `--disk-write-interval` and `--disk-write-timeout` flags can be used to configure the polling interval and timeout respectively.

## <a id="minor-changes"/>Minor Changes</a>

Expand Down
6 changes: 3 additions & 3 deletions go/flags/endtoend/vtcombo.txt
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,9 @@ Flags:
--ddl_strategy string Set default strategy for DDL statements. Override with @@ddl_strategy session variable (default "direct")
--default_tablet_type topodatapb.TabletType The default tablet type to set for queries, when one is not explicitly selected. (default PRIMARY)
--degraded_threshold duration replication lag after which a replica is considered degraded (default 30s)
--disk-write-dir string if provided, tablet will attempt to write a file to this directory to check if the disk is stalled
--disk-write-interval duration how often to write to the disk to check whether it is stalled (default 5s)
--disk-write-timeout duration if writes exceed this duration, the disk is considered stalled (default 30s)
--emit_stats If set, emit stats to push-based monitoring and stats backends
--enable-consolidator Synonym to -enable_consolidator (default true)
--enable-consolidator-replicas Synonym to -enable_consolidator_replicas
Expand Down Expand Up @@ -328,9 +331,6 @@ Flags:
--srv_topo_cache_refresh duration how frequently to refresh the topology for cached entries (default 1s)
--srv_topo_cache_ttl duration how long to use cached entries for topology (default 1s)
--srv_topo_timeout duration topo server timeout (default 5s)
--stalled-disk-write-dir string if provided, tablet will attempt to write a file to this directory to check if the disk is stalled
--stalled-disk-write-interval duration how often to write to the disk to check whether it is stalled (default 5s)
--stalled-disk-write-timeout duration if writes exceed this duration, the disk is considered stalled (default 30s)
--start_mysql Should vtcombo also start mysql
--stats_backend string The name of the registered push-based monitoring/stats backend to use
--stats_combine_dimensions string List of dimensions to be combined into a single "all" value in exported stats vars
Expand Down
2 changes: 1 addition & 1 deletion go/flags/endtoend/vtorc.txt
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ Flags:
--config-type string Config file type (omit to infer config type from file extension).
--consul_auth_static_file string JSON File to read the topos/tokens from.
--emit_stats If set, emit stats to push-based monitoring and stats backends
--enable-stalled-disk-primary-recovery Whether VTOrc should be analyzing and recovering stalled disk primary failures
--enable-primary-disk-stalled-recovery Whether VTOrc should detect a stalled disk on the primary and failover
--grpc-dial-concurrency-limit int Maximum concurrency of grpc dial operations. This should be less than the golang max thread limit of 10000. (default 1024)
--grpc_auth_static_client_creds string When using grpc_static_auth in the server, this file provides the credentials to use to authenticate with server.
--grpc_compression string Which protocol to use for compressing gRPC. Default: nothing. Supported: snappy
Expand Down
6 changes: 3 additions & 3 deletions go/flags/endtoend/vttablet.txt
Original file line number Diff line number Diff line change
Expand Up @@ -133,6 +133,9 @@ Flags:
--dba_idle_timeout duration Idle timeout for dba connections (default 1m0s)
--dba_pool_size int Size of the connection pool for dba connections (default 20)
--degraded_threshold duration replication lag after which a replica is considered degraded (default 30s)
--disk-write-dir string if provided, tablet will attempt to write a file to this directory to check if the disk is stalled
--disk-write-interval duration how often to write to the disk to check whether it is stalled (default 5s)
--disk-write-timeout duration if writes exceed this duration, the disk is considered stalled (default 30s)
--emit_stats If set, emit stats to push-based monitoring and stats backends
--enable-consolidator Synonym to -enable_consolidator (default true)
--enable-consolidator-replicas Synonym to -enable_consolidator_replicas
Expand Down Expand Up @@ -328,9 +331,6 @@ Flags:
--srv_topo_cache_refresh duration how frequently to refresh the topology for cached entries (default 1s)
--srv_topo_cache_ttl duration how long to use cached entries for topology (default 1s)
--srv_topo_timeout duration topo server timeout (default 5s)
--stalled-disk-write-dir string if provided, tablet will attempt to write a file to this directory to check if the disk is stalled
--stalled-disk-write-interval duration how often to write to the disk to check whether it is stalled (default 5s)
--stalled-disk-write-timeout duration if writes exceed this duration, the disk is considered stalled (default 30s)
--stats_backend string The name of the registered push-based monitoring/stats backend to use
--stats_combine_dimensions string List of dimensions to be combined into a single "all" value in exported stats vars
--stats_common_tags strings Comma-separated list of common tags for the stats backend. It provides both label and values. Example: label1:value1,label2:value2
Expand Down
12 changes: 6 additions & 6 deletions go/vt/vtorc/config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -175,10 +175,10 @@ var (
},
)

enableStalledDiskPrimaryRecovery = viperutil.Configure(
"enable-stalled-disk-primary-recovery",
enablePrimaryDiskStalledRecovery = viperutil.Configure(
"enable-primary-disk-stalled-recovery",
viperutil.Options[bool]{
FlagName: "enable-stalled-disk-primary-recovery",
FlagName: "enable-primary-disk-stalled-recovery",
Default: false,
Dynamic: true,
},
Expand Down Expand Up @@ -206,7 +206,7 @@ func registerFlags(fs *pflag.FlagSet) {
fs.Duration("recovery-poll-duration", recoveryPollDuration.Default(), "Timer duration on which VTOrc polls its database to run a recovery")
fs.Bool("allow-emergency-reparent", ersEnabled.Default(), "Whether VTOrc should be allowed to run emergency reparent operation when it detects a dead primary")
fs.Bool("change-tablets-with-errant-gtid-to-drained", convertTabletsWithErrantGTIDs.Default(), "Whether VTOrc should be changing the type of tablets with errant GTIDs to DRAINED")
fs.Bool("enable-stalled-disk-primary-recovery", enableStalledDiskPrimaryRecovery.Default(), "Whether VTOrc should be analyzing and recovering stalled disk primary failures")
fs.Bool("enable-primary-disk-stalled-recovery", enablePrimaryDiskStalledRecovery.Default(), "Whether VTOrc should detect a stalled disk on the primary and failover")

viperutil.BindFlags(fs,
instancePollTime,
Expand All @@ -224,7 +224,7 @@ func registerFlags(fs *pflag.FlagSet) {
recoveryPollDuration,
ersEnabled,
convertTabletsWithErrantGTIDs,
enableStalledDiskPrimaryRecovery,
enablePrimaryDiskStalledRecovery,
)
}

Expand Down Expand Up @@ -345,7 +345,7 @@ func SetConvertTabletWithErrantGTIDs(val bool) {

// GetStalledDiskPrimaryRecovery reports whether VTOrc is allowed to check for and recovery stalled disk problems.
func GetStalledDiskPrimaryRecovery() bool {
return enableStalledDiskPrimaryRecovery.Get()
return enablePrimaryDiskStalledRecovery.Get()
}

// MarkConfigurationLoaded is called once configuration has first been loaded.
Expand Down
2 changes: 1 addition & 1 deletion go/vt/vtorc/db/generate_base.go
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@ CREATE TABLE database_instance (
semi_sync_primary_status TINYint NOT NULL DEFAULT 0,
semi_sync_replica_status TINYint NOT NULL DEFAULT 0,
semi_sync_primary_clients int NOT NULL DEFAULT 0,
stalled_disk TINYint NOT NULL DEFAULT 0,
is_disk_stalled TINYint NOT NULL DEFAULT 0,
PRIMARY KEY (alias)
)`,
`
Expand Down
4 changes: 2 additions & 2 deletions go/vt/vtorc/inst/analysis.go
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ const (
LockedSemiSyncPrimaryHypothesis AnalysisCode = "LockedSemiSyncPrimaryHypothesis"
LockedSemiSyncPrimary AnalysisCode = "LockedSemiSyncPrimary"
ErrantGTIDDetected AnalysisCode = "ErrantGTIDDetected"
StalledDiskPrimary AnalysisCode = "StalledDiskPrimary"
PrimaryDiskStalled AnalysisCode = "PrimaryDiskStalled"
)

type StructureAnalysisCode string
Expand Down Expand Up @@ -130,7 +130,7 @@ type ReplicationAnalysis struct {
MaxReplicaGTIDMode string
MaxReplicaGTIDErrant string
IsReadOnly bool
IsStalledDisk bool
IsDiskStalled bool
}

func (replicationAnalysis *ReplicationAnalysis) MarshalJSON() ([]byte, error) {
Expand Down
9 changes: 4 additions & 5 deletions go/vt/vtorc/inst/analysis_dao.go
Original file line number Diff line number Diff line change
Expand Up @@ -234,7 +234,7 @@ func GetReplicationAnalysis(keyspace string, shard string, hints *ReplicationAna
DISTINCT case when replica_instance.log_bin
AND replica_instance.log_replica_updates then replica_instance.major_version else NULL end
) AS count_distinct_logging_major_versions,
primary_instance.stalled_disk != 0 AS is_stalled_disk
primary_instance.is_disk_stalled != 0 AS is_disk_stalled
FROM
vitess_tablet
JOIN vitess_keyspace ON (
Expand Down Expand Up @@ -355,7 +355,7 @@ func GetReplicationAnalysis(keyspace string, shard string, hints *ReplicationAna
a.HeartbeatInterval = m.GetFloat64("heartbeat_interval")

a.IsReadOnly = m.GetUint("read_only") == 1
a.IsStalledDisk = m.GetBool("is_stalled_disk")
a.IsDiskStalled = m.GetBool("is_disk_stalled")

if !a.LastCheckValid {
analysisMessage := fmt.Sprintf("analysis: Alias: %+v, Keyspace: %+v, Shard: %+v, IsPrimary: %+v, LastCheckValid: %+v, LastCheckPartialSuccess: %+v, CountReplicas: %+v, CountValidReplicas: %+v, CountValidReplicatingReplicas: %+v, CountLaggingReplicas: %+v, CountDelayedReplicas: %+v",
Expand Down Expand Up @@ -403,11 +403,10 @@ func GetReplicationAnalysis(keyspace string, shard string, hints *ReplicationAna
} else if isInvalid {
a.Analysis = InvalidReplica
a.Description = "VTOrc hasn't been able to reach the replica even once since restart/shutdown"
} else if a.IsClusterPrimary && !a.LastCheckValid && a.IsStalledDisk {
a.Analysis = StalledDiskPrimary
} else if a.IsClusterPrimary && !a.LastCheckValid && a.IsDiskStalled {
a.Analysis = PrimaryDiskStalled
a.Description = "Primary has a stalled disk"
ca.hasClusterwideAction = true
//
} else if a.IsClusterPrimary && !a.LastCheckValid && a.CountReplicas == 0 {
a.Analysis = DeadPrimaryWithoutReplicas
a.Description = "Primary cannot be reached by vtorc and has no replica"
Expand Down
2 changes: 1 addition & 1 deletion go/vt/vtorc/inst/analysis_dao_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -118,7 +118,7 @@ func TestGetReplicationAnalysisDecision(t *testing.T) {
}},
keyspaceWanted: "ks",
shardWanted: "0",
codeWanted: StalledDiskPrimary,
codeWanted: PrimaryDiskStalled,
}, {
name: "DeadPrimary",
info: []*test.InfoForRecoveryAnalysis{{
Expand Down
4 changes: 2 additions & 2 deletions go/vt/vtorc/inst/instance_dao.go
Original file line number Diff line number Diff line change
Expand Up @@ -879,7 +879,7 @@ func mkInsertForInstances(instances []*Instance, instanceWasActuallyFound bool,
"semi_sync_primary_clients",
"semi_sync_replica_status",
"last_discovery_latency",
"stalled_disk",
"is_disk_stalled",
}

values := make([]string, len(columns))
Expand Down Expand Up @@ -1011,7 +1011,7 @@ func UpdateInstanceLastChecked(tabletAlias string, partialSuccess bool, stalledD
SET
last_checked = DATETIME('now'),
last_check_partial_success = ?,
stalled_disk = ?
is_disk_stalled = ?
WHERE
alias = ?
`,
Expand Down
10 changes: 5 additions & 5 deletions go/vt/vtorc/inst/instance_dao_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ func TestMkInsertSingle(t *testing.T) {
version, major_version, version_comment, binlog_server, read_only, binlog_format,
binlog_row_image, log_bin, log_replica_updates, binary_log_file, binary_log_pos, source_host, source_port, replica_net_timeout, heartbeat_interval,
replica_sql_running, replica_io_running, replication_sql_thread_state, replication_io_thread_state, has_replication_filters, supports_oracle_gtid, oracle_gtid, source_uuid, ancestry_uuid, executed_gtid_set, gtid_mode, gtid_purged, gtid_errant,
source_log_file, read_source_log_pos, relay_source_log_file, exec_source_log_pos, relay_log_file, relay_log_pos, last_sql_error, last_io_error, replication_lag_seconds, replica_lag_seconds, sql_delay, data_center, region, physical_environment, replication_depth, is_co_primary, has_replication_credentials, allow_tls, semi_sync_enforced, semi_sync_primary_enabled, semi_sync_primary_timeout, semi_sync_primary_wait_for_replica_count, semi_sync_replica_enabled, semi_sync_primary_status, semi_sync_primary_clients, semi_sync_replica_status, last_discovery_latency, stalled_disk, last_seen)
source_log_file, read_source_log_pos, relay_source_log_file, exec_source_log_pos, relay_log_file, relay_log_pos, last_sql_error, last_io_error, replication_lag_seconds, replica_lag_seconds, sql_delay, data_center, region, physical_environment, replication_depth, is_co_primary, has_replication_credentials, allow_tls, semi_sync_enforced, semi_sync_primary_enabled, semi_sync_primary_timeout, semi_sync_primary_wait_for_replica_count, semi_sync_replica_enabled, semi_sync_primary_status, semi_sync_primary_clients, semi_sync_replica_status, last_discovery_latency, is_disk_stalled, last_seen)
VALUES
(?, ?, ?, DATETIME('now'), DATETIME('now'), 1, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, DATETIME('now'))
`
Expand All @@ -87,7 +87,7 @@ func TestMkInsertThree(t *testing.T) {
version, major_version, version_comment, binlog_server, read_only, binlog_format,
binlog_row_image, log_bin, log_replica_updates, binary_log_file, binary_log_pos, source_host, source_port, replica_net_timeout, heartbeat_interval,
replica_sql_running, replica_io_running, replication_sql_thread_state, replication_io_thread_state, has_replication_filters, supports_oracle_gtid, oracle_gtid, source_uuid, ancestry_uuid, executed_gtid_set, gtid_mode, gtid_purged, gtid_errant,
source_log_file, read_source_log_pos, relay_source_log_file, exec_source_log_pos, relay_log_file, relay_log_pos, last_sql_error, last_io_error, replication_lag_seconds, replica_lag_seconds, sql_delay, data_center, region, physical_environment, replication_depth, is_co_primary, has_replication_credentials, allow_tls, semi_sync_enforced, semi_sync_primary_enabled, semi_sync_primary_timeout, semi_sync_primary_wait_for_replica_count, semi_sync_replica_enabled, semi_sync_primary_status, semi_sync_primary_clients, semi_sync_replica_status, last_discovery_latency, stalled_disk, last_seen)
source_log_file, read_source_log_pos, relay_source_log_file, exec_source_log_pos, relay_log_file, relay_log_pos, last_sql_error, last_io_error, replication_lag_seconds, replica_lag_seconds, sql_delay, data_center, region, physical_environment, replication_depth, is_co_primary, has_replication_credentials, allow_tls, semi_sync_enforced, semi_sync_primary_enabled, semi_sync_primary_timeout, semi_sync_primary_wait_for_replica_count, semi_sync_replica_enabled, semi_sync_primary_status, semi_sync_primary_clients, semi_sync_replica_status, last_discovery_latency, is_disk_stalled, last_seen)
VALUES
(?, ?, ?, DATETIME('now'), DATETIME('now'), 1, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, DATETIME('now')),
(?, ?, ?, DATETIME('now'), DATETIME('now'), 1, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, DATETIME('now')),
Expand Down Expand Up @@ -515,19 +515,19 @@ func TestUpdateInstanceLastChecked(t *testing.T) {
tabletAlias: "zone1-0000000100",
partialSuccess: false,
stalledDisk: false,
conditionToCheck: "last_checked >= DATETIME('now', '-30 second') and last_check_partial_success = false and stalled_disk = false",
conditionToCheck: "last_checked >= DATETIME('now', '-30 second') and last_check_partial_success = false and is_disk_stalled = false",
}, {
name: "Verify partial success",
tabletAlias: "zone1-0000000100",
partialSuccess: true,
stalledDisk: false,
conditionToCheck: "last_checked >= datetime('now', '-30 second') and last_check_partial_success = true and stalled_disk = false",
conditionToCheck: "last_checked >= datetime('now', '-30 second') and last_check_partial_success = true and is_disk_stalled = false",
}, {
name: "Verify stalled disk",
tabletAlias: "zone1-0000000100",
partialSuccess: false,
stalledDisk: true,
conditionToCheck: "last_checked >= DATETIME('now', '-30 second') and last_check_partial_success = false and stalled_disk = true",
conditionToCheck: "last_checked >= DATETIME('now', '-30 second') and last_check_partial_success = false and is_disk_stalled = true",
}, {
name: "Verify no error on unknown tablet",
tabletAlias: "unknown tablet",
Expand Down
Loading

0 comments on commit 5b29b94

Please sign in to comment.