-
Notifications
You must be signed in to change notification settings - Fork 982
ProxySQL Cluster
Note: ProxySQL Cluster features are subject to changes because not all the features in the roadmap have been implemented yet.
ProxySQL is a decentralized proxy, and it is normally advised to deploy it close to the application.
This approach seems to scale pretty well to hundred(s) of nodes, as it was designed to be easily reconfigurable at runtime.
This allows to coordinate and reconfigure a farm of ProxySQL instances using a configuration management software like Ansible/Chef/Puppet/Salt (in alphabetical order), or a service discovery software like Etcd/Consul/ZooKeeper.
This allows ProxySQL to be highly customizable, and be adopted in any setup using any of these technologies, or even re-configurable by home-made tools.
Although it has some drawback too:
- it requires and relies on external software (configuration management software itself)
- the previous point means also that such approach it is not natively supported
- converge time it is not predictable
- there is no protection against network split
For this reason, ProxySQL 1.4.x attempts to support clustering natively. As pointed already, these features are subject to changes as not all the features in the roadmap are implemented yet.
Currently there are two main components in the ProxySQL clustering solution:
- monitoring
- re-configuration
Both components (monitoring and remote reconfiguration) are available for 4 tables:
mysql_query_rules
mysql_servers
mysql_users
proxysql_servers
More will be added in future, see roadmap.
To support Cluster Monitoring, several new tables, commands and variables were introduced.
Several new variables were added related to the Cluster solution. They are all Admin's variables, that means that to load them the command LOAD ADMIN VARIABLES TO RUNTIME
is needed.
Variables that define what to sync:
-
admin-checksum_mysql_query_rules
: boolean variable. Whentrue
(default) ProxySQL generates a new configuration checksum every timeLOAD MYSQL QUERY RULES TO RUNTIME
is executed. If set tofalse
, the new configuration isn't automatically propagated, neither is synced from a remote node; -
admin-checksum_mysql_servers
: boolean variable. Whentrue
(default) ProxySQL generates a new configuration checksum every timeLOAD MYSQL SERVERS TO RUNTIME
is executed. If set tofalse
, the new configuration isn't automatically propagated, neither is synced from a remote node; -
admin-checksum_mysql_users
: boolean variable. Whentrue
(default) ProxySQL generates a new configuration checksum every timeLOAD MYSQL USERS TO RUNTIME
is executed. If set tofalse
, the new configuration isn't automatically propagated, neither is synced from a remote node. If you have millions of users, disable this feature and do not rely on it, as it may be very slow;
Variables that define credentials:
-
admin-cluster_username
andadmin-cluster_password
: to monitor other proxysql instances this credential is used. Note that the pair username/password should also be present inadmin-admin_credentials
, or connection will fail. Ifadmin-cluster_username
is not defined, Clustering doesn't perform any check;
Variables that define checks interval/frequency:
-
admin-cluster_check_interval_ms
: this variable defines the interval between checksums checks.
default: 1000. Min: 10 , Max: 300000 -
admin-cluster_check_status_frequency
: if greater than 0, this variable defines after how many checksums checks a status check is performed.
default: 10. Min: 0 , Max: 10000
After a remote sync, it is normally a good idea to immediately save to disk the new change. In this way, after a restart the configuration will be already in sync. Variables related to sync to disk:
-
admin-cluster_mysql_query_rules_save_to_disk
: boolean variable. Whentrue
(default) after a remote sync and load to runtime, the new mysql query rules are also saved to disk; -
admin-cluster_mysql_servers_save_to_disk
: boolean variable. Whentrue
(default) after a remote sync and load to runtime, the new mysql servers are also saved to disk; -
admin-cluster_mysql_users_save_to_disk
: boolean variable. Whentrue
(default) after a remote sync and load to runtime, the new mysql users are also saved to disk; -
admin-cluster_proxysql_servers_save_to_disk
: boolean variable. Whentrue
(default) after a remote sync and load to runtime, the new proxysql servers are also saved to disk;
It is possible that multiple ProxSQL instances are being reconfigured at the same time, for different reasons.
For example it is possible that each ProxySQL instance is monitoring a MySQL replication topology and automatically detecting a failover, and within a short period of time (probably less than a second) they will all converge to the same configuration without the need of synchronize with each other.
Similarly, it is possible that a temporary network issue or a slow MySQL instance is detected by all proxies that will automatically shun the node. All proxies will take the same action without the need to synchronize with each other.
Or, as a final example, if a slave is lagging and automatically shunned because of replication lag, all proxies will take the same action independently from each other.
For this reason, ProxySQL Cluster can be configured to not synchronize immediately with a remote node, but to wait a certain number of checks before triggering a remote synchronization. If after such threshold the local and the remote configuration is still different, a synchronization is triggered:
-
admin-cluster_mysql_query_rules_diffs_before_sync
: defines how many mismatching checks triggers the synchronization ofmysql_query_rules
default: 3. Min: 0 (never sync). Max: 1000 -
admin-cluster_mysql_servers_diffs_before_sync
: defines how many mismatching checks triggers the synchronization ofmysql_servers
default: 3. Min: 0 (never sync). Max: 1000 -
admin-cluster_mysql_users_diffs_before_sync
: defines how many mismatching checks triggers the synchronization ofmysql_users
default: 3. Min: 0 (never sync). Max: 1000 -
admin-cluster_proxysql_servers_diffs_before_sync
: defines how many mismatching checks triggers the synchronization ofproxysql_servers
default: 3. Min: 0 (never sync). Max: 1000
Table definition:
CREATE TABLE proxysql_servers (
hostname VARCHAR NOT NULL,
port INT NOT NULL DEFAULT 6032,
weight INT CHECK (weight >= 0) NOT NULL DEFAULT 0,
comment VARCHAR NOT NULL DEFAULT '',
PRIMARY KEY (hostname, port) )
This table is a configuration table, and defines a list of ProxySQL peers.
-
hostname
: peer's hostname/IP -
port
: peer's port -
weight
: currently unused, but in the roadmap for future enhancements -
comment
: free form comment field
Entries for proxysql_servers
can be loaded form configuration file.
Below an example of how to configure proxysql_servers
from config file:
proxysql_servers =
(
{
hostname="172.16.0.101"
port=6032
weight=0
comment="proxysql1"
},
{
hostname="172.16.0.102"
port=6032
weight=0
comment="proxysql2"
}
)
Note: ProxySQL reads from config file only if the database file doesn't exist, or if executed with --initial
- configuration file doesn't support this table yet!
- because this feature is still experimental, the table is not automatically loaded from disk
Table definition:
CREATE TABLE runtime_proxysql_servers (
hostname VARCHAR NOT NULL,
port INT NOT NULL DEFAULT 6032,
weight INT CHECK (weight >= 0) NOT NULL DEFAULT 0,
comment VARCHAR NOT NULL DEFAULT '',
PRIMARY KEY (hostname, port) )
Like other runtime_
tables, this is a runtime representation of the base table: proxysql_servers
.
Table definition:
CREATE TABLE runtime_checksums_values (
name VARCHAR NOT NULL,
version INT NOT NULL,
epoch INT NOT NULL,
checksum VARCHAR NOT NULL,
PRIMARY KEY (name))
Table runtime_checksums_values
is the first runtime_
table that is not the runtime representation of a base table. Table runtime_checksums_values
shows information of when a LOAD TO RUNTIME
command was executed:
-
name
: name of the module -
version
: how many timesLOAD TO RUNTIME
was executed, either explicitly or not (executed internal due to some other event) -
epoch
: timestamp of whenLOAD TO RUNTIME
was executed -
checksum
: the checksum of the internal memory structure resulting fromLOAD TO RUNTIME
Example:
Admin> SELECT * FROM runtime_checksums_values;
+-------------------+---------+------------+--------------------+
| name | version | epoch | checksum |
+-------------------+---------+------------+--------------------+
| admin_variables | 0 | 0 | |
| mysql_query_rules | 5 | 1503442167 | 0xD3BD702F8E759B1E |
| mysql_servers | 1 | 1503440533 | 0x6F8CEF0F4BD6456E |
| mysql_users | 1 | 1503440533 | 0xF8BDF26C65A70AC5 |
| mysql_variables | 0 | 0 | |
| proxysql_servers | 2 | 1503442214 | 0x89768E27E4931C87 |
+-------------------+---------+------------+--------------------+
6 rows in set (0,00 sec)
Note:
Only 4 of the 6 modules generate a checksum for now.
-
LOAD MYSQL QUERY RULES TO RUNTIME
: generates a new checksum ifadmin-checksum_mysql_query_rules
is true -
LOAD MYSQL SERVERS TO RUNTIME
: generates a new checksum ifadmin-checksum_mysql_servers
is true -
LOAD MYSQL USERS TO RUNTIME
: generates a new checksum ifadmin-checksum_mysql_users
is true -
LOAD PROXYSQL SERVERS TO RUNTIME
: generates a new checksum, always -
LOAD ADMIN VARIABLES TO RUNTIME
: does NOT generate a checksum yet -
LOAD MYSQL VARIABLES TO RUNTIME
: does NOT generate a checksum yet
-
LOAD PROXYSQL SERVERS FROM MEMORY
/LOAD PROXYSQL SERVERS TO RUNTIME
loads ProxySQL servers from the in-memory database to the runtime data structures -
SAVE PROXYSQL SERVERS TO MEMORY
/SAVE PROXYSQL SERVERS FROM RUNTIME
persists the ProxySQL Servers from the runtime data structures to the in-memory database -
LOAD PROXYSQL SERVERS TO MEMORY
/LOAD PROXYSQL SERVERS FROM DISK
loads ProxySQL Servers from the on-disk database to the in-memory database -
LOAD PROXYSQL SERVERS FROM CONFIG
loads ProxySQL Servers from configuration file to the in-memory database -
SAVE PROXYSQL SERVERS FROM MEMORY
/SAVE PROXYSQL SERVERS TO DISK
persists the ProxySQL Servers from the in-memory database to the on-disk database
3 new tables were added in stats
schema
Table definition:
Admin> SHOW CREATE TABLE stats.stats_proxysql_servers_checksums\G
*************************** 1. row ***************************
table: stats_proxysql_servers_checksums
Create Table: CREATE TABLE stats_proxysql_servers_checksums (
hostname VARCHAR NOT NULL,
port INT NOT NULL DEFAULT 6032,
name VARCHAR NOT NULL,
version INT NOT NULL,
epoch INT NOT NULL,
checksum VARCHAR NOT NULL,
changed_at INT NOT NULL,
updated_at INT NOT NULL,
diff_check INT NOT NULL,
PRIMARY KEY (hostname, port, name) )
1 row in set (0,00 sec)
This table shows the checksum of other proxies, and their status:
-
hostname
: hostname of the peer -
port
: port of the peer -
name
: name of the module as reported in peer'sruntime_checksums_values
-
version
: version of checksum's module as reported in peer'sruntime_checksums_values
.
Note that a ProxySQL instance just started will haveversion=1
: for this reason, a ProxySQL instance will never sync from another instance havingversion=1
, because it is unlikely that a ProxyQL instance just started is the source of truth. This prevent a new joining node to corrupt current Cluster configuration. -
epoch
: epoch of the checksum's module as reported in peer'sruntime_checksums_values
-
checksum
: the checksum's module as reported in peer'sruntime_checksums_values
-
changed_at
: the timestamp of when a checksum change was detected -
updated_at
: the timestamp of when this entry was last refreshed -
diff_check
: a counter that defines for how many checks the checksum of the remote peer's was different than the local checksum. The reconfiguration algorithm will wait a threshold to be reached before triggering a reconfiguration.
This is useful in case the same configuration is applied to multiple proxies at the same time, or when proxies are reconfiguring themselves in case of a failover and they will likely converge without the need of resync.
See also variablescluster_*_diffs_before_sync
Ifdiff_check
increases a lot without triggering a synchronization it means that the remote peer is not a reliable source of truth, for example ifversion=1
.
On the other hand, it the remote peer doesn't sync with the rest of the cluster it means that the cluster doesn't have a reliable source of truth. This happen when all the proxies in a cluster starts with a different configuration, and they can't automatically decide which is the correct configuration. RunningLOAD module TO RUNTIME
on one of the node will automatically "elect" it to become the source of truth for that specific module.
Table definition:
Admin> SHOW CREATE TABLE stats.stats_proxysql_servers_metrics\G
*************************** 1. row ***************************
table: stats_proxysql_servers_metrics
Create Table: CREATE TABLE stats_proxysql_servers_metrics (
hostname VARCHAR NOT NULL,
port INT NOT NULL DEFAULT 6032,
weight INT CHECK (weight >= 0) NOT NULL DEFAULT 0,
comment VARCHAR NOT NULL DEFAULT '',
response_time_ms INT NOT NULL,
Uptime_s INT NOT NULL,
last_check_ms INT NOT NULL,
Queries INT NOT NULL,
Client_Connections_connected INT NOT NULL,
Client_Connections_created INT NOT NULL,
PRIMARY KEY (hostname, port) )
1 row in set (0,00 sec)
This table shows some of the metrics that are retrieved when the clustering module executes SHOW MYSQL STATUS
in its peers. Columns:
-
hostname
: hostname of the peer -
port
: port of the peer -
weight
: same as reported inproxysql_servers
.weight
-
comment
: same as reported inproxysql_servers
.comment
-
response_time_ms
: response time while runningSHOW MYSQL STATUS
, in millisecond -
Uptime_s
: peer's uptime in second -
last_check_ms
: age of the last time a check was executed, in millisecond -
Queries
: number of queries executed by the peer -
Client_Connections_connected
: number of client's connections connected -
Client_Connections_created
: number of client's connections created
Note:
All the status variables are retrieved by the peers, but only few are monitored to be able to check if the peer is up and running and processing traffic.
Currently this feature is useful only for debugging purpose, but future versions will use these metrics to understand the health of remote peers.
Currently unused
In the architecture described above, all nodes monitor all the other nodes. A fully mesh peer-to-peer network.
To reduce network usage, nodes do not always exchange the whole list of checksum: instead the exchange a single checksum resulting from combining all the versions and all the checksums. It this global checksum changed, a detailed list of checksums is retrieved.
Using this technique, a 200 nodes cluster monitoring each other every 1000ms, requires a bandwidth of 50KBpb in/out to/from each node.
Becauses proxies monitor each others, they can immediately know when a checksum of a confguration changed, that means that the configuration itself changes. If a configuration changed, it is checked against its own configuration because it is possible that the remote peer's configuration and its own configuration have changed at the same time, or within a short period of time. If they differ:
- if the own
version
is 1 , find the peer withversion > 1
and with the highest epoch, and sync immediately - if the own
version
is greater than 1, starts counting for how many checks they differ- when the number of checks in which they differ is greater than
cluster__name___diffs_before_sync
andcluster__name__diffs_before_sync
itself is greater than 0, find the peer withversion > 1
and with the highest epoch, and sync immediately (Note: it is possible that a different is detected against a node, but the sync is performed against a different node. Because the sync is done with the node with the highest epoch, it is expected that all the nodes will converge)
- when the number of checks in which they differ is greater than
The syncing process is performed as follow:
- the same connection used to perform the health check is used to execute a series of
SELECT
statements in the form ofSELECT _list_of_columns_ FROM runtime_module
. For example:
SELECT hostgroup_id, hostname, port, status, weight, compression, max_connections, max_replication_lag, use_ssl, max_latency_ms, comment FROM runtime_mysql_servers;
SELECT writer_hostgroup, reader_hostgroup, comment FROM runtime_mysql_replication_hostgroups;
- delete the local configuration tables. For example:
DELETE FROM mysql_servers;
DELETE FROM mysql_replication_hostgroups;
- insert into the local configuration tables what retrieved from remote peer
- issue an internal
LOAD module_name TO RUNTIME
: this will increase version number and create a new checksum - if
cluster__name__save_to_disk
istrue
, issue an internalSAVE module_name TO DISK
- add support for MySQL Group Replication
- add support for Scheduler
This is an overview of the features related to clustering, and not a complete list. None the following list is impletemented yet.
Implementation may be different than what listed right now:
- support for master election: the word master was intentionally chosen instead of leader
- only master proxy is writable/configurable
- implementation of MySQL-like replication from master to slaves, allowing to push configuration in real-time instead pulling it
- implementation of MySQL-like replication from master to candidate-masters
- implementation of MySQL-like replication from candidate-masters to slaves
- creation of a quorum with only candidate-masters: normal slaves are not part of the quorum
What if a different configuration is loaded at the same time on each of the proxysql servers, which configuration is the one that needs to be "propagated" to all other nodes ? The last one?
The concept of master and master election is not implemented yet. That means that a LOAD
command can be potentially be executed on multiple nodes at the same time (multi-master, to make some analogy), and each will trigger an automatic reconfiguration with timestamp based conflict-resolution.
If the same configuration is loaded at the same time on multiple proxysql instances, they should automatically converge.
If different configurations are loaded on multiple proxysql instances at different times, the last one will win.
If different configurations are loaded on multiple proxysql instances at the same time, the two configurations will start propagating till the point in which they won't converge as conflict resolution is not possible.
The good thing is that each ProxySQL knows the checksum of configuration of every other node, so mismatches are easy to detect and monitor.
Currently a pull mechanism is used, therefore the node that detects it needs to reconfigure itself will pull the configuration from the node with the most up-to-date configuration and apply it locally.
Implementation of election is in the roadmap, but probably not Raft consensus protocol.
ProxySQL uses tables to store configuration, it uses the MySQL protocol to perform requests to its peers querying their health and their configuration, it uses the MySQL protocol to implement heartbeat and much more: for these reasons, in the case of ProxySQL, the MySQL protocol itself might be a more versatile solution compared to Raft protocol.
What will happen if for some reason one of the nodes will be unable to grab the new configuration in an event of re-configuration?
Changes are propagated asynchronously. Therefore it is possible that one of nodes is not able to grab the new configuration, for example in case of network issue of ProxySQL being restarted. Yet, when a ProxySQL instance detects that one of its peers has a newer configuration, it will automatically grab it.
Cluster do not have boundaries, therefore it is possible to have a single cluster across multiple DCs, or to have multiple clusters in the same DC, or multiple clusters across multiple DCs. This really depends from the specific use case.
The only limitation is that each proxysql instance needs to belong to a single cluster.
Clusters do not have names, and to make sure that a node doesn't erroneously join the wrong cluster it is important to ensure that each cluster uses different credentials. See admin-admin_credentials
, admin-cluster_username
and admin-cluster_password
.
Could be a nice feature to somehow replicate the configuration crossdc but prefer traffic to the backend server that is closest to the local proxysql server. I am doing it now using weight.
For this specific case I think it makes more sense to create a different cluster for each DC, as configuration will be different.
Bootstrap is very easy: starts with at least 1 peer in proxysql_servers
.
They do not know it automatically, and this is intentional to prevent that a new node may corrupt the cluster.
In other words, a new node can pull the configuration as soon as it joins, but cannot advertise itself as the source of truth.
To let the other proxysql instances know that there is a new node, it is enough to add the new node in proxysql_servers
of any node of the current cluster and issue LOAD PROXYSQL SERVERS TO RUNTIME
.