Skip to content

Latest commit

 

History

History
686 lines (453 loc) · 46.5 KB

tidb-resource-control.md

File metadata and controls

686 lines (453 loc) · 46.5 KB
title summary
Use Resource Control to Achieve Resource Isolation
Learn how to use the resource control feature to control and schedule application resources.

Use Resource Control to Achieve Resource Isolation

Note:

This feature is not available on TiDB Cloud Serverless clusters.

As a cluster administrator, you can use the resource control feature to create resource groups, set quotas for resource groups, and bind users to those groups.

The TiDB resource control feature provides two layers of resource management capabilities: the flow control capability at the TiDB layer and the priority scheduling capability at the TiKV layer. The two capabilities can be enabled separately or simultaneously. See the Parameters for resource control for details. This allows the TiDB layer to control the flow of user read and write requests based on the quotas set for the resource groups, and allows the TiKV layer to schedule the requests based on the priority mapped to the read and write quota. By doing this, you can ensure resource isolation for your applications and meet quality of service (QoS) requirements.

  • TiDB flow control: TiDB flow control uses the token bucket algorithm. If there are not enough tokens in a bucket, and the resource group does not specify the BURSTABLE option, the requests to the resource group will wait for the token bucket to backfill the tokens and retry. The retry might fail due to timeout.

  • TiKV scheduling: You can set the absolute priority (PRIORITY) as needed. Different resources are scheduled according to the PRIORITY setting. Tasks with high PRIORITY are scheduled first. If you do not set the absolute priority, TiKV uses the value of RU_PER_SEC of each resource group to determine the priority of the read and write requests for each resource group. Based on the priorities, the storage layer uses the priority queue to schedule and process requests.

Starting from v7.4.0, the resource control feature supports controlling TiFlash resources. Its principle is similar to that of TiDB flow control and TiKV scheduling:

  • TiFlash flow control: With the TiFlash pipeline execution model, TiFlash can more accurately obtain the CPU consumption of different queries and convert it into Request Units (RU) for deduction. Traffic control is implemented using a token bucket algorithm.
  • TiFlash scheduling: When system resources are insufficient, TiFlash schedules pipeline tasks among multiple resource groups based on their priorities. The specific logic is: First, TiFlash assesses the PRIORITY of the resource group, then considers the CPU usage and RU_PER_SEC. As a result, if rg1 and rg2 have the same PRIORITY but the RU_PER_SEC of rg2 is twice that of rg1, the CPU usage of rg2 is twice that of rg1.
  • TiFlash flow control: With the TiFlash pipeline execution model, TiFlash can more accurately obtain the CPU consumption of different queries and convert it into Request Units (RU) for deduction. Traffic control is implemented using a token bucket algorithm.
  • TiFlash scheduling: When system resources are insufficient, TiFlash schedules pipeline tasks among multiple resource groups based on their priorities. The specific logic is: First, TiFlash assesses the PRIORITY of the resource group, then considers the CPU usage and RU_PER_SEC. As a result, if rg1 and rg2 have the same PRIORITY but the RU_PER_SEC of rg2 is twice that of rg1, the CPU usage of rg2 is twice that of rg1.

Scenarios for resource control

The introduction of the resource control feature is a milestone for TiDB. It can divide a distributed database cluster into multiple logical units. Even if an individual unit overuses resources, it does not crowd out the resources needed by other units.

With this feature, you can:

  • Combine multiple small and medium-sized applications from different systems into a single TiDB cluster. When the workload of an application grows larger, it does not affect the normal operation of other applications. When the system workload is low, busy applications can still be allocated the required system resources even if they exceed the set quotas, so as to achieve the maximum utilization of resources.
  • Choose to combine all test environments into a single TiDB cluster, or group the batch tasks that consume more resources into a single resource group. It can improve hardware utilization and reduce operating costs while ensuring that critical applications can always get the necessary resources.
  • When there are mixed workloads in a system, you can put different workloads into separate resource groups. By using the resource control feature, you can ensure that the response time of transactional applications is not affected by data analysis or batch applications.
  • When the cluster encounters an unexpected SQL performance issue, you can use SQL bindings along with resource groups to temporarily limit the resource consumption of a SQL statement.

In addition, the rational use of the resource control feature can reduce the number of clusters, ease the difficulty of operation and maintenance, and save management costs.

Note:

  • To assess the effectiveness of resource management, it is recommended to deploy the cluster on independent computing and storage nodes. Scheduling and other cluster resource-sensitive features are hardly working properly on the deployment created by tiup playground, where the resources are shared across instances.

Limitations

Resource control incurs additional scheduling overhead. Therefore, there might be a slight performance degradation (less than 5%) when this feature is enabled.

What is Request Unit (RU)

Request Unit (RU) is a unified abstraction unit in TiDB for system resources, which currently includes CPU, IOPS, and IO bandwidth metrics. It is used to indicate the amount of resources consumed by a single request to the database. The number of RUs consumed by a request depends on a variety of factors, such as the type of operations, and the amount of data being queried or modified. Currently, the RU contains consumption statistics for the resources in the following table:

Resource type RU consumption
Read 2 storage read batches consume 1 RU
8 storage read requests consume 1 RU
64 KiB read request payload consumes 1 RU
Write 1 storage write batch consumes 1 RU
1 storage write request consumes 1 RU
1 KiB write request payload consumes 1 RU
CPU 3 ms consumes 1 RU

Note:

  • Each write operation is eventually replicated to all replicas (by default, TiKV has 3 replicas). Each replication operation is considered a different write operation.
  • The preceding table lists only the resources involved in RU calculation for TiDB Self-Managed clusters, excluding the network and storage consumption. For TiDB Cloud Serverless RUs, see TiDB Cloud Serverless Pricing Details.
  • Currently, TiFlash resource control only considers SQL CPU, which is the CPU time consumed by the execution of pipeline tasks for queries, and read request payload.

Parameters for resource control

The resource control feature introduces the following system variables or parameters:

  • TiKV: For TiDB Self-Managed, you can use the resource-control.enabled parameter to control whether to use request scheduling based on resource group quotas. For TiDB Cloud, the value of the resource-control.enabled parameter is true by default and does not support dynamic modification.
  • TiFlash: For TiDB Self-Managed, you can use the tidb_enable_resource_control system variable and the enable_resource_control configuration item (introduced in v7.4.0) to control whether to enable TiFlash resource control.

Starting from TiDB v7.0.0, tidb_enable_resource_control and resource-control.enabled are enabled by default. The results of the combinations of these two parameters are shown in the following table.

resource-control.enabled tidb_enable_resource_control= ON tidb_enable_resource_control= OFF
resource-control.enabled= true Flow control and scheduling (recommended) Invalid combination
resource-control.enabled= false Only flow control (not recommended) The feature is disabled.

Starting from v7.4.0, the TiFlash configuration item enable_resource_control is enabled by default. It works together with tidb_enable_resource_control to control the TiFlash resource control feature. TiFlash resource control only performs flow control and priority scheduling when both enable_resource_control and tidb_enable_resource_control are enabled. Additionally, when enable_resource_control is enabled, TiFlash uses the Pipeline execution model.

Starting from v7.4.0, the TiFlash configuration item enable_resource_control is enabled by default. It works together with tidb_enable_resource_control to control the TiFlash resource control feature. TiFlash resource control only performs flow control and priority scheduling when both enable_resource_control and tidb_enable_resource_control are enabled. Additionally, when enable_resource_control is enabled, TiFlash uses the Pipeline execution model.

For more information about the resource control mechanism and parameters, see RFC: Global Resource Control in TiDB and TiFlash Resource Control.

How to use resource control

This section describes how to use the resource control feature to manage resource groups and control the resource allocation of each resource group.

Estimate cluster capacity

Before resource planning, you need to know the overall capacity of the cluster. TiDB provides the statement CALIBRATE RESOURCE to estimate the cluster capacity. You can use one of the following methods:

You can view the Resource Manager page in TiDB Dashboard. For more information, see CALIBRATE RESOURCE.

For TiDB Self-Managed, you can use the CALIBRATE RESOURCE statement to estimate the cluster capacity.

For TiDB Cloud, the CALIBRATE RESOURCE statement is inapplicable.

Manage resource groups

To create, modify, or delete a resource group, you need to have the SUPER or RESOURCE_GROUP_ADMIN privilege.

You can create a resource group for a cluster by using CREATE RESOURCE GROUP.

For an existing resource group, you can modify the RU_PER_SEC option (the rate of RU backfilling per second) of the resource group by using ALTER RESOURCE GROUP. The changes to the resource group take effect immediately.

You can delete a resource group by using DROP RESOURCE GROUP.

Create a resource group

The following is an example of how to create a resource group.

  1. Create a resource group rg1. The resource limit is 500 RUs per second and allows applications in this resource group to overrun resources.

    CREATE RESOURCE GROUP IF NOT EXISTS rg1 RU_PER_SEC = 500 BURSTABLE;
  2. Create a resource group rg2. The RU backfill rate is 600 RUs per second and does not allow applications in this resource group to overrun resources.

    CREATE RESOURCE GROUP IF NOT EXISTS rg2 RU_PER_SEC = 600;
  3. Create a resource group rg3 with the absolute priority set to HIGH. The absolute priority currently supports LOW|MEDIUM|HIGH. The default value is MEDIUM.

    CREATE RESOURCE GROUP IF NOT EXISTS rg3 RU_PER_SEC = 100 PRIORITY = HIGH;

Bind resource groups

TiDB supports three levels of resource group settings as follows.

  • User level. Bind a user to a specific resource group via the CREATE USER or ALTER USER statement. After a user is bound to a resource group, sessions created by the user are automatically bound to the corresponding resource group.
  • Session level. Set the resource group for the current session via SET RESOURCE GROUP.
  • Statement level. Set the resource group for the current statement via RESOURCE_GROUP() Optimizer Hint.

Bind users to a resource group

The following example creates a user usr1 and binds the user to the resource group rg1. rg1 is the resource group created in the example in Create Resource Group.

CREATE USER 'usr1'@'%' IDENTIFIED BY '123' RESOURCE GROUP rg1;

The following example uses ALTER USER to bind the user usr2 to the resource group rg2. rg2 is the resource group created in the example in Create Resource Group.

ALTER USER usr2 RESOURCE GROUP rg2;

After you bind users, the resource consumption of newly created sessions will be controlled by the specified quota (Request Unit, RU). If the system workload is relatively high and there is no spare capacity, the resource consumption rate of usr2 will be strictly controlled not to exceed the quota. Because usr1 is bound by rg1 with BURSTABLE configured, the consumption rate of usr1 is allowed to exceed the quota.

If there are too many requests that result in insufficient resources for the resource group, the client's requests will wait. If the wait time is too long, the requests will report an error.

Note:

  • When you bind a user to a resource group by using CREATE USER or ALTER USER, it will not take effect for the user's existing sessions, but only for the user's new sessions.
  • TiDB automatically creates a default resource group during cluster initialization. For this resource group, the default value of RU_PER_SEC is UNLIMITED (equivalent to the maximum value of the INT type, that is, 2147483647) and it is in BURSTABLE mode. Statements that are not bound to a resource group are automatically bound to this resource group. This resource group does not support deletion, but you can modify the configuration of its RU.

To unbind users from a resource group, you can simply bind them to the default group again as follows:

ALTER USER 'usr3'@'%' RESOURCE GROUP `default`;

For more details, see ALTER USER ... RESOURCE GROUP.

Bind the current session to a resource group

You can use the SET RESOURCE GROUP statement to change the bound resource group of the current session. By binding a session to a resource group, the resource usage of the corresponding session is limited by the specified usage (RU).

When the system variable tidb_resource_control_strict_mode is set to ON, you need to have the SUPER or RESOURCE_GROUP_ADMIN or RESOURCE_GROUP_USER privilege to execute this statement.

The following example binds the current session to the resource group rg1.

SET RESOURCE GROUP rg1;

Bind the current statement to a resource group

By adding the RESOURCE_GROUP(resource_group_name) hint to a SQL statement, you can specify the resource group to which the statement is bound. This hint supports SELECT, INSERT, UPDATE, and DELETE statements.

When the system variable tidb_resource_control_strict_mode is set to ON, you need to have the SUPER or RESOURCE_GROUP_ADMIN or RESOURCE_GROUP_USER privilege to use this hint.

The following example binds the current statement to the resource group rg1.

SELECT /*+ RESOURCE_GROUP(rg1) */ * FROM t limit 10;

Manage queries that consume more resources than expected (Runaway Queries)

A runaway query is a query (SELECT statement only) that consumes more time or resources than expected. The term runaway queries is used in the following to describe the feature of managing the runaway query.

  • Starting from v7.2.0, the resource control feature introduces the management of runaway queries. You can set criteria for a resource group to identify runaway queries and automatically take actions to prevent them from exhausting resources and affecting other queries. You can manage runaway queries for a resource group by including the QUERY_LIMIT field in CREATE RESOURCE GROUP or ALTER RESOURCE GROUP.
  • Starting from v7.3.0, the resource control feature introduces manual management of runaway watches, enabling quick identification of runaway queries for a given SQL statement or Digest. You can execute the statement QUERY WATCH to manually manage the runaway queries watch list in the resource group.

QUERY_LIMIT parameters

If a query exceeds any of the following limits, it is identified as a runaway query:

  • EXEC_ELAPSED: checks whether the query execution time exceeds the limit.
  • PROCESSED_KEYS: checks whether the number of keys processed by the Coprocessor exceeds the limit.
  • RU: checks whether the total number of read and write RUs consumed by the statement exceeds the limit.

Supported operations (ACTION):

  • DRYRUN: no action is taken. The records are appended for the runaway queries. This is mainly used to observe whether the condition setting is reasonable.
  • COOLDOWN: the execution priority of the query is lowered to the lowest level. The query continues to execute with the lowest priority and does not occupy resources of other operations.
  • KILL: the identified query is automatically terminated and reports an error Query execution was interrupted, identified as runaway query.
  • SWITCH_GROUP: introduced in v8.4.0, this parameter switches the identified query to the specified resource group for continued execution. After this query completes, subsequent SQL statements are executed in the original resource group. If the specified resource group does not exist, the query remains in the original resource group.

To avoid too many concurrent runaway queries that exhaust system resources, the resource control feature introduces a quick identification mechanism, which can quickly identify and isolate runaway queries. You can use this feature through the WATCH clause. When a query is identified as a runaway query, this mechanism extracts the matching feature (defined by the parameter after WATCH) of the query. In the next period of time (defined by DURATION), the matching feature of the runaway query is added to the watch list, and the TiDB instance matches queries with the watch list. The matching queries are directly marked as runaway queries and isolated according to the corresponding action, instead of waiting for them to be identified by conditions. The KILL operation terminates the query and reports an error Quarantined and interrupted because of being in runaway watch list.

There are three methods for WATCH to match for quick identification:

  • EXACT indicates that only SQL statements with exactly the same SQL text are quickly identified.
  • SIMILAR indicates all SQL statements with the same pattern are matched by Plan Digest, and the literal values are ignored.
  • PLAN indicates all SQL statements with the same pattern are matched by Plan Digest.

The DURATION option in WATCH indicates the duration of the identification item, which is infinite by default.

After a watch item is added, neither the matching feature nor the ACTION is changed or deleted whenever the QUERY_LIMIT configuration is changed or deleted. You can use QUERY WATCH REMOVE to remove a watch item.

The parameters of QUERY_LIMIT are as follows:

Parameter Description Note
EXEC_ELAPSED When the query execution time exceeds this value, it is identified as a runaway query EXEC_ELAPSED =60s means the query is identified as a runaway query if it takes more than 60 seconds to execute.
PROCESSED_KEYS When the number of keys processed by the Coprocessor exceeds this value, the query is identified as a runaway query PROCESSED_KEYS = 1000 means the query is identified as a runaway query if the number of keys processed by the Coprocessor exceeds 1000.
RU When the total number of read and write RUs consumed by the query exceeds this value, this query is identified as a runaway query RU = 1000 means the query is identified as a runaway query if the total number of read and write RUs consumed by the query exceeds 1000.
ACTION Action taken when a runaway query is identified The optional values are DRYRUN, COOLDOWN, KILL, and SWITCH_GROUP.
WATCH Quickly match the identified runaway query. If the same or similar query is encountered again within a certain period of time, the corresponding action is performed immediately. Optional. For example, WATCH=SIMILAR DURATION '60s', WATCH=EXACT DURATION '1m', and WATCH=PLAN.

Note:

If you want to strictly limit runaway queries to a specific resource group, it is recommended to use SWITCH_GROUP together with the QUERY WATCH statement. Because QUERY_LIMIT only triggers the corresponding ACTION operation when the query meets the criteria, SWITCH_GROUP might not be able to switch the query to the target resource group in a timely manner in such scenarios.

Examples

  1. Create a resource group rg1 with a quota of 500 RUs per second, and define a runaway query as one that exceeds 60 seconds, and lower the priority of the runaway query.

    CREATE RESOURCE GROUP IF NOT EXISTS rg1 RU_PER_SEC = 500 QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=COOLDOWN);
  2. Change the rg1 resource group to terminate the runaway queries, and mark the queries with the same pattern as runaway queries immediately in the next 10 minutes.

    ALTER RESOURCE GROUP rg1 QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=KILL, WATCH=SIMILAR DURATION='10m');
  3. Change the rg1 resource group to cancel the runaway query check.

    ALTER RESOURCE GROUP rg1 QUERY_LIMIT=NULL;

QUERY WATCH parameters

For more information about the synopsis of QUERY WATCH, see QUERY WATCH.

The parameters are as follows:

  • The RESOURCE GROUP specifies a resource group. The matching features of runaway queries added by this statement are added to the watch list of the resource group. This parameter can be omitted. If omitted, it applies to the default resource group.

  • The meaning of ACTION is the same as QUERY LIMIT. This parameter can be omitted. If omitted, the corresponding action after identification adopts the ACTION configured by QUERY LIMIT in the resource group, and the action does not change with the QUERY LIMIT configuration. If there is no ACTION configured in the resource group, an error is reported.

  • The QueryWatchTextOption parameter has three options: SQL DIGEST, PLAN DIGEST, and SQL TEXT.

    • SQL DIGEST is the same as that of SIMILAR. The following parameters accept strings, user-defined variables, or other expressions that yield string result. The string length must be 64, which is the same as the Digest definition in TiDB.
    • PLAN DIGEST is the same as PLAN. The following parameter is a Digest string.
    • SQL TEXT matches the input SQL as a raw string (EXACT), or parses and compiles it into SQL DIGEST (SIMILAR) or PLAN DIGEST (PLAN), depending on the following parameter.
  • Add a matching feature to the runaway query watch list for the default resource group (you need to set QUERY LIMIT for the default resource group in advance).

    QUERY WATCH ADD ACTION KILL SQL TEXT EXACT TO 'select * from test.t2';
  • Add a matching feature to the runaway query watch list for the rg1 resource group by parsing the SQL into SQL Digest. When ACTION is not specified, the ACTION option already configured for the rg1 resource group is used.

    QUERY WATCH ADD RESOURCE GROUP rg1 SQL TEXT SIMILAR TO 'select * from test.t2';
  • Add a matching feature to the runaway query watch list for the rg1 resource group by parsing the SQL into SQL Digest, and specify ACTION as SWITCH_GROUP(rg2).

    QUERY WATCH ADD RESOURCE GROUP rg1 ACTION SWITCH_GROUP(rg2) SQL TEXT SIMILAR TO 'select * from test.t2';
  • Add a matching feature to the runaway query watch list for the rg1 resource group using PLAN DIGEST, and specify ACTION as KILL.

    QUERY WATCH ADD RESOURCE GROUP rg1 ACTION KILL PLAN DIGEST 'd08bc323a934c39dc41948b0a073725be3398479b6fa4f6dd1db2a9b115f7f57';
  • Get the watch item ID by querying INFORMATION_SCHEMA.RUNAWAY_WATCHES and delete the watch item.

    SELECT * from information_schema.runaway_watches ORDER BY id\G
    *************************** 1. row ***************************
                     ID: 1
    RESOURCE_GROUP_NAME: default
             START_TIME: 2024-09-09 03:35:31
               END_TIME: 2024-09-09 03:45:31
                  WATCH: Exact
            WATCH_TEXT: SELECT variable_name, variable_value FROM mysql.global_variables
                 SOURCE: 127.0.0.1:4000
                ACTION: Kill
                RULE: ProcessedKeys = 666(10)
    1 row in set (0.00 sec)
    QUERY WATCH REMOVE 1;

Observability

You can get more information about runaway queries from the following system tables and INFORMATION_SCHEMA:

  • The mysql.tidb_runaway_queries table contains the history records of all runaway queries identified in the past 7 days. Take one of the rows as an example:

    MySQL [(none)]> SELECT * FROM mysql.tidb_runaway_queries LIMIT 1\G
    *************************** 1. row ***************************
    resource_group_name: default
         start_time: 2024-09-09 17:43:42
            repeats: 2
         match_type: watch
             action: kill
         sample_sql: select sleep(2) from t
         sql_digest: 4adbc838b86c573265d4b39a3979d0a362b5f0336c91c26930c83ab187701a55
        plan_digest: 5d094f78efbce44b2923733b74e1d09233cb446318293492901c5e5d92e27dbc
        tidb_server: 127.0.0.1:4000

    Field description:

    • start_time indicates the time when the runaway query is identified.
    • repeats indicates the number of times the runaway query has been identified since start_time.
    • match_type indicates how the runaway query is identified. The value can be one of the following:
      • identify means that it matches the condition of the runaway query.
      • watch means that it matches the quick identification rule in the watch list.
  • The information_schema.runaway_watches table contains records of quick identification rules for runaway queries. For more information, see RUNAWAY_WATCHES.

Manage background tasks

Warning:

This feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an issue on GitHub.

The background task management in resource control is based on TiKV's dynamic adjustment of resource quotas for CPU/IO utilization. Therefore, it relies on the available resource quota of each instance. If multiple components or instances are deployed on a single server, it is mandatory to set the appropriate resource quota for each instance through cgroup. It is difficult to achieve the expected effect in deployment with shared resources like TiUP Playground.

Background tasks, such as data backup and automatic statistics collection, are low-priority but consume many resources. These tasks are usually triggered periodically or irregularly. During execution, they consume a lot of resources, thus affecting the performance of online high-priority tasks.

Starting from v7.4.0, the TiDB resource control feature supports managing background tasks. When a task is marked as a background task, TiKV dynamically limits the resources used by this type of task to avoid the impact on the performance of other foreground tasks. TiKV monitors the CPU and IO resources consumed by all foreground tasks in real time, and calculates the resource threshold that can be used by background tasks based on the total resource limit of the instance. All background tasks are restricted by this threshold during execution.

BACKGROUND parameters

  • TASK_TYPES: specifies the task types that need to be managed as background tasks. Use commas (,) to separate multiple task types.
  • UTILIZATION_LIMIT: limits the maximum percentage (0-100) of resources that background tasks can consume on each TiKV node. By default, TiKV calculates the available resources for background tasks based on the total resources of the node and the resources currently occupied by the foreground tasks. If UTILIZATION_LIMIT is configured, the resource allocated to background tasks will not exceed this limit.

TiDB supports the following types of background tasks:

  • lightning: perform import tasks using TiDB Lightning. Both physical and logical import modes of TiDB Lightning are supported.
  • br: perform backup and restore tasks using BR. PITR is not supported.
  • ddl: control the resource usage during the batch data write back phase of Reorg DDLs.
  • stats: the collect statistics tasks that are manually executed or automatically triggered by TiDB.
  • background: a reserved task type. You can use the tidb_request_source_type system variable to specify the task type of the current session as background.
  • lightning: perform import tasks using TiDB Lightning. Both physical and logical import modes of TiDB Lightning are supported.
  • br: perform backup and restore tasks using BR. PITR is not supported.
  • ddl: control the resource usage during the batch data write back phase of Reorg DDLs.
  • stats: the collect statistics tasks that are manually executed or automatically triggered by TiDB.
  • background: a reserved task type. You can use the tidb_request_source_type system variable to specify the task type of the current session as background.

By default, the task types that are marked as background tasks are "", and the management of background tasks is disabled. To enable background task management, you need to manually modify the background task type of the default resource group. After a background task is identified and matched, Resource Control is automatically performed. This means that when system resources are insufficient, the background tasks are automatically reduced to the lowest priority to ensure the execution of foreground tasks.

Note:

Currently, background tasks for all resource groups are bound to the default resource group. You can manage background task types globally through default. Binding background tasks to other resource groups is currently not supported.

Examples

  1. Modify the default resource group by marking br and ddl as background tasks and setting the resource limit of background tasks to 30%.

    ALTER RESOURCE GROUP `default` BACKGROUND=(TASK_TYPES='br,ddl', UTILIZATION_LIMIT=30);
  2. Change the default resource group to revert the background task type to its default value.

    ALTER RESOURCE GROUP `default` BACKGROUND=NULL;
  3. Change the default resource group to set the background task type to empty. In this case, all tasks of this resource group are not treated as background tasks.

    ALTER RESOURCE GROUP `default` BACKGROUND=(TASK_TYPES="");
  4. View the background task type of the default resource group.

    SELECT * FROM information_schema.resource_groups WHERE NAME="default";

    The output is as follows:

    +---------+------------+----------+-----------+-------------+-------------------------------------------+
    | NAME    | RU_PER_SEC | PRIORITY | BURSTABLE | QUERY_LIMIT | BACKGROUND                                |
    +---------+------------+----------+-----------+-------------+-------------------------------------------+
    | default | UNLIMITED  | MEDIUM   | YES       | NULL        | TASK_TYPES='br,ddl', UTILIZATION_LIMIT=30 |
    +---------+------------+----------+-----------+-------------+-------------------------------------------+
    
  5. To explicitly mark tasks in the current session as the background type, you can use tidb_request_source_type to explicitly specify the task type. The following is an example:

    SET @@tidb_request_source_type="background";
    /* Add background task type */
    ALTER RESOURCE GROUP `default` BACKGROUND=(TASK_TYPES="background");
    /* Execute LOAD DATA in the current session */
    LOAD DATA INFILE "s3://resource-control/Lightning/test.customer.aaaa.csv"

Disable resource control

  1. Execute the following statement to disable the resource control feature.

    SET GLOBAL tidb_enable_resource_control = 'OFF';
  2. Set the TiKV parameter resource-control.enabled to false to disable scheduling based on the RU of the resource group.

  3. Set the TiFlash configuration item enable_resource_control to false to disable TiFlash resource control.

  1. Execute the following statement to disable the resource control feature.

    SET GLOBAL tidb_enable_resource_control = 'OFF';
  2. For TiDB Self-Managed, you can use the resource-control.enabled parameter to control whether to use request scheduling based on resource group quotas. For TiDB Cloud, the value of the resource-control.enabled parameter is true by default and does not support dynamic modification. If you need to disable it for TiDB Cloud Dedicated clusters, contact TiDB Cloud Support.

  3. For TiDB Self-Managed, you can use the enable_resource_control configuration item to control whether to enable TiFlash resource control. For TiDB Cloud, the value of the enable_resource_control parameter is true by default and does not support dynamic modification. If you need to disable it for TiDB Cloud Dedicated clusters, contact TiDB Cloud Support.

View RU consumption

You can view information about RU consumption.

View the RU consumption by SQL

You can view the RU consumption of SQL statements in the following ways:

  • The system variable tidb_last_query_info
  • EXPLAIN ANALYZE
  • Slow queries and corresponding system table
  • statements_summary

View the RUs consumed by the last SQL execution by querying the system variable tidb_last_query_info

TiDB provides the system variable tidb_last_query_info. This system variable records the information of the last DML statement executed, including the RUs consumed by the SQL execution.

Example:

  1. Run the UPDATE statement:

    UPDATE sbtest.sbtest1 SET k = k + 1 WHERE id = 1;
    Query OK, 1 row affected (0.01 sec)
    Rows matched: 1  Changed: 1  Warnings: 0
    
  2. Query the system variable tidb_last_query_info to view the information of the last executed statement:

    SELECT @@tidb_last_query_info;
    +------------------------------------------------------------------------------------------------------------------------+
    | @@tidb_last_query_info                                                                                                 |
    +------------------------------------------------------------------------------------------------------------------------+
    | {"txn_scope":"global","start_ts":446809472210829315,"for_update_ts":446809472210829315,"ru_consumption":4.34885578125} |
    +------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.01 sec)
    

    In the result, ru_consumption is the RUs consumed by the execution of this SQL statement.

View RUs consumed during SQL execution by EXPLAIN ANALYZE

You can use the EXPLAIN ANALYZE statement to get the amount of RUs consumed during SQL execution. Note that the amount of RUs is affected by the cache (for example, coprocessor cache). When the same SQL is executed multiple times, the amount of RUs consumed by each execution might be different. The RU value does not represent the exact value for each execution, but can be used as a reference for estimation.

Slow queries and the corresponding system table

When you enable resource control, the slow query log of TiDB and the corresponding system table INFORMATION_SCHEMA.SLOW_QUERY contain the resource group, RU consumption of the corresponding SQL, and the time spent waiting for available RUs.

When you enable resource control, the system table INFORMATION_SCHEMA.SLOW_QUERY contains the resource group, RU consumption of the corresponding SQL, and the time spent waiting for available RUs.

View RU statistics by statements_summary

The system table INFORMATION_SCHEMA.statements_summary in TiDB stores the normalized and aggregated statistics of SQL statements. You can use the system table to view and analyze the execution performance of SQL statements. It also contains statistics about resource control, including the resource group name, RU consumption, and the time spent waiting for available RUs. For more details, see statements_summary fields description.

View the RU consumption of resource groups

Starting from v7.6.0, TiDB provides the system table mysql.request_unit_by_group to store the historical records of the RU consumption of each resource group.

Example:

SELECT * FROM request_unit_by_group LIMIT 5;
+----------------------------+----------------------------+----------------+----------+
| start_time                 | end_time                   | resource_group | total_ru |
+----------------------------+----------------------------+----------------+----------+
| 2024-01-01 00:00:00.000000 | 2024-01-02 00:00:00.000000 | default        |   334147 |
| 2024-01-01 00:00:00.000000 | 2024-01-02 00:00:00.000000 | rg1            |     4172 |
| 2024-01-01 00:00:00.000000 | 2024-01-02 00:00:00.000000 | rg2            |    34028 |
| 2024-01-02 00:00:00.000000 | 2024-01-03 00:00:00.000000 | default        |   334088 |
| 2024-01-02 00:00:00.000000 | 2024-01-03 00:00:00.000000 | rg1            |     3850 |
+----------------------------+----------------------------+----------------+----------+
5 rows in set (0.01 sec)

Note:

The data of mysql.request_unit_by_group is automatically imported by a TiDB scheduled task at the end of each day. If the RU consumption of a resource group is 0 on a certain day, no record is generated. By default, this table stores data for the last three months (up to 92 days). Data that exceeds this period is automatically cleared.

Monitoring metrics and charts

TiDB regularly collects runtime information about resource control and provides visual charts of the metrics in Grafana's TiDB > Resource Control dashboard. The metrics are detailed in the Resource Control section of TiDB Important Monitoring Metrics.

TiKV also records the request QPS from different resource groups. For more details, see TiKV Monitoring Metrics Detail.

You can view the data of resource groups in the current RESOURCE_GROUPS table in TiDB Dashboard. For more details, see Resource Manager page.

Note:

This section is only applicable to TiDB Self-Managed. Currently, TiDB Cloud does not provide resource control metrics.

TiDB regularly collects runtime information about resource control and provides visual charts of the metrics in Grafana's TiDB > Resource Control dashboard.

TiKV also records the request QPS from different resource groups in Grafana's TiKV dashboard.

Tool compatibility

The resource control feature does not impact the regular usage of data import, export, and other replication tools. BR, TiDB Lightning, and TiCDC do not currently support processing DDL operations related to resource control, and their resource consumption is not limited by resource control.

FAQ

  1. Do I have to disable resource control if I don't want to use resource groups?

    No. Users who do not specify any resource groups will be bound to the default resource group that has unlimited resources. When all users belong to the default resource group, the resource allocation method is the same as when the resource control is disabled.

  2. Can a database user be bound to several resource groups?

    No. A database user can only be bound to one resource group. However, during the session runtime, you can use SET RESOURCE GROUP to set the resource group used by the current session. You can also use the optimizer hint RESOURCE_GROUP() to set the resource group for the running statement.

  3. What happens when the total resource allocation (RU_PER_SEC) of all resource groups exceeds the system capacity?

    TiDB does not verify the capacity when you create a resource group. As long as the system has enough available resources, TiDB can meet the resource requirements of each resource group. When the system resources exceed the limit, TiDB prioritizes satisfying requests from resource groups with higher priority. If requests with the same priority cannot all be met, TiDB allocates resources proportionally according to the resource allocation (RU_PER_SEC).

See also