[CUBRIDMAN-224] Manual for correlated subquery result cache #526

Merged
merged 71 commits into from
Aug 12, 2024
Changes from 41 commits
eb9d411
add subquerycache related texts
xmilex-git Jun 13, 2024
1bc152f
add example
xmilex-git Jun 20, 2024
c691256
add english ver
xmilex-git Jun 20, 2024
e6f1925
vscode fix
xmilex-git Jun 20, 2024
6f9fe61
update english
xmilex-git Jun 20, 2024
ac47dd5
add blank line
xmilex-git Jun 20, 2024
2b350fa
move to alphabetical loc
xmilex-git Jun 20, 2024
65d5067
apply code review(1)
xmilex-git Jun 20, 2024
b40818b
Update ko/sql/tuning.rst
xmilex-git Jun 21, 2024
53ee79b
add examples
xmilex-git Jun 21, 2024
999b822
Update en/sql/tuning.rst
xmilex-git Jun 21, 2024
c419e5a
Update en/sql/tuning.rst
xmilex-git Jun 21, 2024
f9f62f5
Update en/sql/tuning.rst
xmilex-git Jun 21, 2024
b410a8c
apply code review (2)
xmilex-git Jun 21, 2024
5286bc7
apply code review (leading hint)
xmilex-git Jun 28, 2024
cfb8c2f
add connect by
xmilex-git Jul 18, 2024
38b4006
Update en/sql/tuning.rst
xmilex-git Aug 2, 2024
b1e29ef
Update ko/admin/config.rst
xmilex-git Aug 2, 2024
2174934
Update ko/sql/tuning.rst
xmilex-git Aug 2, 2024
2619430
Update en/sql/tuning.rst
xmilex-git Aug 2, 2024
f3cf032
Update ko/sql/tuning.rst
xmilex-git Aug 2, 2024
10d1fe0
Update ko/sql/tuning.rst
xmilex-git Aug 2, 2024
0344207
Update ko/sql/tuning.rst
xmilex-git Aug 2, 2024
7285854
Update en/sql/tuning.rst
xmilex-git Aug 2, 2024
4758dee
Update en/sql/tuning.rst
xmilex-git Aug 2, 2024
20b0ed4
Update ko/admin/config.rst
xmilex-git Aug 2, 2024
ad256ca
Update en/admin/config.rst
xmilex-git Aug 2, 2024
4f742e6
Update ko/sql/tuning.rst
xmilex-git Aug 2, 2024
466297a
Update ko/sql/tuning.rst
xmilex-git Aug 2, 2024
51e3b3e
Update ko/sql/tuning.rst
xmilex-git Aug 2, 2024
740ffd7
Update en/admin/config.rst
xmilex-git Aug 2, 2024
a671d83
Update ko/admin/config.rst
xmilex-git Aug 2, 2024
2e98753
Update ko/sql/tuning.rst
xmilex-git Aug 2, 2024
b85b56c
Update ko/sql/tuning.rst
xmilex-git Aug 2, 2024
951ab7b
Update ko/sql/tuning.rst
xmilex-git Aug 2, 2024
30d7584
Update en/sql/tuning.rst
xmilex-git Aug 2, 2024
0f15d76
Update en/sql/tuning.rst
xmilex-git Aug 2, 2024
0c57ec6
Update en/sql/tuning.rst
xmilex-git Aug 2, 2024
9cb1bfb
applying code review
xmilex-git Aug 7, 2024
ac62556
explain units
xmilex-git Aug 7, 2024
2667a5e
adjust context
xmilex-git Aug 7, 2024
57d2642
adjust order
xmilex-git Aug 7, 2024
722a825
apply code review
xmilex-git Aug 8, 2024
1f6c14a
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
20f38cb
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
c2f7c50
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
b555a30
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
09506ee
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
927f4b6
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
c6f5d3d
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
6db90ff
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
af28b8d
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
faca4dc
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
8edf01b
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
8d2c092
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
2c72cfd
apply code review (2)
xmilex-git Aug 8, 2024
058e25d
apply code review to English version
xmilex-git Aug 8, 2024
93e1634
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
61acafa
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
e56c955
Update ko/sql/tuning.rst
xmilex-git Aug 8, 2024
f397238
apply code review to English version
xmilex-git Aug 8, 2024
359594a
Merge branch 'CUBRID:develop' into CUBRIDMAN-224
xmilex-git Aug 8, 2024
74cd8cd
Update en/admin/config.rst
xmilex-git Aug 9, 2024
3164386
Update en/sql/tuning.rst
xmilex-git Aug 9, 2024
6d577e7
Update en/sql/tuning.rst
xmilex-git Aug 9, 2024
f6cddc3
Update ko/sql/tuning.rst
xmilex-git Aug 9, 2024
7f78de2
Apply suggestions from code review
xmilex-git Aug 9, 2024
33eadb8
Apply suggestions from code review
xmilex-git Aug 9, 2024
c6f0c49
add link related to query profiling
xmilex-git Aug 9, 2024
c000bee
Apply suggestions from code review
xmilex-git Aug 9, 2024
fdd8714
Apply suggestions from code review
xmilex-git Aug 9, 2024
10 changes: 10 additions & 0 deletions en/admin/config.rst
@@ -142,6 +142,8 @@ On the below table, if "Applied" is "server parameter", that parameter affects t
| +-------------------------------------+-------------------------+---------+----------+--------------------------------+-----------------------+
| | max_hash_list_scan_size | server parameter | | byte | 8,388,608(8M) | |
| +-------------------------------------+-------------------------+---------+----------+--------------------------------+-----------------------+
| | max_subquery_cache_size | server parameter | | byte | 2,097,152(2M) | DBA only |
| +-------------------------------------+-------------------------+---------+----------+--------------------------------+-----------------------+
| | sort_buffer_size | server parameter | | byte | 128 * | |
| | | | | | :ref:`db_page_size <dpg>` | |
| +-------------------------------------+-------------------------+---------+----------+--------------------------------+-----------------------+
@@ -674,6 +676,8 @@ The following are parameters related to the memory used by the database server o
+--------------------------------+--------+---------------------------+---------------------------+---------------------------+
| max_hash_list_scan_size | byte | 8,388,608(8M) | 0 | 128MB |
+--------------------------------+--------+---------------------------+---------------------------+---------------------------+
| max_subquery_cache_size | byte | 2,097,152(2M) | 0 | 16,777,216(16M) |
+--------------------------------+--------+---------------------------+---------------------------+---------------------------+
| sort_buffer_size | byte | 128 * | 1 * | 2G(32bit), |
| | | :ref:`db_page_size <dpg>` | :ref:`db_page_size <dpg>` | INT_MAX * |
| | | | | :ref:`db_page_size <dpg>` |
@@ -714,6 +718,12 @@ The following are parameters related to the memory used by the database server o

If this parameter is set to 0 or the :ref:`NO_HASH_LIST_SCAN <no-hash-list-scan>` hint is specified, hash list scan will not be used.

**max_subquery_cache_size**

The **max_subquery_cache_size** parameter sets the size of the (correlated) subquery cache. You can specify a unit of B, K, M, G, or T, which stand for bytes, kilobytes (KB), megabytes (MB), gigabytes (GB), and terabytes (TB), respectively. If you omit the unit, bytes are assumed. The default value is **2,097,152** (2M) bytes, the minimum value is **0**, and the maximum value is **16,777,216** (16M) bytes. A separate cache is allocated for each subquery in a query, and the caches are deallocated when the main query completes.

If max_subquery_cache_size is set to 0, the :ref:`NO_SUBQUERY_CACHE <correlated-subquery-cache>` hint is specified, or there is insufficient storage space, the subquery cache is not used.
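
As an illustration of the unit handling described above, the following Python sketch converts a value such as ``2M`` into bytes and checks it against the documented range. The function name ``parse_cache_size`` is hypothetical and not part of CUBRID. The sketch assumes binary multiples (1K = 1,024 bytes), which is consistent with 2M = 2,097,152 bytes; how the server itself reacts to an out-of-range value is not covered here.

```python
# Hypothetical helper, not CUBRID code: converts a max_subquery_cache_size
# value with an optional unit suffix (B, K, M, G, T) into bytes.
UNIT_MULTIPLIERS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

MIN_SIZE = 0                # documented minimum
MAX_SIZE = 16 * 1024**2     # documented maximum: 16,777,216 bytes
DEFAULT_SIZE = 2 * 1024**2  # documented default: 2,097,152 bytes


def parse_cache_size(text):
    """Return the size in bytes; raise ValueError outside the documented range."""
    value = text.strip()
    multiplier = 1  # a missing unit means bytes
    if value and value[-1] in UNIT_MULTIPLIERS:
        multiplier = UNIT_MULTIPLIERS[value[-1]]
        value = value[:-1]
    size = int(value) * multiplier
    if not MIN_SIZE <= size <= MAX_SIZE:
        raise ValueError(f"{text!r} is outside {MIN_SIZE}..{MAX_SIZE} bytes")
    return size


print(parse_cache_size("2M"))       # 2097152
print(parse_cache_size("2097152"))  # 2097152
print(parse_cache_size("16M"))      # 16777216
```

In this sketch a value of ``0`` parses successfully; according to the text above, that setting means the subquery cache is not used.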

**sort_buffer_size**

**sort_buffer_size** is a parameter to configure the size of buffer to be used when a query is processing sorting. The server assigns one sort buffer for each client's sorting-request, and releases the assigned buffer memory when sorting is complete. A sorting query includes not only SELECT sorting query, but also index-creating query.
192 changes: 192 additions & 0 deletions en/sql/tuning.rst
@@ -679,6 +679,7 @@ Using hints can affect the performance of query execution. You can allow the que
NO_COVERING_IDX |
NO_MULTI_RANGE_OPT |
NO_SORT_LIMIT |
NO_SUBQUERY_CACHE |
NO_PUSH_PRED |
NO_MERGE |
NO_ELIMINATE_JOIN |
@@ -724,6 +725,7 @@ The following hints can be specified in **UPDATE**, **DELETE** and **SELECT** st
* **NO_COVERING_IDX**: This is a hint not to use the covering index. For details, see :ref:`covering-index`.
* **NO_MULTI_RANGE_OPT**: This is a hint not to use the multi-key range optimization. For details, see :ref:`multi-key-range-opt`.
* **NO_SORT_LIMIT**: This is a hint not to use the SORT-LIMIT optimization. For more details, see :ref:`sort-limit-optimization`.
* **NO_SUBQUERY_CACHE**: This is a hint not to use the SUBQUERY CACHE optimization. For more details, see :ref:`correlated-subquery-cache`.
* **NO_PUSH_PRED**: This is a hint not to use the PREDICATE-PUSH optimization.
* **NO_MERGE**: This is a hint not to use the VIEW-MERGE optimization.
* **NO_ELIMINATE_JOIN**: This is a hint not to use join elimination optimization. For more details, see :ref:`join-elimination-optimization`.
@@ -4189,3 +4191,193 @@ The user can check the query to be cached or not by putting the session command
}

The cached query is shown as **query_string** in the middle of the result screen. **n_entries** and **n_pages** represent the number of cached queries and the number of pages in the cached results, respectively. **n_entries** is limited by the configuration parameter **max_query_cache_entries**, and **n_pages** is limited by **query_cache_size_in_pages**. If either limit is exceeded, some cache entries are selected as victims and removed from the cache. The number of victims is about 20% of the **max_query_cache_entries** value and of the **query_cache_size_in_pages** value.

.. _correlated-subquery-cache:

SUBQUERY CACHE (correlated)
------------------------------------

Subquery cache optimization improves the performance of queries that contain correlated subqueries. The results of each subquery are cached in a separate area.
To disable subquery cache optimization, use the NO_SUBQUERY_CACHE hint on the target subquery.

The subquery cache is used when the correlated subquery is in the SELECT clause.
Each time a correlated subquery is about to be executed again, the column values it references from the main query are looked up in the cache.
If the same column values are found, the cached result is returned and the subquery is not re-executed.
If they are not found, the subquery is executed and its result is cached together with the column values.
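
The lookup described above is essentially memoization keyed by the correlated column values. The following Python sketch is a toy model only, not CUBRID's implementation: the class ``SubqueryCache``, the ``evaluate`` method, and the sample data are all made up for illustration.

```python
class SubqueryCache:
    """Toy model: maps correlated column values to a cached subquery result."""

    def __init__(self):
        self.entries = {}
        self.hit = 0
        self.miss = 0

    def evaluate(self, key, run_subquery):
        # key: tuple of column values referenced from the main query
        if key in self.entries:
            self.hit += 1
            return self.entries[key]
        # Not cached yet: execute the subquery and remember its result
        self.miss += 1
        result = run_subquery(*key)
        self.entries[key] = result
        return result


# Stand-in for an inner table (made-up data)
codes = {1: "one", 2: "two", 3: "three"}
executions = []


def subquery(c1):
    executions.append(c1)  # record every real execution
    return codes.get(c1)


cache = SubqueryCache()
outer_rows = [1, 2, 1, 3, 2, 1]  # correlated column value of each main-query row
results = [cache.evaluate((c1,), subquery) for c1 in outer_rows]

print(results)                # ['one', 'two', 'one', 'three', 'two', 'one']
print(cache.hit, cache.miss)  # 3 3
print(executions)             # [1, 2, 3]
```

Six outer rows trigger only three real executions, one per distinct column value; the remaining three rows are served from the cache.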

The following example measures the performance difference with and without the subquery cache.
First, prepare the data used for the measurement:

::

    # Prepare data
    csql> DROP TABLE IF EXISTS t1;

    csql> CREATE TABLE t1 AS
          SELECT
              ROWNUM AS t1_pk,
              MOD(ROWNUM, 10) AS c1,
              MOD(ROWNUM, 100) AS c2,
              MOD(ROWNUM, 1000) AS c3
          FROM
              db_class a, db_class b, db_class c, db_class d, db_class e, db_class f
          LIMIT 100000;

    csql> ALTER TABLE t1 ADD CONSTRAINT PRIMARY KEY pk_t1 (t1_pk);

    csql> CREATE TABLE t2 AS
          SELECT
              ROWNUM AS c1,
              1 AS c2,
              TO_CHAR(ROWNUM * 1000, '0999') AS code
          FROM
              db_class a, db_class b
          LIMIT 10;

    csql> UPDATE STATISTICS ON t1 WITH FULLSCAN;

    csql> ;trace on

In CSQL, the performance gain is easy to measure by repeatedly executing queries that use the COUNT function, as in the examples below.
The first query is comparatively slow because the **NO_SUBQUERY_CACHE** hint disables the cache; the second query, which omits the hint, is much faster because repeated subquery results are retrieved from the cache: ::

    # Target query #1
    csql> SELECT COUNT(*)
          FROM (
              SELECT /*+ RECOMPILE NO_MERGE */
                  (SELECT /*+ NO_SUBQUERY_CACHE */ t1_pk FROM t1 b WHERE b.t1_pk = a.c3)
              FROM t1 a
              WHERE a.c2 >= 1
          );

    === <Result of SELECT Command in Line 2> ===

                 count(*)
    ======================
                    99000

    1 row selected. (0.626199 sec) Committed. (0.000011 sec)

    1 command(s) successfully processed.

    Trace Statistics:
      SELECT (time: 621, fetch: 397811, fetch_time: 56, ioread: 0)
        SCAN (temp time: 7, fetch: 142, ioread: 0, readrows: 99000, rows: 99000)
        SUBQUERY (uncorrelated)
          SELECT (time: 607, fetch: 397669, fetch_time: 56, ioread: 0)
            SCAN (table: dba.t1), (heap time: 98, fetch: 100384, ioread: 0, readrows: 100000, rows: 99000)
            SUBQUERY (correlated)
              SELECT (time: 460, fetch: 297000, fetch_time: 0, ioread: 0)
                SCAN (index: dba.t1.pk_t1), (btree time: 243, fetch: 198000, ioread: 0, readkeys: 99000, filteredkeys: 0, rows: 99000, covered: true)

When SQL trace is enabled, the trace output includes subquery cache information for each subquery that uses the cache.

The following example displays trace information for the subquery cache in a case where the subquery cache is enabled:

::

    csql> SELECT COUNT(*)
          FROM (
              SELECT /*+ RECOMPILE NO_MERGE */
                  (SELECT t1_pk FROM t1 b WHERE b.t1_pk = a.c3)
              FROM t1 a
              WHERE a.c2 >= 1
          );

    === <Result of SELECT Command in Line 6> ===

                 count(*)
    ======================
                    99000

    1 row selected. (0.128251 sec) Committed. (0.000010 sec)

    1 command(s) successfully processed.

    Trace Statistics:
      SELECT (time: 122, fetch: 103781, fetch_time: 13, ioread: 0)
        SCAN (temp time: 7, fetch: 142, ioread: 0, readrows: 99000, rows: 99000)
        SUBQUERY (uncorrelated)
          SELECT (time: 108, fetch: 103639, fetch_time: 13, ioread: 0)
            SCAN (table: dba.t1), (heap time: 70, fetch: 100384, ioread: 0, readrows: 100000, rows: 99000)
            SUBQUERY (correlated)
              SELECT (time: 4, fetch: 2970, fetch_time: 0, ioread: 0)
                SCAN (index: dba.t1.pk_t1), (btree time: 2, fetch: 1980, ioread: 0, readkeys: 990, filteredkeys: 0, rows: 990, covered: true)
                SUBQUERY_CACHE (hit: 98010, miss: 990, size: 269384, status: enabled)

Descriptions for each item are as follows:

* **hit**: The number of times results were retrieved from the cached area instead of executing the query.
* **miss**: The number of times results were cached after executing the query.
* **size**: The memory size used by the subquery cache.
* **status**: The activation status of the subquery cache at the end of the query.
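
The **hit** and **miss** figures in the trace above can be reproduced from the test data alone. The following Python sketch rebuilds the ``c2`` and ``c3`` columns of ``t1`` and counts how often the correlated value ``a.c3`` has already been seen. It assumes one cache lookup per main-query row and that no entry is ever evicted; both are assumptions made for this illustration.

```python
# Rebuild the relevant columns of t1: t1_pk = 1..100000,
# c2 = MOD(ROWNUM, 100), c3 = MOD(ROWNUM, 1000).
seen = set()  # correlated values (a.c3) that are already cached
hit = 0
miss = 0

for rownum in range(1, 100001):
    c2 = rownum % 100
    c3 = rownum % 1000
    if c2 < 1:
        continue  # main-query filter: WHERE a.c2 >= 1
    if c3 in seen:
        hit += 1  # result would be served from the cache
    else:
        miss += 1  # subquery would be executed and its result cached
        seen.add(c3)

print(hit, miss)   # 98010 990
print(hit + miss)  # 99000
```

The 990 misses correspond to the distinct ``c3`` values that pass the filter (0 to 999 without the multiples of 100), which also matches ``readkeys: 990`` in the index scan of the cached run.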

If **size** exceeds the value of **max_subquery_cache_size**, the subquery cache is disabled during the execution of the query, and the SQL trace information shows **status** as disabled.
Additionally, even if the cache size stays within that limit, the cache may be disabled during the execution of the query when the ratio of **miss** to **hit** is higher than 9, that is, when fewer than 10% of lookups are hits.
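
The two conditions above can be written as a small predicate. This is a sketch under stated assumptions, not the server's logic: the function ``cache_status`` is hypothetical, and the text does not say at which point during execution the server evaluates the ratio, so the sketch simply classifies a given set of counters.

```python
def cache_status(hit, miss, size, max_size, max_miss_to_hit=9):
    """Classify counters as 'enabled' or 'disabled' (illustrative only)."""
    if max_size == 0:
        return "disabled"  # max_subquery_cache_size = 0 turns the cache off
    if size > max_size:
        return "disabled"  # cache grew beyond the configured size
    if miss > max_miss_to_hit * hit:
        return "disabled"  # miss-to-hit ratio higher than 9
    return "enabled"


DEFAULT_MAX = 2 * 1024 * 1024  # documented default, 2,097,152 bytes

# Counters taken from the trace output shown in this section
print(cache_status(hit=98010, miss=990, size=269384, max_size=DEFAULT_MAX))  # enabled
# Made-up counters with a poor hit rate (95 misses, 10 hits)
print(cache_status(hit=10, miss=95, size=1000, max_size=DEFAULT_MAX))        # disabled
# Made-up size larger than the limit
print(cache_status(hit=98010, miss=990, size=3000000, max_size=DEFAULT_MAX)) # disabled
```

Comparing ``miss > 9 * hit`` instead of dividing avoids a division by zero when no hit has been recorded yet.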

Subquery cache does not operate in the following scenarios:

* When the correlated subquery contains another correlated subquery.
* When the subquery is not in the SELECT clause.
* When the subquery includes CONNECT BY clause.
* When the subquery includes OID-related features.
* When the subquery includes the NO_SUBQUERY_CACHE hint.
* When storing a new result would exceed the configured subquery cache size (**max_subquery_cache_size**, default: 2MB).
* When the subquery contains functions that change results with each execution, such as random() or sys_guid().

The subquery cache is disabled for a correlated subquery that itself contains another correlated subquery.
However, the contained correlated subquery can still use the cache as long as it does not contain a further correlated subquery.

The following example shows a case where a correlated subquery contains another correlated subquery:

::

    csql> SELECT /*+ recompile */
              (
                  SELECT
                      (
                          SELECT c.code
                          FROM t2 c
                          WHERE c.c1 = b.c1
                      )
                  FROM t1 b
                  WHERE b.t1_pk = a.c1
              ) s
          FROM t1 a
          WHERE a.c3 = 1;

    Trace Statistics:
      SELECT (time: 56, fetch: 100785, fetch_time: 10, ioread: 0)
        SCAN (table: dba.t1), (heap time: 55, fetch: 100384, ioread: 0, readrows: 100000, rows: 100)
        SUBQUERY (correlated)
          SELECT (time: 0, fetch: 401, fetch_time: 0, ioread: 0)
            SCAN (index: dba.t1.pk_t1), (btree time: 0, fetch: 300, ioread: 0, readkeys: 100, filteredkeys: 0, rows: 100) (lookup time: 0, rows: 100)
            SUBQUERY (correlated)
              SELECT (time: 0, fetch: 1, fetch_time: 0, ioread: 0)
                SCAN (table: dba.t2), (heap time: 0, fetch: 1, ioread: 0, readrows: 10, rows: 1)
                SUBQUERY_CACHE (hit: 99, miss: 1, size: 150704, status: enabled)

Moreover, the subquery cache is disabled for a correlated subquery that includes functions such as random() or sys_guid(), which produce a different result each time they are executed.

The following example shows a case where a correlated subquery includes random():

::

    csql> WITH cte_1 AS
          (SELECT
               DISTINCT (SELECT random(1) FROM t2 b WHERE b.c1 = a.c1 AND b.c2 = 1) v
           FROM t1 a
           WHERE a.c2 = 1
          ) SELECT count(*) FROM cte_1;

    Trace Statistics:
      SELECT (time: 65, fetch: 101384, fetch_time: 9, ioread: 0)
        SCAN (temp time: 0, fetch: 0, ioread: 0, readrows: 1000, rows: 1000)
        SUBQUERY (uncorrelated)
          CTE (non_recursive_part)
            SELECT (time: 65, fetch: 101384, fetch_time: 9, ioread: 0)
              SCAN (table: dba.t1), (heap time: 59, fetch: 100384, ioread: 0, readrows: 100000, rows: 1000)
              ORDERBY (time: 0, sort: true, page: 0, ioread: 0)
              SUBQUERY (correlated)
                SELECT (time: 4, fetch: 1000, fetch_time: 0, ioread: 0)
                  SCAN (table: dba.t2), (heap time: 3, fetch: 1000, ioread: 0, readrows: 10000, rows: 1000)
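
Why caching must be skipped for such functions can be seen in a small Python sketch: memoizing a nondeterministic call by its key changes the result set. The functions ``uncached`` and ``cached`` and the sample keys are made up for illustration; the fixed seed only makes the run repeatable.

```python
import random


def uncached(keys, rng):
    # Every row calls the nondeterministic function again
    return [rng.random() for _ in keys]


def cached(keys, rng):
    # Rows sharing a key reuse the first value drawn for that key
    cache = {}
    out = []
    for key in keys:
        if key not in cache:
            cache[key] = rng.random()
        out.append(cache[key])
    return out


keys = [1, 1, 1, 2, 2]  # correlated column values of five main-query rows
without_cache = uncached(keys, random.Random(0))
with_cache = cached(keys, random.Random(0))

print(len(set(without_cache)))  # 5
print(len(set(with_cache)))     # 2
```

With caching, the five rows collapse to two distinct values, one per key, which is not what a per-row random() call is meant to return.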
10 changes: 10 additions & 0 deletions ko/admin/config.rst
@@ -142,6 +142,8 @@ CUBRID는 데이터베이스 서버, 브로커, CUBRID 매니저로 구성된다
| +-------------------------------------+-------------------------+---------+----------+--------------------------------+-----------------+
| | max_hash_list_scan_size | server | | byte | 8,388,608(8M) | |
| +-------------------------------------+-------------------------+---------+----------+--------------------------------+-----------------+
| | max_subquery_cache_size | server | | byte | 2,097,152(2M) | DBA only |
| +-------------------------------------+-------------------------+---------+----------+--------------------------------+-----------------+
| | sort_buffer_size | server | | byte | 128 * | |
| | | | | | :ref:`db_page_size <dpg>` | |
| +-------------------------------------+-------------------------+---------+----------+--------------------------------+-----------------+
@@ -673,6 +675,8 @@ CUBRID 설치 시 생성되는 기본 데이터베이스 환경 설정 파일(**
+--------------------------------+--------+---------------------------+---------------------------+---------------------------+
| max_hash_list_scan_size | byte | 8,388,608(8M) | 0 | 128MB |
+--------------------------------+--------+---------------------------+---------------------------+---------------------------+
| max_subquery_cache_size | byte | 2,097,152(2M) | 0 | 16,777,216(16M) |
+--------------------------------+--------+---------------------------+---------------------------+---------------------------+
| sort_buffer_size | byte | 128 * | 1 * | 2G(32bit), |
| | | :ref:`db_page_size <dpg>` | :ref:`db_page_size <dpg>` | INT_MAX * |
| | | | | :ref:`db_page_size <dpg>` |
@@ -713,6 +717,12 @@ CUBRID 설치 시 생성되는 기본 데이터베이스 환경 설정 파일(**

If **max_hash_list_scan_size** is set to 0 or the :ref:`NO_HASH_LIST_SCAN <no-hash-list-scan>` hint is specified, hashing is not used for lookups.

**max_subquery_cache_size**

**max_subquery_cache_size** is a parameter for setting the size of the subquery cache. A unit of B, K, M, G, or T can be appended to the value, meaning bytes, kilobytes, megabytes, gigabytes, and terabytes, respectively. If the unit is omitted, bytes are used. The default value is **2,097,152** (2M) bytes, the minimum value is **0**, and the maximum value is **16,777,216** (16M) bytes. One subquery cache is allocated for each subquery in the query and is released when the main query ends.

If **max_subquery_cache_size** is set to 0, the :ref:`NO_SUBQUERY_CACHE <correlated-subquery-cache>` hint is specified, or storage space is insufficient, the subquery cache is not used.

**sort_buffer_size**

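**sort_buffer_size** is a parameter for setting the size of the buffer used by queries that perform sorting. The server allocates one sort buffer for each client sort request and releases the allocated buffer memory after sorting completes. Queries that perform sorting include not only SELECT queries with sorting but also index creation queries.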
3 changes: 1 addition & 2 deletions ko/conf.py
@@ -163,8 +163,7 @@
 html_static_path = ['_static']

 def setup(app):
-    #app.add_css_file('style.css')
-    app.add_stylesheet('style.css')
+    app.add_css_file('style.css')
 # If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
 # using the given strftime format.
 html_last_updated_fmt = '%b %d, %Y'