branch-3.0: [feat] (inverted index) show index file size #44120 #44851

Merged
airborne12 merged 1 commit into branch-3.0 from auto-pick-44120-branch-3.0
Dec 3, 2024

Conversation

github-actions[bot]
Contributor

@github-actions github-actions bot commented Dec 2, 2024

Cherry-picked from #44120

### What problem does this PR solve?

Problem Summary:

1. Show the data in detail, including the size of the inverted index
file and the data file.

```
mysql> show data all;
+-------------------------+--------------+----------------+---------------+----------------+-----------------+----------------+-----------------+
| TableName               | ReplicaCount | LocalTotalSize | LocalDataSize | LocalIndexSize | RemoteTotalSize | RemoteDataSize | RemoteIndexSize |
+-------------------------+--------------+----------------+---------------+----------------+-----------------+----------------+-----------------+
| test_show_index_data_p2 | 1            | 291.534 MB     | 133.697 MB    | 157.837 MB     | 0.000           | 0.000          | 0.000           |
| Total                   | 1            | 291.534 MB     | 133.697 MB    | 157.837 MB     | 0.000           | 0.000          | 0.000           |
| Quota                   | 1024.000 TB  | 1073741824     |               |                |                 |                |                 |
| Left                    | 1024.000 TB  | 1073741823     |               |                |                 |                |                 |
+-------------------------+--------------+----------------+---------------+----------------+-----------------+----------------+-----------------+
4 rows in set (0.00 sec)
```

```
mysql> show data all from test_show_index_data_p2;
+-------------------------+-------------------------+--------------+----------+----------------+---------------+----------------+-----------------+----------------+-----------------+
| TableName               | IndexName               | ReplicaCount | RowCount | LocalTotalSize | LocalDataSize | LocalIndexSize | RemoteTotalSize | RemoteDataSize | RemoteIndexSize |
+-------------------------+-------------------------+--------------+----------+----------------+---------------+----------------+-----------------+----------------+-----------------+
| test_show_index_data_p2 | test_show_index_data_p2 | 1            | 19697882 | 291.534 MB     | 133.697 MB    | 157.837 MB     | 0.000           | 0.000          | 0.000           |
|                         | Total                   | 1            |          | 291.534 MB     | 133.697 MB    | 157.837 MB     | 0.000           | 0.000          | 0.000           |
+-------------------------+-------------------------+--------------+----------+----------------+---------------+----------------+-----------------+----------------+-----------------+
2 rows in set (0.00 sec)
```

2. The sizes of the data and index files can also be obtained by
querying the system tables:

```
mysql> select * from information_schema.tables where TABLE_NAME = "test_show_index_data_p2";
+---------------+---------------------------------------------+-------------------------+------------+--------+---------+------------+------------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------------+
| TABLE_CATALOG | TABLE_SCHEMA                                | TABLE_NAME              | TABLE_TYPE | ENGINE | VERSION | ROW_FORMAT | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | MAX_DATA_LENGTH | INDEX_LENGTH | DATA_FREE | AUTO_INCREMENT | CREATE_TIME         | UPDATE_TIME         | CHECK_TIME | TABLE_COLLATION | CHECKSUM | CREATE_OPTIONS | TABLE_COMMENT |
+---------------+---------------------------------------------+-------------------------+------------+--------+---------+------------+------------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------------+
| internal      | regression_test_inverted_index_p2_show_data | test_show_index_data_p2 | BASE TABLE | Doris  | NULL    | NULL       | 19697882   | 15             | 140191631   | NULL            | 165504277    | NULL      | NULL           | 2024-11-18 15:22:32 | 2024-11-18 15:24:52 | NULL       | utf-8           | NULL     | NULL           |               |
+---------------+---------------------------------------------+-------------------------+------------+--------+---------+------------+------------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------------+
1 row in set (0.02 sec)
```
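
The byte counts in `DATA_LENGTH` and `INDEX_LENGTH` appear to line up with the `LocalDataSize` and `LocalIndexSize` values from `show data all`. The following is a minimal sketch of that cross-check, assuming "MB" here means 1024 * 1024 bytes and three decimal places; `format_mb` is a hypothetical helper for illustration, not part of Doris.

```python
# Assumption: SHOW DATA reports sizes in units of 1024 * 1024 bytes.
BYTES_PER_MB = 1024 * 1024


def format_mb(num_bytes: int) -> str:
    """Format a byte count the way the SHOW DATA output above looks."""
    return f"{num_bytes / BYTES_PER_MB:.3f} MB"


# Values copied from the information_schema.tables row above.
data_length = 140191631
index_length = 165504277

print("LocalDataSize  ~", format_mb(data_length))
print("LocalIndexSize ~", format_mb(index_length))
print("LocalTotalSize ~", format_mb(data_length + index_length))
```

Under that assumption the three printed values are 133.697 MB, 157.837 MB, and 291.534 MB, which agrees with the `show data all` rows for `test_show_index_data_p2`.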

### Release note

1. Added `show data all;` to retrieve detailed file sizes.
2. Fixed the semantics of `DATA_LENGTH` and `INDEX_LENGTH` in the system
table `information_schema.tables`.
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring closed this Dec 2, 2024
@dataroaring dataroaring reopened this Dec 2, 2024
@doris-robot

run buildall

@doris-robot

TPC-H: Total hot run time: 40862 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0c0084b3f72f7bafd82444c7c50daf50f5dca3cf, data reload: false

------ Round 1 ----------------------------------
q1	17594	7500	7444	7444
q2	2072	200	174	174
q3	10670	1110	1178	1110
q4	10549	750	659	659
q5	7765	2874	2827	2827
q6	236	146	142	142
q7	975	602	602	602
q8	9344	1919	2044	1919
q9	6636	6430	6467	6430
q10	6954	2354	2300	2300
q11	451	271	263	263
q12	401	218	216	216
q13	17785	3009	2987	2987
q14	240	212	207	207
q15	563	523	521	521
q16	676	604	599	599
q17	986	597	578	578
q18	7322	6595	6574	6574
q19	1384	1075	1069	1069
q20	490	201	192	192
q21	4004	3277	3070	3070
q22	1093	1007	979	979
Total cold run time: 108190 ms
Total hot run time: 40862 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7352	7349	7322	7322
q2	334	237	234	234
q3	2912	2914	2893	2893
q4	2007	1808	1847	1808
q5	5713	5684	5711	5684
q6	227	139	137	137
q7	2457	1742	1794	1742
q8	3313	3514	3529	3514
q9	8786	9081	8876	8876
q10	3549	3489	3545	3489
q11	615	505	506	505
q12	788	605	570	570
q13	15319	3133	3122	3122
q14	320	273	289	273
q15	560	529	538	529
q16	717	671	693	671
q17	1866	1636	1619	1619
q18	8278	7724	7497	7497
q19	7587	1647	1584	1584
q20	2068	1860	1829	1829
q21	5397	5248	5283	5248
q22	1105	1015	997	997
Total cold run time: 81270 ms
Total hot run time: 60143 ms
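
The bot output does not label the four numeric columns. The reported totals are consistent with reading them as cold run, hot run 1, hot run 2, and the faster of the two hot runs, with the hot total being the sum of the last column. Here is a small sketch that checks this reading against the Round 1 rows; the column interpretation and the `summarize` helper are assumptions for illustration, not something the report states.

```python
# Round 1 rows copied from the TPC-H report above.
# Assumed columns: query, cold run, hot run 1, hot run 2, best hot run (ms).
ROUND_1 = """
q1 17594 7500 7444 7444
q2 2072 200 174 174
q3 10670 1110 1178 1110
q4 10549 750 659 659
q5 7765 2874 2827 2827
q6 236 146 142 142
q7 975 602 602 602
q8 9344 1919 2044 1919
q9 6636 6430 6467 6430
q10 6954 2354 2300 2300
q11 451 271 263 263
q12 401 218 216 216
q13 17785 3009 2987 2987
q14 240 212 207 207
q15 563 523 521 521
q16 676 604 599 599
q17 986 597 578 578
q18 7322 6595 6574 6574
q19 1384 1075 1069 1069
q20 490 201 192 192
q21 4004 3277 3070 3070
q22 1093 1007 979 979
"""


def summarize(report: str) -> tuple[int, int]:
    """Return (cold_total, hot_total) under the assumed column meaning."""
    cold_total = 0
    hot_total = 0
    for line in report.strip().splitlines():
        name, *numbers = line.split()
        cold, hot1, hot2, best = (int(n) for n in numbers)
        # The last column should be the faster of the two hot runs.
        if best != min(hot1, hot2):
            raise ValueError(f"{name}: last column is not min of hot runs")
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


cold_total, hot_total = summarize(ROUND_1)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

With the Round 1 rows this reproduces the reported 108190 ms cold total and 40862 ms hot total.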

@doris-robot

TPC-DS: Total hot run time: 196706 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0c0084b3f72f7bafd82444c7c50daf50f5dca3cf, data reload: false

query1	1279	944	900	900
query2	6251	2049	2077	2049
query3	10970	4174	4169	4169
query4	67417	28912	23514	23514
query5	5181	440	468	440
query6	446	186	182	182
query7	5661	327	302	302
query8	320	227	224	224
query9	9273	2721	2708	2708
query10	486	274	264	264
query11	17646	15440	15657	15440
query12	165	105	117	105
query13	1559	413	408	408
query14	10570	7328	7160	7160
query15	224	175	177	175
query16	7277	486	469	469
query17	1400	599	576	576
query18	1877	329	312	312
query19	227	168	159	159
query20	117	119	115	115
query21	214	109	109	109
query22	4501	4524	4386	4386
query23	34718	34181	34177	34177
query24	6040	2868	2908	2868
query25	552	433	421	421
query26	678	179	174	174
query27	2101	304	304	304
query28	4007	2585	2548	2548
query29	692	503	452	452
query30	262	173	174	173
query31	978	833	876	833
query32	66	62	67	62
query33	464	290	288	288
query34	886	504	498	498
query35	850	742	746	742
query36	1109	954	967	954
query37	118	80	76	76
query38	4116	4019	4019	4019
query39	1527	1461	1456	1456
query40	210	102	105	102
query41	49	47	47	47
query42	111	103	99	99
query43	520	483	492	483
query44	1142	802	791	791
query45	184	178	169	169
query46	1142	725	713	713
query47	2006	1921	1913	1913
query48	474	373	377	373
query49	717	385	393	385
query50	833	416	420	416
query51	7402	7215	7028	7028
query52	94	94	85	85
query53	257	181	188	181
query54	538	455	443	443
query55	75	74	78	74
query56	240	237	234	234
query57	1181	1116	1095	1095
query58	200	207	199	199
query59	3099	3018	2821	2821
query60	265	248	242	242
query61	106	105	106	105
query62	756	652	651	651
query63	205	185	189	185
query64	1480	647	615	615
query65	3299	3129	3158	3129
query66	717	287	293	287
query67	15835	15476	15448	15448
query68	5206	541	547	541
query69	496	246	259	246
query70	1192	1133	1143	1133
query71	498	253	249	249
query72	6554	3837	3891	3837
query73	769	335	339	335
query74	10030	9020	8874	8874
query75	3738	2604	2647	2604
query76	3187	1157	1097	1097
query77	530	263	262	262
query78	10741	9678	9587	9587
query79	7586	573	583	573
query80	2007	448	416	416
query81	561	247	239	239
query82	1703	115	112	112
query83	261	144	138	138
query84	298	79	79	79
query85	1275	296	289	289
query86	475	289	297	289
query87	4555	4232	4271	4232
query88	5107	2414	2436	2414
query89	446	294	292	292
query90	2150	190	184	184
query91	179	144	145	144
query92	68	57	50	50
query93	6451	538	530	530
query94	827	301	302	301
query95	356	244	258	244
query96	629	282	290	282
query97	3310	3137	3232	3137
query98	223	196	196	196
query99	1718	1288	1276	1276
Total cold run time: 338488 ms
Total hot run time: 196706 ms

Member

@airborne12 airborne12 left a comment

LGTM

@airborne12 airborne12 merged commit 74e085c into branch-3.0 Dec 3, 2024
19 of 23 checks passed
@airborne12 airborne12 deleted the auto-pick-44120-branch-3.0 branch December 3, 2024 10:44
4 participants