Unify the syntax for index creation #5332

Open
killme2008 opened this issue Jan 10, 2025 · 1 comment
Labels
breaking-change This pull request contains breaking changes. C-enhancement Category Enhancements

killme2008 commented Jan 10, 2025

What type of enhancement is this?

API improvement

What does the enhancement do?

  • Create an inverted index explicitly:
CREATE TABLE monitoring_data (
    host STRING,
    region STRING PRIMARY KEY,
    cpu_usage DOUBLE,
    ts TIMESTAMP TIME INDEX,
    INVERTED INDEX(host, region)
);
  • Create a skipping index:
CREATE TABLE sensor_data (
    domain STRING PRIMARY KEY,
    device_id STRING SKIPPING INDEX,
    temperature DOUBLE,
    `timestamp` TIMESTAMP TIME INDEX
);
  • Create a full-text index:
CREATE TABLE logs (
    message STRING FULLTEXT,
    `level` STRING PRIMARY KEY,
    `timestamp` TIMESTAMP TIME INDEX
);

Currently, full-text and skipping indexes can only be created through column options, while inverted indexes can only be created with the table-level clause INVERTED INDEX(col1, col2, ...).

We should standardize index creation:

  • Allow inverted indexes to be created using column options.
  • Support INDEX [ INVERTED | FULLTEXT | SKIPPING](col1 options, col2 options, col3 options).
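
As a rough sketch of how the two bullets above might look together: the syntax below is hypothetical and not implemented. The `INVERTED INDEX` column option is assumed by analogy with the existing `SKIPPING INDEX` option, the `INDEX INVERTED(...)` and `INDEX FULLTEXT(...)` forms follow the pattern proposed above, and per-column options are omitted.

```sql
-- Hypothetical syntax illustrating the proposal; not accepted by any released version.
CREATE TABLE monitoring_data (
    host STRING INVERTED INDEX,   -- assumed: inverted index as a column option
    region STRING PRIMARY KEY,
    message STRING,
    cpu_usage DOUBLE,
    ts TIMESTAMP TIME INDEX,
    INDEX INVERTED(region),       -- assumed: unified table-level form
    INDEX FULLTEXT(message)       -- same form, different index type
);
```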

Implementation challenges

No response

@killme2008 killme2008 added the C-enhancement Category Enhancements label Jan 10, 2025
@waynexia waynexia added the breaking-change This pull request contains breaking changes. label Jan 10, 2025
@waynexia
Member

I propose supporting index definitions in two positions.

  • Column constraint
<COLUMN_NAME> <COLUMN_TYPE> [OTHER_CONSTRAINTS] [INDEX_TYPE INDEX [WITH (key = "value")]]

E.g.:

`level` STRING PRIMARY KEY SKIPPING INDEX WITH (option = 'a'),
  • Table constraint
<INDEX_TYPE> INDEX (<COLUMN_LIST>) [WITH (key = "value")]

E.g.:

CREATE TABLE logs (
    message STRING,
    `level` STRING PRIMARY KEY,
    `timestamp` TIMESTAMP TIME INDEX,
+    FULLTEXT INDEX (`message`) WITH (option = 'a')
);

Notable changes alongside the grammar:

  • We should require the keyword INDEX when creating an index, e.g. FULLTEXT INDEX instead of just FULLTEXT
  • There is no way to exclude a column from an index, so I'd like to drop the default behavior of creating an INVERTED INDEX on all tags
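
To illustrate the second point, here is a minimal sketch of a table definition once the implicit inverted index on tags is gone. It assumes the table-constraint grammar proposed above and is not current behavior; the table and column names are reused from the issue description.

```sql
-- Hypothetical: with no implicit inverted index on tags,
-- every index is declared explicitly.
CREATE TABLE monitoring_data (
    host STRING,
    region STRING,
    cpu_usage DOUBLE,
    ts TIMESTAMP TIME INDEX,
    PRIMARY KEY (host, region),   -- both columns are tags
    INVERTED INDEX (region)       -- only region is indexed; host is not
);
```

Under this sketch, leaving a tag out of every index clause is how it is excluded, so no separate "exclude" syntax is needed.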
