[BUG] lte of text fields is broken #16892

Open
roopeux opened this issue Dec 21, 2024 · 1 comment
Labels
bug Something isn't working Other untriaged

Comments

@roopeux

roopeux commented Dec 21, 2024

Describe the bug

An lte range query like the following matches every document whose field contains any number:

{
  "query": {
    "range": {
      "test": {
        "lte": 700
      }
    }
  }
}

Related component

Other

To Reproduce

# Delete test index
DELETE lte_test

# Create test documents
POST lte_test/_doc/1
{
  "test": "500"
}

POST lte_test/_doc/2
{
  "test": "1000" 
}

POST lte_test/_doc/3
{
  "test": "foo" 
}

POST lte_test/_doc/4
{
  "test": "1000 foo" 
}

# Force refresh
POST lte_test/_refresh

# Test range query
GET lte_test/_search
{
  "query": {
    "range": {
      "test": {
        "lte": 700
      }
    }
  }
}

This matches every document whose test field contains a number:

{
  "took": 463,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "date_test",
        "_id": "1",
        "_score": 1,
        "_source": {
          "test": "500"
        }
      },
      {
        "_index": "date_test",
        "_id": "2",
        "_score": 1,
        "_source": {
          "test": "1000"
        }
      },
      {
        "_index": "date_test",
        "_id": "4",
        "_score": 1,
        "_source": {
          "test": "1000 foo"
        }
      }
    ]
  }
}

Expected behavior

Return doc 1, since 500 is less than 700.
Do not return doc 2, since 1000 is not less than 700.
Do not return doc 4, since 1000 is not less than 700 (and possibly also because of its mixed content).

Additional Details

There is a similar bug in aggregations, where a range with lte or gte matches everything.

@roopeux roopeux added bug Something isn't working untriaged labels Dec 21, 2024
@github-actions github-actions bot added the Other label Dec 21, 2024
@gaobinlong
Collaborator

If you didn't map the field test to a numeric type such as long or integer, dynamic mapping makes it a text field with a keyword subfield, so a range query on it compares values in lexicographic order. For example, "1000" sorts before "700" because "1" precedes "7", which is why docs 2 and 4 match.
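A rough way to see why docs 1, 2, and 4 match is to mimic the term comparison in plain Python. This is only a sketch, not OpenSearch code: it assumes that lowercasing and splitting on whitespace is a fair stand-in for the analyzer on these values, and that the bound 700 is compared as the string "700". The `matches_lte` helper is hypothetical.

```python
# Sketch of lexicographic range matching on an analyzed text field.
# Assumption: lowercasing + whitespace splitting stands in for the analyzer.
docs = {1: "500", 2: "1000", 3: "foo", 4: "1000 foo"}


def matches_lte(value: str, bound: str) -> bool:
    """Return True if any term of `value` sorts at or before `bound`."""
    terms = value.lower().split()
    return any(term <= bound for term in terms)


# String comparison is character by character, so "1000" <= "700" is True.
hits = [doc_id for doc_id, value in docs.items() if matches_lte(value, "700")]
print(hits)  # [1, 2, 4]
```

Doc 3 is the only one excluded because "f" sorts after "7", which lines up with the hits reported in the issue.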

Running range queries on text or keyword fields is not recommended; you should explicitly set the mapping before indexing documents into OpenSearch:

PUT test1/_mapping
{
  "properties":{
    "test":{
      "type":"integer"
    }
  }
}
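For contrast, with a numeric mapping the comparison happens on the parsed number. The sketch below uses Python's `int()` purely as an analogy for numeric parsing; it is not how OpenSearch is implemented, and the `numeric_lte` helper is hypothetical. It also illustrates that values such as "foo" or "1000 foo" are not valid integers, so such documents would likely need cleaning up before they can go into an integer field.

```python
from typing import Optional

docs = {1: "500", 2: "1000", 3: "foo", 4: "1000 foo"}


def numeric_lte(value: str, bound: int) -> Optional[bool]:
    """Compare numerically; return None when the value is not an integer."""
    try:
        return int(value) <= bound
    except ValueError:
        # Stand-in for a value that a numeric field could not parse.
        return None


results = {doc_id: numeric_lte(value, 700) for doc_id, value in docs.items()}
print(results)  # {1: True, 2: False, 3: None, 4: None}
```

Under numeric comparison only doc 1 satisfies lte 700, which is the behavior the issue expected.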
