Commit: layout cleanup
leanneeliatra committed Oct 9, 2024
1 parent c99229c commit ecb5e5f
Showing 1 changed file with 6 additions and 6 deletions.
12 changes: 6 additions & 6 deletions _analyzers/tokenizers/character-group-tokenizer.md
@@ -16,6 +16,9 @@ The Character Group Tokenizer accepts the following parameters:
4. `max_token_length`: This parameter defines the maximum length allowed for a token. If a token exceeds this specified length, it will be split at intervals defined by `max_token_length`. The default value is `255`.
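The splitting rule described in item 4 can be sketched in Python. This is a simulation of the documented behavior, not the actual OpenSearch implementation:

```python
def split_token(token, max_token_length=255):
    """Split a token that exceeds max_token_length into chunks at
    max_token_length intervals, mirroring the rule described above
    (a sketch, not the OpenSearch source)."""
    return [token[i:i + max_token_length]
            for i in range(0, len(token), max_token_length)]

# A 10-character token with max_token_length=4 is split at 4-character intervals.
print(split_token("abcdefghij", 4))  # ['abcd', 'efgh', 'ij']
```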

## Example of the character group tokenizer

We can tokenize the text on characters such as `whitespace`, `-`, and `:`.

```
POST _analyze
{
@@ -30,12 +33,9 @@ POST _analyze
"text": "Fast-cars: drive fast!"
}
```
Summary of the response text:

Analyzing the text "Fast-cars: drive fast!" shows that it has been split on the specified characters, which do not appear in the resulting tokens:

```
Fast cars drive fast
```
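For context, the middle of the request is collapsed in the diff above. A complete request of this shape likely resembles the following; the `tokenizer` block here is an assumption, reconstructed from the characters named in the prose using the `char_group` tokenizer type:

```
POST _analyze
{
  "tokenizer": {
    "type": "char_group",
    "tokenize_on_chars": [
      "whitespace",
      "-",
      ":"
    ]
  },
  "text": "Fast-cars: drive fast!"
}
```

Note that with only these three characters configured, the trailing `!` would remain attached to the final token (`fast!`); the summary output above suggests the original example may also have split on punctuation.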




