Merge branch 'main' into qa_docs
kolchfa-aws authored Mar 25, 2024
2 parents fd85a74 + 88242fa commit 5755c4d
Showing 30 changed files with 1,102 additions and 66 deletions.
3 changes: 3 additions & 0 deletions .github/vale/styles/Vocab/OpenSearch/Words/accept.txt
@@ -26,6 +26,7 @@ Boolean
Dev
[Dd]iscoverability
Distro
[Dd]ownvote(s|d)?
[Dd]uplicative
[Ee]gress
[Ee]num
@@ -122,6 +123,7 @@ stdout
[Ss]ubvector
[Ss]ubwords?
[Ss]uperset
[Ss]yslog
tebibyte
[Tt]emplated
[Tt]okenization
@@ -138,6 +140,7 @@ tebibyte
[Uu]nregister(s|ed|ing)?
[Uu]pdatable
[Uu]psert
[Uu]pvote(s|d)?
[Ww]alkthrough
[Ww]ebpage
xy
10 changes: 10 additions & 0 deletions TERMS.md
@@ -236,6 +236,8 @@ Do not use *disable* to refer to users.

Always hyphenated. Don’t use _double click_.

**downvote**

**dropdown list**

**due to**
@@ -586,6 +588,10 @@ Use % in headlines, quotations, and tables or in technical copy.

An agent and REST API that allows you to query numerous performance metrics for your cluster, including aggregations of those metrics, independent of the Java Virtual Machine (JVM).

**plaintext, plain text**

Use *plaintext* only to refer to nonencrypted or decrypted text in content about encryption. Use *plain text* to refer to ASCII files.

**please**

Avoid using except in quoted text.
@@ -700,6 +706,8 @@ Never hyphenated. Use _startup_ as a noun (for example, “The following startup

**Stochastic Gradient Descent (SGD)**

**syslog**

## T

**term frequency–inverse document frequency (TF–IDF)**
@@ -746,6 +754,8 @@ A storage tier that you can use to store and analyze your data with Elasticsearc

Hyphenate as adjectives. Use instead of *top left* and *top right*, unless the field name uses *top*. For example, "The upper-right corner."

**upvote**

**US**

No periods, as specified in the Chicago Manual of Style.
8 changes: 8 additions & 0 deletions _api-reference/index-apis/force-merge.md
@@ -72,6 +72,7 @@ The following table lists the available query parameters. All query parameters a
| `ignore_unavailable` | Boolean | If `true`, OpenSearch ignores missing or closed indexes. If `false`, OpenSearch returns an error if the force merge operation encounters missing or closed indexes. Default is `false`. |
| `max_num_segments` | Integer | The number of larger segments into which smaller segments are merged. Set this parameter to `1` to merge all segments into one segment. The default behavior is to perform the merge as necessary. |
| `only_expunge_deletes` | Boolean | If `true`, the merge operation only expunges segments containing a certain percentage of deleted documents. The percentage is 10% by default and is configurable in the `index.merge.policy.expunge_deletes_allowed` setting. Prior to OpenSearch 2.12, `only_expunge_deletes` ignored the `index.merge.policy.max_merged_segment` setting. Starting with OpenSearch 2.12, using `only_expunge_deletes` does not produce segments larger than `index.merge.policy.max_merged_segment` (by default, 5 GB). For more information, see [Deleted documents](#deleted-documents). Default is `false`. |
| `primary_only` | Boolean | If set to `true`, then the merge operation is performed only on the primary shards of an index. This can be useful when you want to take a snapshot of the index after the merge is complete. Snapshots only copy segments from the primary shards. Merging the primary shards can reduce resource consumption. Default is `false`. |

#### Example request: Force merge a specific index

@@ -101,6 +102,13 @@ POST /.testindex-logs/_forcemerge?max_num_segments=1
```
{% include copy-curl.html %}

#### Example request: Force merge primary shards

```json
POST /.testindex-logs/_forcemerge?primary_only=true
```
{% include copy-curl.html %}
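
For scripted use, the request path above can also be assembled programmatically. The following Python sketch is illustrative only: the helper name `force_merge_path` is hypothetical and not part of any OpenSearch client, and it only builds the path and query string without sending a request.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def force_merge_path(index: str,
                     primary_only: bool = False,
                     max_num_segments: Optional[int] = None) -> str:
    """Build the path and query string for a force merge request.

    Hypothetical helper for illustration; parameter names mirror the
    query parameters listed in the table above.
    """
    params = {}
    if primary_only:
        # Merge only the primary shards of the index.
        params["primary_only"] = "true"
    if max_num_segments is not None:
        params["max_num_segments"] = str(max_num_segments)
    path = f"/{quote(index, safe='')}/_forcemerge"
    return f"{path}?{urlencode(params)}" if params else path


print(force_merge_path(".testindex-logs", primary_only=True))
```

The resulting string can be passed to whichever HTTP client you already use to talk to the cluster.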

#### Example response

```json
30 changes: 23 additions & 7 deletions _automating-configurations/api/get-workflow-steps.md
@@ -7,7 +7,7 @@ nav_order: 50

# Get workflow steps

OpenSearch validates workflows by using the validation template that lists the required inputs, generated outputs, and required plugins for all steps. For example, for the `register_remote_model` step, the validation template appears as follows:
This API returns a list of workflow steps, including their required inputs, outputs, default timeout values, and required plugins. For example, for the `register_remote_model` step, the Get Workflow Steps API returns the following information:

```json
{
@@ -25,36 +25,52 @@ OpenSearch validates workflows by using the validation template that lists the r
]
}
}
```

The Get Workflow Steps API retrieves this file.
```

## Path and HTTP methods

```json
GET /_plugins/_flow_framework/workflow/_steps
GET /_plugins/_flow_framework/workflow/_step?workflow_step=<step_name>
```

## Query parameters

The following table lists the available query parameters. All query parameters are optional.

| Parameter | Data type | Description |
| :--- | :--- | :--- |
| `workflow_step` | String | The name of the step to retrieve. Specify multiple step names as a comma-separated list. For example, `create_connector,delete_model,deploy_model`. |

#### Example request

To fetch all workflow steps, use the following request:

```json
GET /_plugins/_flow_framework/workflow/_steps
```
{% include copy-curl.html %}

To fetch specific workflow steps, pass the step names to the request as a query parameter:

```json
GET /_plugins/_flow_framework/workflow/_step?workflow_step=create_connector,delete_model,deploy_model
```
{% include copy-curl.html %}
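
The two request forms above differ only in the endpoint name and the `workflow_step` query parameter. As a minimal sketch, assuming a hypothetical helper named `workflow_steps_path` (it is not part of any OpenSearch client and does not send a request), the selection logic could look like this in Python:

```python
from urllib.parse import urlencode

# Base path taken from the "Path and HTTP methods" section.
BASE = "/_plugins/_flow_framework/workflow"


def workflow_steps_path(step_names=()):
    """Return the request path for all steps or for the named steps.

    Hypothetical helper for illustration only.
    """
    if not step_names:
        # No names given: fetch every workflow step.
        return f"{BASE}/_steps"
    # Multiple names are passed as one comma-separated value;
    # safe="," keeps the commas unescaped in the query string.
    query = urlencode({"workflow_step": ",".join(step_names)}, safe=",")
    return f"{BASE}/_step?{query}"


print(workflow_steps_path())
print(workflow_steps_path(["create_connector", "delete_model", "deploy_model"]))
```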


#### Example response

OpenSearch responds with the validation template containing the steps. The order of fields in the returned steps may not exactly match the original JSON but will function identically.
OpenSearch responds with the workflow steps. The order of fields in the returned steps may not exactly match the original JSON but will function identically.

To retrieve the workflow steps in YAML format, specify `Content-Type: application/yaml` in the request header:

```bash
curl -XGET "http://localhost:9200/_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50" -H 'Content-Type: application/yaml'
curl -XGET "http://localhost:9200/_plugins/_flow_framework/workflow/_steps" -H 'Content-Type: application/yaml'
```

To retrieve the workflow steps in JSON format, specify `Content-Type: application/json` in the request header:

```bash
curl -XGET "http://localhost:9200/_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50" -H 'Content-Type: application/json'
curl -XGET "http://localhost:9200/_plugins/_flow_framework/workflow/_steps" -H 'Content-Type: application/json'
```
3 changes: 3 additions & 0 deletions _automating-configurations/workflow-steps.md
@@ -39,6 +39,9 @@ The following table lists the workflow step types. The `user_inputs` fields for
|`register_agent` |[Register Agent API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/) |Registers an agent as part of the ML Commons Agent Framework. |
|`delete_agent` |[Delete Agent API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/) |Deletes an agent. |
|`create_tool` |No API | A special-case non-API step encapsulating the specification of a tool for an agent in the ML Commons Agent Framework. These will be listed as `previous_node_inputs` for the appropriate register agent step, with the value set to `tools`. |
|`create_index`|[Create Index]({{site.url}}{{site.baseurl}}/api-reference/index-apis/create-index/) | Creates a new OpenSearch index. The inputs include `index_name`, which should be the name of the index to be created, and `configurations`, which contains the payload body of a regular REST request for creating an index. |
|`create_ingest_pipeline`|[Create Ingest Pipeline]({{site.url}}{{site.baseurl}}/ingest-pipelines/create-ingest/) | Creates or updates an ingest pipeline. The inputs include `pipeline_id`, which should be the ID of the pipeline, and `configurations`, which contains the payload body of a regular REST request for creating an ingest pipeline. |
|`create_search_pipeline`|[Create Search Pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/) | Creates or updates a search pipeline. The inputs include `pipeline_id`, which should be the ID of the pipeline, and `configurations`, which contains the payload body of a regular REST request for creating a search pipeline. |

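To make the `create_index` inputs concrete, the following Python sketch builds one workflow node and prints it as JSON. It is an illustration under stated assumptions: the node layout (`id`, `type`, `user_inputs`) is assumed from the `user_inputs` fields mentioned on this page, the helper name, node ID, and index name are hypothetical, and whether `configurations` must be a nested object (as here) or a serialized JSON string is not specified in this section.

```python
import json


def create_index_step(step_id, index_name, configurations):
    """Return a workflow node dict for a `create_index` step.

    Hypothetical helper; the node layout is an assumption.
    """
    return {
        "id": step_id,
        "type": "create_index",
        "user_inputs": {
            # Name of the index to be created.
            "index_name": index_name,
            # Payload body of a regular create-index REST request.
            "configurations": configurations,
        },
    }


node = create_index_step(
    "create_my_index",  # hypothetical node ID
    "my-index",         # hypothetical index name
    {"settings": {"index": {"number_of_shards": 1}}},
)
print(json.dumps(node, indent=2))
```

The `create_ingest_pipeline` and `create_search_pipeline` steps would follow the same shape, with `pipeline_id` in place of `index_name`.
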
## Additional fields

2 changes: 1 addition & 1 deletion _dashboards/dashboards-assistant/index.md
@@ -60,7 +60,7 @@ For information about configuring OpenSearch Assistant through the REST API, see

## Using OpenSearch Assistant in OpenSearch Dashboards

The following tutorials guide you through using OpenSearch Assistant in OpenSearch Dashboards. OpenSearch Assistant can be viewed full frame or in the right sidebar. The default is sidebar. To view full frame, select the frame icon {::nomarkdown}<img src="{{site.url}}{{site.baseurl}}/images/icons/frame-icon.png" class="inline-icon" alt="frame icon"/>{:/} in the toolbar.
The following tutorials guide you through using OpenSearch Assistant in OpenSearch Dashboards. OpenSearch Assistant can be viewed in full frame or in the sidebar. The default view is in the right sidebar. To view the assistant in the left sidebar or in full frame, select the {::nomarkdown}<img src="{{site.url}}{{site.baseurl}}/images/icons/frame-icon.png" class="inline-icon" alt="frame icon"/>{:/} icon in the toolbar and choose the preferred option.

### Start a conversation

2 changes: 1 addition & 1 deletion _dashboards/management/index-patterns.md
@@ -56,7 +56,7 @@ An example of step 1 is shown in the following image. Note that the index patter

Once the index pattern has been created, you can view the mapping of the matching indexes. Within the table, you can see the list of fields, along with their data type and properties. An example is shown in the following image.

<img src="{{site.url}}{{site.baseurl}}//images/dashboards/index-pattern-table.png" alt="Index pattern table UI " width="700"/>
<img src="{{site.url}}{{site.baseurl}}/images/dashboards/index-pattern-table.png" alt="Index pattern table UI " width="700"/>

## Next steps

12 changes: 6 additions & 6 deletions _data-prepper/common-use-cases/trace-analytics.md
@@ -38,9 +38,9 @@ The [OpenTelemetry source]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/c

There are three processors for the trace analytics feature:

* *otel_traces_raw* - The *otel_traces_raw* processor receives a collection of [span](https://github.com/opensearch-project/data-prepper/blob/fa65e9efb3f8d6a404a1ab1875f21ce85e5c5a6d/data-prepper-api/src/main/java/org/opensearch/dataprepper/model/trace/Span.java) records from [*otel-trace-source*]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sources/otel-trace/), and performs stateful processing, extraction, and completion of trace-group-related fields.
* *otel_traces_group* - The *otel_traces_group* processor fills in the missing trace-group-related fields in the collection of [span](https://github.com/opensearch-project/data-prepper/blob/298e7931aa3b26130048ac3bde260e066857df54/data-prepper-api/src/main/java/org/opensearch/dataprepper/model/trace/Span.java) records by looking up the OpenSearch backend.
* *service_map_stateful* The *service_map_stateful* processor performs the required preprocessing for trace data and builds metadata to display the `service-map` dashboards.
* otel_traces_raw -- The *otel_traces_raw* processor receives a collection of [span](https://github.com/opensearch-project/data-prepper/blob/fa65e9efb3f8d6a404a1ab1875f21ce85e5c5a6d/data-prepper-api/src/main/java/org/opensearch/dataprepper/model/trace/Span.java) records from [*otel-trace-source*]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/sources/otel-trace-source/), and performs stateful processing, extraction, and completion of trace-group-related fields.
* otel_traces_group -- The *otel_traces_group* processor fills in the missing trace-group-related fields in the collection of [span](https://github.com/opensearch-project/data-prepper/blob/298e7931aa3b26130048ac3bde260e066857df54/data-prepper-api/src/main/java/org/opensearch/dataprepper/model/trace/Span.java) records by looking up the OpenSearch backend.
* service_map_stateful -- The *service_map_stateful* processor performs the required preprocessing for trace data and builds metadata to display the `service-map` dashboards.


### OpenSearch sink
@@ -49,8 +49,8 @@ OpenSearch provides a generic sink that writes data to OpenSearch as the destina

The sink provides specific configurations for the trace analytics feature. These configurations allow the sink to use indexes and index templates specific to trace analytics. The following OpenSearch indexes are specific to trace analytics:

* *otel-v1-apm-span* The *otel-v1-apm-span* index stores the output from the [otel_traces_raw]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/otel-trace-raw/) processor.
* *otel-v1-apm-service-map* The *otel-v1-apm-service-map* index stores the output from the [service_map_stateful]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/service-map-stateful/) processor.
* otel-v1-apm-span -- The *otel-v1-apm-span* index stores the output from the [otel_traces_raw]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/otel-trace-raw/) processor.
* otel-v1-apm-service-map -- The *otel-v1-apm-service-map* index stores the output from the [service_map_stateful]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/service-map-stateful/) processor.

## Trace tuning

@@ -374,4 +374,4 @@ Starting with Data Prepper version 1.4, trace processing uses Data Prepper's eve
* `otel_traces_group` replaces `otel_traces_group_prepper` for event-based spans.

In Data Prepper version 2.0, `otel_traces_source` will only output events. Data Prepper version 2.0 also removes `otel_traces_raw_prepper` and `otel_traces_group_prepper` entirely. To migrate to Data Prepper version 2.0, you can configure your trace pipeline using the event model.


@@ -1,22 +1,22 @@
---
layout: default
title: otel_trace_source source
title: otel_trace_source
parent: Sources
grand_parent: Pipelines
nav_order: 15
redirect_from:
- /data-prepper/pipelines/configuration/sources/otel-trace/
---


# otel_trace source
# otel_trace_source

## Overview

The `otel_trace` source is a source for the OpenTelemetry Collector. The following table describes options you can use to configure the `otel_trace` source.
`otel_trace_source` is a source for the OpenTelemetry Collector. The following table describes options you can use to configure the `otel_trace_source` source.


Option | Required | Type | Description
:--- | :--- | :--- | :---
port | No | Integer | The port that the `otel_trace` source runs on. Default value is `21890`.
port | No | Integer | The port that the `otel_trace_source` source runs on. Default value is `21890`.
request_timeout | No | Integer | The request timeout, in milliseconds. Default value is `10000`.
health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default value is `false`.
unauthenticated_health_check | No | Boolean | Determines whether or not authentication is required on the health check endpoint. Data Prepper ignores this option if no authentication is defined. Default value is `false`.
@@ -35,6 +35,8 @@ authentication | No | Object | An authentication configuration. By default, an u

## Metrics

The `otel_trace_source` source includes the following metrics.

### Counters

- `requestTimeouts`: Measures the total number of requests that time out.
@@ -50,4 +52,4 @@ authentication | No | Object | An authentication configuration. By default, an u

### Distribution summaries

- `payloadSize`: Measures the incoming request payload size distribution in bytes.
- `payloadSize`: Measures the incoming request payload size distribution in bytes.
1 change: 1 addition & 0 deletions _ingest-pipelines/processors/index-processors.md
Original file line number Diff line number Diff line change
@@ -59,6 +59,7 @@ Processor type | Description
`sort` | Sorts the elements of an array in ascending or descending order.
`sparse_encoding` | Generates a sparse vector/token and weights from text fields for neural sparse search using sparse retrieval.
`split` | Splits a field into an array using a separator character.
`text_chunking` | Splits long documents into smaller chunks.
`text_embedding` | Generates vector embeddings from text fields for semantic search.
`text_image_embedding` | Generates combined vector embeddings from text and image fields for multimodal neural search.
`trim` | Removes leading and trailing white space from a string field.