add tutuorial for cross-encoder model on sagemaker #2607

Merged
merged 5 commits into from
Jul 4, 2024

ylwu-amzn
Collaborator

Description

Build tutorial for Reranking with cross-encoder model on Sagemaker.

Issues Resolved

[List any issues this PR will resolve]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ylwu-amzn ylwu-amzn temporarily deployed to ml-commons-cicd-env July 3, 2024 00:39 — with GitHub Actions Inactive
env=hub,
role=role,
)
predictor = huggingface_model.deploy(
Collaborator

Can we also add instructions for GPU usage? Now that we have batch ingestion, ingestion throughput can benefit a lot from using a GPU.
Customers can choose a GPU container from https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-training-containers by selecting a matching transformers + PyTorch + Python version and setting a GPU instance type such as g4dn.xlarge or g5.xlarge. The endpoint will then use the GPU for inference automatically.
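
A minimal sketch of the GPU variant described in this comment, assuming the SageMaker Python SDK's `HuggingFaceModel` is used as in the tutorial's deploy snippet. The model ID, framework versions, default instance type, and the helper names (`is_suggested_gpu_instance`, `build_deploy_args`, `deploy_reranker`) are assumptions for illustration and are not taken from this PR. Check the version combination against the available images list before using it.

```python
# Sketch only. The model ID, framework versions, and default instance type
# below are assumptions for illustration, not values taken from this PR.

SUGGESTED_GPU_PREFIXES = ("ml.g4dn.", "ml.g5.")  # families named in the comment


def is_suggested_gpu_instance(instance_type: str) -> bool:
    """Return True for the GPU instance families mentioned in the comment."""
    return instance_type.startswith(SUGGESTED_GPU_PREFIXES)


def build_deploy_args(instance_type: str = "ml.g5.xlarge", instance_count: int = 1) -> dict:
    """Build keyword arguments for HuggingFaceModel.deploy()."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return {
        "initial_instance_count": instance_count,
        "instance_type": instance_type,
    }


def deploy_reranker(role: str, instance_type: str = "ml.g5.xlarge"):
    """Deploy a cross-encoder to a SageMaker endpoint (requires AWS access)."""
    # Imported lazily so the helpers above work without the SageMaker SDK.
    from sagemaker.huggingface import HuggingFaceModel

    hub = {
        "HF_MODEL_ID": "BAAI/bge-reranker-v2-m3",  # assumed model
        "HF_TASK": "text-classification",
    }
    model = HuggingFaceModel(
        env=hub,
        role=role,
        transformers_version="4.37.0",  # assumed versions; choose a
        pytorch_version="2.1.0",        # combination that appears in the
        py_version="py310",             # available images list
    )
    return model.deploy(**build_deploy_args(instance_type))
```

Only `deploy_reranker` needs AWS credentials and the SDK; the two helpers can be checked locally, for example `build_deploy_args("ml.g4dn.xlarge")`.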

Collaborator Author

@ylwu-amzn ylwu-amzn Jul 4, 2024

Let's keep this tutorial focused on the current topic. We can create a separate tutorial for GPU usage.
@xinyual, it seems you have done some testing on GPU. Can you help build a tutorial on how to use a GPU on SageMaker?

Collaborator

Sorry for the late reply; I missed this message. We also want to create a tutorial for the neural sparse model. Maybe we will raise them together later.

Signed-off-by: Yaliang Wu <[email protected]>
@ylwu-amzn ylwu-amzn temporarily deployed to ml-commons-cicd-env July 3, 2024 01:36 — with GitHub Actions Inactive
Contributor

@kolchfa-aws kolchfa-aws left a comment

Some comments.


# Steps

## 0. Deploy Model on Sagemaker
Contributor

Suggested change
## 0. Deploy Model on Sagemaker
## 0. Deploy the model on Amazon SageMaker

# Steps

## 0. Deploy Model on Sagemaker
Use this code to deploy model on Sagemaker.
Contributor

Suggested change
Use this code to deploy model on Sagemaker.
Use the following code to deploy the model on Amazon SageMaker:

instance_type='ml.m5.xlarge' # ec2 instance type
)
```
Find the model inference endpoint and note it. We will use it to create connector in next step
Contributor

Suggested change
Find the model inference endpoint and note it. We will use it to create connector in next step
Note the model inference endpoint; you'll use it to create a connector in the next step.

```
Find the model inference endpoint and note it. We will use it to create connector in next step

## 1. Create Connector and Model
Contributor

Suggested change
## 1. Create Connector and Model
## 1. Create a connector and register the model


## 1. Create Connector and Model

If you are using self-managed Opensearch, you should supply AWS credentials:
Contributor

Suggested change
If you are using self-managed Opensearch, you should supply AWS credentials:
To create a connector for the model, send the following request. If you are using self-managed OpenSearch, supply your AWS credentials:
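
As a rough illustration of the connector request for a self-managed cluster, the sketch below builds a body for `POST /_plugins/_ml/connectors/_create` using the `aws_sigv4` protocol. The connector name, the endpoint URL, the `request_body` template, and the helper name `build_connector_body` are assumptions; the sketch also omits any pre- and post-processing functions the tutorial's connector may define.

```python
import json


def build_connector_body(endpoint_url, region, access_key, secret_key, session_token=None):
    """Build a connector creation body for a SageMaker endpoint (sketch)."""
    credential = {"access_key": access_key, "secret_key": secret_key}
    if session_token:
        # Only needed when temporary credentials are used.
        credential["session_token"] = session_token
    return {
        "name": "SageMaker cross-encoder connector",  # hypothetical name
        "description": "Connector for a cross-encoder model hosted on SageMaker",
        "version": "1",
        "protocol": "aws_sigv4",
        "credential": credential,
        "parameters": {"region": region, "service_name": "sagemaker"},
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "url": endpoint_url,
                "headers": {"content-type": "application/json"},
                # Assumed input template; adjust to the model's input format.
                "request_body": '{ "inputs": ${parameters.inputs} }',
            }
        ],
    }


if __name__ == "__main__":
    body = build_connector_body(
        # Placeholder endpoint and credentials, not real values.
        "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/my-endpoint/invocations",
        "us-east-1",
        "YOUR_ACCESS_KEY",
        "YOUR_SECRET_KEY",
    )
    print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into the request body; the connector ID returned by the create call is then used when registering the model.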

{ "passage_text" : "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states." }

```
### 2.2 Create reranking pipeline
Contributor

Suggested change
### 2.2 Create reranking pipeline
### 2.2 Create a reranking pipeline

]
}
```
Note: if you provide multiple filed names in `document_fields`, it will concat the value of all fields then do rerank.
Contributor

Suggested change
Note: if you provide multiple filed names in `document_fields`, it will concat the value of all fields then do rerank.
Note: if you provide multiple field names in `document_fields`, the values of all fields are first concatenated and then reranking is performed.
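
To make the `document_fields` behavior concrete, here is a small sketch that builds a rerank search pipeline body (sent with `PUT /_search/pipeline/<pipeline_name>`) and mimics the concatenation locally. The helper names, the example field `title`, and the space separator are assumptions; the separator the rerank processor actually uses is not stated in this thread.

```python
def build_rerank_pipeline(model_id, document_fields):
    """Build a search pipeline body with a rerank response processor (sketch)."""
    if not document_fields:
        raise ValueError("document_fields must contain at least one field name")
    return {
        "description": "Pipeline for reranking with a cross-encoder model",
        "response_processors": [
            {
                "rerank": {
                    "ml_opensearch": {"model_id": model_id},
                    "context": {"document_fields": list(document_fields)},
                }
            }
        ],
    }


def concatenate_fields(document, document_fields, separator=" "):
    """Illustrate joining several field values into one rerank context.

    The separator is an assumption made for this illustration only.
    """
    return separator.join(
        str(document[name]) for name in document_fields if name in document
    )


if __name__ == "__main__":
    fields = ["title", "passage_text"]  # "title" is a hypothetical second field
    print(build_rerank_pipeline("your_model_id", fields))
    print(concatenate_fields({"title": "Nevada", "passage_text": "Carson City"}, fields))
```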

Note: if you provide multiple filed names in `document_fields`, it will concat the value of all fields then do rerank.
### 2.2 Test reranking

You can tune `size` if you want to return less result. For example, set `"size": 2` if you want to return top 2 documents.
Contributor

Suggested change
You can tune `size` if you want to return less result. For example, set `"size": 2` if you want to return top 2 documents.
To return a different number of results, provide the `size` parameter. For example, set `size` to `4` to return the top four documents:
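
A sketch of a search body that combines `size` with the rerank query context, assuming the request is sent as `GET /<index>/_search?search_pipeline=<pipeline_name>`. The helper name `build_rerank_search_body` and the example query text are hypothetical; `passage_text` is the field used in the tutorial's sample documents.

```python
def build_rerank_search_body(query_text, size=4, field="passage_text"):
    """Build a search body whose hits are reranked against query_text (sketch)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return {
        # First-stage retrieval: a plain match query on the text field.
        "query": {"match": {field: query_text}},
        # Number of documents to return.
        "size": size,
        # Text the rerank processor scores each document against.
        "ext": {"rerank": {"query_context": {"query_text": query_text}}},
    }


if __name__ == "__main__":
    # Hypothetical query text for illustration.
    print(build_rerank_search_body("capital city of the United States", size=4))
```

Omitting the `search_pipeline` query parameter runs the same body without reranking, which makes it easy to compare the two result orders.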

}
}
```
Test without reranking pipeline:
Contributor

Suggested change
Test without reranking pipeline:
Test the query without a reranking pipeline:

}
}
```
The first document in the response is `Carson City is the capital city of the American state of Nevada`, which is incorrect.
Contributor

Suggested change
The first document in the response is `Carson City is the capital city of the American state of Nevada`, which is incorrect.
The first document in the response is `Carson City is the capital city of the American state of Nevada`, which is incorrect:

Signed-off-by: Yaliang Wu <[email protected]>
@ylwu-amzn ylwu-amzn temporarily deployed to ml-commons-cicd-env July 4, 2024 02:00 — with GitHub Actions Inactive
@ylwu-amzn
Collaborator Author

Some comments.

Thanks, addressed all comments.

@ylwu-amzn ylwu-amzn merged commit bffa32a into opensearch-project:main Jul 4, 2024
4 of 5 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 4, 2024
* add tutuorial for cross-encoder model on sagemaker

Signed-off-by: Yaliang Wu <[email protected]>

* add connector helper doc link

Signed-off-by: Yaliang Wu <[email protected]>

* remvoe title field

Signed-off-by: Yaliang Wu <[email protected]>

* address commnets

Signed-off-by: Yaliang Wu <[email protected]>

* use a better input format to invoke model

Signed-off-by: Yaliang Wu <[email protected]>

---------

Signed-off-by: Yaliang Wu <[email protected]>
(cherry picked from commit bffa32a)
ylwu-amzn added a commit that referenced this pull request Jul 4, 2024
(cherry picked from commit bffa32a)

Co-authored-by: Yaliang Wu <[email protected]>
mingshl pushed a commit to mingshl/ml-commons that referenced this pull request Jul 8, 2024
…t#2607)

@b4sjoo b4sjoo added the v2.16.0 Issues targeting release v2.16.0 label Jul 26, 2024
Labels
backport 2.x v2.16.0 Issues targeting release v2.16.0
5 participants