Updated Ray version to 2.20.0 #530

Merged 3 commits on Jun 19, 2024

@Bobbins228 (Contributor) commented May 8, 2024

Issue link

RHOAIENG-6450

What changes have been made

Updated the CodeFlare SDK (CFSDK) Ray dependency to 2.20.0.

Verification steps

Setup

Notebook server ODH/RHOAI/Local

  • Clone this repository with git clone https://github.com/project-codeflare/codeflare-sdk.git
  • Check out this PR's branch
  • Run poetry build (install Poetry first if needed: pip install poetry)
  • Run pip install --force-reinstall dist/codeflare_sdk-0.0.0.dev0-py3-none-any.whl
  • Restart your notebook kernel
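
The setup steps above can be sketched as a small Python helper. This is only an illustration, not part of the PR: the local branch name pr-530 is a hypothetical choice, and fetching pull/530/head relies on GitHub's standard pull request refs. By default it only prints the commands (dry run) so nothing is executed unless requested.

```python
import shlex
import subprocess

REPO_URL = "https://github.com/project-codeflare/codeflare-sdk.git"
REPO_DIR = "codeflare-sdk"
WHEEL = "dist/codeflare_sdk-0.0.0.dev0-py3-none-any.whl"
# Hypothetical local branch name for this PR (#530); any name works.
LOCAL_BRANCH = "pr-530"


def setup_steps():
    """Return (working_directory, argv) pairs mirroring the setup list."""
    return [
        (".", ["git", "clone", REPO_URL]),
        # GitHub exposes each pull request under refs/pull/<number>/head.
        (REPO_DIR, ["git", "fetch", "origin", f"pull/530/head:{LOCAL_BRANCH}"]),
        (REPO_DIR, ["git", "checkout", LOCAL_BRANCH]),
        (REPO_DIR, ["pip", "install", "poetry"]),
        (REPO_DIR, ["poetry", "build"]),
        (REPO_DIR, ["pip", "install", "--force-reinstall", WHEEL]),
    ]


def run_setup(dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, argv in setup_steps():
        print(f"[{cwd}] {shlex.join(argv)}")
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run_setup(dry_run=True)
```

Restarting the notebook kernel remains a manual step after the install finishes.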

Testing

Use this image for the following test scenarios: quay.io/project-codeflare/ray:2.20.0-py39-cu118

  • Run through demo notebooks
  • Ensure GPU utilization is working correctly
  • Ensure basic & local interactive demos work correctly
  • Ensure job submission works correctly
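
When comparing test results across images, it can help to read the Ray, Python, and CUDA versions out of the image tag. The sketch below assumes the tag follows the ray-pyXY-cuNNN pattern seen in the image above, that the first digit after "py" is the Python major version, and that the last digit after "cu" is the CUDA minor version. The parse_ray_image name is hypothetical and not part of the SDK.

```python
import re
from typing import NamedTuple


class RayImageTag(NamedTuple):
    ray_version: str
    python_version: str
    cuda_version: str


# Assumed tag layout: <ray version>-py<major><minor>-cu<major><minor digit>
_TAG_RE = re.compile(r"^(?P<ray>\d+\.\d+\.\d+)-py(?P<pymajor>\d)(?P<pyminor>\d+)-cu(?P<cuda>\d{2,})$")


def parse_ray_image(image: str) -> RayImageTag:
    """Split an image reference such as repo/ray:2.20.0-py39-cu118 into versions."""
    _, _, tag = image.rpartition(":")
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognised image tag: {tag!r}")
    cuda = match["cuda"]
    return RayImageTag(
        ray_version=match["ray"],
        python_version=f"{match['pymajor']}.{match['pyminor']}",
        cuda_version=f"{cuda[:-1]}.{cuda[-1]}",
    )


if __name__ == "__main__":
    print(parse_ray_image("quay.io/project-codeflare/ray:2.20.0-py39-cu118"))
```

Tags that do not follow this layout raise ValueError rather than being guessed at.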

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • Testing is not required for this change

@openshift-merge-robot added the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) on May 8, 2024
@openshift-merge-robot removed the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) on May 8, 2024
@Bobbins228 force-pushed the ray-220 branch 2 times, most recently from 3b9d029 to 0b9a6fa, on May 8, 2024 13:17
@Fiona-Waters (Contributor) commented

Tested these changes through the SDK demo notebooks using quay.io/mcampbel/ray:220-py39-cu118-dev as the ray image.

  • I was able to run the basic Ray demo.
  • I was able to run the job client demo and use GPUs.
  • While running the basic_interactive demo I saw an info message (see screenshot below). I could not run through the entire demo because of issues with the training script, which I am currently looking into, but ray.init ran successfully.
  • I was able to run the local interactive demo, including ray.init, but again saw the info message about the Python version mismatch.
    [screenshot of the info message]

This is not an error as such and I was able to submit jobs successfully.
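
The exact text of the info message is not reproduced in this thread. Assuming it reports a difference between the Python version of the notebook and the Python version inside the Ray image, a quick comparison of the two could look like the sketch below. The describe_mismatch helper and the cluster version "3.9.18" are hypothetical and used only for illustration.

```python
import sys
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '3.9.18'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def describe_mismatch(local: str, cluster: str) -> Optional[str]:
    """Return a short note if major.minor differ, otherwise None."""
    local_mm = major_minor(local)
    cluster_mm = major_minor(cluster)
    if local_mm == cluster_mm:
        return None
    return (
        f"Python version mismatch: local {local_mm[0]}.{local_mm[1]} "
        f"vs cluster {cluster_mm[0]}.{cluster_mm[1]}"
    )


if __name__ == "__main__":
    local_version = ".".join(str(part) for part in sys.version_info[:3])
    # "3.9.18" is a placeholder for the Python version in the Ray image.
    print(describe_mismatch(local_version, "3.9.18"))
```

Patch-level differences are ignored here; only the major and minor versions are compared.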

@astefanutti (Contributor) left a comment

/lgtm

Review comment on .github/workflows/e2e_tests.yaml (outdated, resolved)
@openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) on May 15, 2024
@openshift-ci bot removed the lgtm label (Indicates that a PR is ready to be merged.) on May 27, 2024
@Bobbins228 (Contributor, Author) commented

/retest

@Bobbins228 (Contributor, Author) commented

The e2e tests fail here because of the CertGenerator image on the CodeFlare Operator (CFO) side when setting up the Kind cluster.
Here is proof of a passing e2e test with the new Ray image for the CertGenerator.

I have opened a PR to update the CFO with the new image.

@Bobbins228 (Contributor, Author) commented

/retest

@openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) on Jun 19, 2024
@Fiona-Waters (Contributor) commented

/approve

openshift-ci bot (Contributor) commented Jun 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Fiona-Waters, Srihari1192

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on Jun 19, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit c51ab98 into project-codeflare:main Jun 19, 2024
4 checks passed
Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • lgtm: Indicates that a PR is ready to be merged.
6 participants