Feature fraud detection 1.2.0 #148

AlexanderOllman · 2024-02-07T19:29:35Z

Provide a clear and concise description of the content changes you're proposing. List all the changes you are making
to the content.

Updated entire main notebook, providing more context to cells and splitting cells into sections.
Applied v1.2 variable fix (should already be present in release-1.2.0 version)
Updated README with new pyenv instructions, including new descriptive flow diagram (to be standard across new demos)

If there is no issue related to this PR, kindly create one first to describe the motivation behind these changes.

Checklist:

I have checked that my enhancements are not duplicates of existing content changes or additions.
I have tested the changes in a working environment to ensure they function as intended.
I have followed the style guide outlined in the contribution guidelines.

Reviewer's Tasks (for maintainers reviewing this PR):

Verify that the tutorial functions correctly in a live environment.
Verify that the updated content aligns with the style guide in the contribution guidelines.
Check for consistency, grammar, and clarity throughout the updated content.
Check that the related GitHub issue is up-to-date.

Signed-off-by: Dimitris Poulopoulos <[email protected]>

Use a set of public docker images by default, since new clusters do not have the permissions to pull images from the gcr.io registry. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Signed-off-by: Dimitris Poulopoulos <[email protected]>

Change the default docker images

bike-sharing: Call the mlflow library after imports

Create a new 'applications' directory to house tutorials that are tailored to specific applications. Relocate the following tutorials to this new directory:
* feast: ride-sharing
* kubeflow-pipelines: financial-time-series
* mlflow: bike-sharing
* ray: news-recommendation
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Create a new 'integration-tutorials' directory to house tutorials that showcase the integration of different applications inside the EzUA platform. Relocate the following tutorials to this new directory:
* fraud-detection
* house-pricing
* investment-banking
* loan-approval
* question-answering
* wind-turbine
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Delete any unused or deprecated tutorial or file from the 'ezua-tutorials' catalogue. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Signed-off-by: Dimitris Poulopoulos <[email protected]>

Revise the README file to explain the structure of the repository and detail how to get started, the requirements for following these tutorials, and where to get help. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Introduce the 'CONTRIBUTING.md' file to specify the contribution guidelines of the repository. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Refactor the repository to adhere to the following structure for better readability and accessibility:
* Demos: Quick, low-code guides that showcase the platform's features
* Tutorials: Detailed, slow-paced guides designed to teach the functionalities of various tools
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Add a Pull Request (PR) template for content enhancements. This template provides sections that contributors should complete to create a thorough and detailed PR.

Introduce the GitHub issue templates through which contributors can submit bug reports or new tutorial requests. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Add an `environment.yaml` file to capture the dependencies of the tutorial. This file creates a new conda environment, called `ride-sharing`, which installs every library the tutorial needs to run. Refs #62 Signed-off-by: Dimitris Poulopoulos <[email protected]>

Improve the Feast Ride Sharing tutorial by:
- Updating the code to work with the latest version of the Feast Python client.
- Extending the Notebook documentation and correcting syntax and grammar.
- Using the training dataset ingested in Feast to train a simple model.
- Simplifying the `definitions.py` file.
Refs #62 Signed-off-by: Dimitris Poulopoulos <[email protected]>

Refs #62 Signed-off-by: Dimitris Poulopoulos <[email protected]>

Add a fresh README that:
- Describes the tutorial's focus.
- Outlines prerequisites for getting started.
- Guides users on execution.
- Includes a references section.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Annotate the Notebooks of the MLflow example to provide more information about the tutorial and its execution. Closes #70 Signed-off-by: Dimitris Poulopoulos <[email protected]>

Remove the unnecessary dependencies from the `environment.yaml` file. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Remove the KServe manifests. They are now generated as part of the Notebook execution. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Update the README file to include information about the demo procedure and a references section. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Improve the Notebook file by separating the model deployment section into a new Pipeline step. Also, set a new variable to hold the current user, as pipeline steps might have no notion of the environment variable "USER". Closes #68 Signed-off-by: Dimitris Poulopoulos <[email protected]>
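
For reviewers, here is a minimal sketch of the idea behind the new user variable. It is not the notebook's actual code: `resolve_user` and `deploy_step` are hypothetical names, and the real pipeline step is replaced by a plain function. The user name is read once, where the environment is available, and then passed to the step explicitly so the step never reads "USER" itself.

```python
import os


def resolve_user(env=None, default="unknown"):
    """Read the user name once, where the environment is available.

    `env` is any mapping; it defaults to the process environment.
    """
    env = os.environ if env is None else env
    return env.get("USER") or default


def deploy_step(user):
    """Hypothetical pipeline step: it only sees what is passed to it."""
    return f"deploying model for user: {user}"


# In the notebook, resolve the user up front and hand it to the step.
# An explicit mapping is used here so the example is reproducible.
current_user = resolve_user({"USER": "alice"})
message = deploy_step(current_user)
```

Passing the value as an argument keeps the step independent of whatever environment the pipeline runtime provides.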

Update the README file to follow the contributing guidelines. Refs #68 Signed-off-by: Dimitris Poulopoulos <[email protected]>

Enhance the Notebook user experience by:
- Introducing a code cell to upload the dataset to its appropriate path prior to its use inside the Spark interactive session. This fixes the error where Spark tries to load the dataset from a location that does not exist.
- Refining Notebook annotations for a clearer tutorial flow.
Closes #64 Signed-off-by: Dimitris Poulopoulos <[email protected]>

Eliminate the redundant Python script that duplicates the code from the notebook, offering no additional value to this tutorial. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Enhance the README file by:
- Introducing a 'Procedure' section that walks the user through the necessary steps for a successful run.
- Incorporating a "How it Works" section that elucidates what Livy and Sparkmagic are, and how they collaboratively streamline interactions with a Spark cluster.
- Including a 'References' section, providing links for extended reading.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Make the LLM predictor Docker image lighter:
- Remove the LangChain dependency: LangChain is no longer needed, since it was only used as a wrapper around the GPT4ALL Python library. The predictor now uses the GPT4ALL library directly to generate text.
- Remove the model: The model is no longer part of the Docker image. The predictor is now responsible for downloading the model at runtime. This also permits us to change the model type (i.e., architecture) by passing an argument to the pod command (--model).
These changes yield a lighter image that is faster to build and push.
Refs #57 Signed-off-by: Dimitris Poulopoulos <[email protected]>
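
As an illustration only, the `--model` argument mentioned above could be parsed along these lines. This is a sketch under assumptions, not the predictor's real entry point: only the `--model` flag comes from the commit message, while `build_parser`, `parse_args`, and the default value `example-model` are placeholders.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical predictor entry point."""
    parser = argparse.ArgumentParser(
        description="Hypothetical LLM predictor entry point"
    )
    # Only the flag name comes from the commit message; the default
    # value is a placeholder, not a real model identifier.
    parser.add_argument(
        "--model",
        default="example-model",
        help="Name of the model to download when the pod starts",
    )
    return parser


def parse_args(argv=None):
    """Parse `argv`, or the process arguments when `argv` is None."""
    return build_parser().parse_args(argv)


# Changing the model becomes a matter of changing the pod command.
args = parse_args(["--model", "another-model"])
```

With the model chosen at start-up rather than baked in, the same image can serve different architectures.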

Refine the LLM Transformer Docker image:
- Clean up the code: Remove unused imports.
- Introduce new features: Add the `num_docs` argument for controlling the document retrieval count.
- Upgrade the KServe dependency: Pin the KServe dependency to `0.11.0` instead of `0.11.0rc0`.
Refs #57 Signed-off-by: Dimitris Poulopoulos <[email protected]>

Enhance the Vector Store Docker image:
- Support new features: Introduce the `num_docs` argument to control the document retrieval count.
- Pin dependencies: Pin all dependencies to a version that works for this tutorial.
- Fetch Torch+CPU: Download the CPU variant of PyTorch, which makes the image much lighter.
Refs #57 Signed-off-by: Dimitris Poulopoulos <[email protected]>
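
To make the effect of `num_docs` concrete, here is a small sketch of capping the retrieval count. It is an assumption-laden stand-in: the tutorial's vector store presumably ranks documents by embedding similarity, whereas this hypothetical `retrieve` function ranks by simple word overlap so that it runs without any dependencies.

```python
def retrieve(query, documents, num_docs=2):
    """Return up to `num_docs` documents, best match first.

    Ranking uses word overlap with the query, which is a stand-in
    for the similarity search a real vector store would perform.
    """
    query_words = set(query.lower().split())

    def score(doc):
        # Count the query words that also appear in the document.
        return len(query_words & set(doc.lower().split()))

    # sorted() is stable, so ties keep their original order.
    ranked = sorted(documents, key=score, reverse=True)
    return ranked[:num_docs]


docs = [
    "spark cluster setup",
    "model serving with kserve",
    "kserve model deployment guide",
]
top_one = retrieve("kserve model", docs, num_docs=1)
```

A smaller `num_docs` keeps the context passed to the LLM short; a larger one trades prompt length for recall.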

Improve the Notebook file by:
- Fixing typos and wording.
- Fixing the code cells to adhere to the 69-character limit of PEP-8.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Improve the README file by:
- Adding a "How it Works" section.
- Fixing typos and wording.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Improve the Notebook file by:
- Fixing typos and wording.
- Changing code cells and Python code to adhere to the 69-character limit of PEP-8.
- Using `dataset` as the data directory. We use this name to standardise the directory that houses the datasets.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Improve the README file by:
- Fixing typos and wording.
- Adding a table of contents.
- Adding a "How it Works" section.
- Adding a "Clean Up" section.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Improve the Notebooks by:
- Adding a table of contents.
- Fixing typos and wording.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Move the Bike Sharing example into the `demos` directory as it integrates more than one tool (i.e., MLflow and KServe). Signed-off-by: Dimitris Poulopoulos <[email protected]>

Create a user interface, using the Gradio framework, for the fraud detection application:
- Add the source code and the Dockerfile to build the Docker image.
- Add the Helm chart that installs the application.
- Amend the README instructions to include the new UI.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

prakashmirji · 2024-02-29T02:21:51Z

Why are so many files deleted in this PR?

ask664 · 2024-03-12T08:06:29Z

Do not merge without approval.

sercanCyberVision · 2024-03-12T15:45:23Z

Why do we delete all the Ray examples and README files and leave only one notebook for a single example?

All the detailed explanations, the GPU example, and the Fibonacci example are gone.

I have a ticket under which I am restructuring and improving the Ray examples: https://jira-pro.it.hpe.com:8443/browse/EZAF-4409. With this ticket, I will:

Keep all tutorials in one folder named Ray.
Have two sub-folders, GPU and CPU.
Have 2 CPU examples and 1 GPU example.
Check and update the README files if necessary.
Make sure that each tutorial has its own point and purpose.

If we intend to improve or simplify our tutorials, let's agree on some standards, create tickets for each app, and let the app owners make the changes.

The changes in this PR for Ray do not align with https://jira-pro.it.hpe.com:8443/browse/EZAF-4409. If we still need to merge this PR, @dpoulopoulos please exclude the changes related to Ray; I will be working on them.

@prakashmirji @prasadadireddi @ask664

sercanCyberVision · 2024-03-13T20:55:14Z

Please see Ray tutorial PR #155

Dimitris Poulopoulos and others added 30 commits September 1, 2023 18:36

question-answering: Transform raw cell to code cell

0b4ac6a

Signed-off-by: Dimitris Poulopoulos <[email protected]>

question-answering: Use a set of public images by default

e3c582f

Use a set of public docker images by default, since new clusters do not have the permissions to pull images from the gcr.io registry. Signed-off-by: Dimitris Poulopoulos <[email protected]>

bike-sharing: Call the mlflow library after imports

8750cb5

Signed-off-by: Dimitris Poulopoulos <[email protected]>

Merge pull request #40 from HPEEzmeral/feature-qna-public-images

683bcc3

Change the default docker images

Merge pull request #39 from HPEEzmeral/feature-fix-bike-sharing

d65f4da

bike-sharing: Call the mlflow library after imports

Delete unused tutorials and files

fd70a66

Delete any unused or deprecated tutorial or file from the 'ezua-tutorials' catalogue. Signed-off-by: Dimitris Poulopoulos <[email protected]>

chore: Enrich '.gitignore'

a7416c1

Signed-off-by: Dimitris Poulopoulos <[email protected]>

Update README

f6cb121

Revise the README file to explain the structure of the repository and detail how to get started, the requirements for following these tutorials, and where to get help. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Introduce the contributions guide

b4ad358

Introduce the 'CONTRIBUTING.md' file to specify the contribution guidelines of the repository. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Introduce the PR template

60ff1fb

Add a Pull Request (PR) template for content enhancements. This template provides sections that contributors should complete to create a thorough and detailed PR.

Introduce the templates for new issues

bf44e11

Introduce the GitHub issue templates through which contributors can submit bug reports or new tutorial requests. Signed-off-by: Dimitris Poulopoulos <[email protected]>

feast: Delete the outdated Notebook file

ad2d565

Refs #62 Signed-off-by: Dimitris Poulopoulos <[email protected]>

feast: Add a README File

0f91fd8

Add a fresh README that:
- Describes the tutorial's focus.
- Outlines prerequisites for getting started.
- Guides users on execution.
- Includes a references section.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

mlflow: Improve the MLflow Notebooks

41ea2dd

Annotate the Notebooks of the MLflow example to provide more information about the tutorial and its execution. Closes #70 Signed-off-by: Dimitris Poulopoulos <[email protected]>

Amend the environment.yaml file

f405c4e

Remove the unnecessary dependencies from the `environment.yaml` file. Signed-off-by: Dimitris Poulopoulos <[email protected]>

mlflow: Remove not needed manifests

527a264

Remove the KServe manifests. They are now generated as part of the Notebook execution. Signed-off-by: Dimitris Poulopoulos <[email protected]>

mlflow: Amend the README file

f2e3b15

Update the README file to include information about the demo procedure and a references section. Signed-off-by: Dimitris Poulopoulos <[email protected]>

fraud-detection: Update the README file

10dd604

Update the README file to follow the contributing guidelines. Refs #68 Signed-off-by: Dimitris Poulopoulos <[email protected]>

wind-turbine: Remove the redundant Python script

ada544e

Eliminate the redundant Python script that duplicates the code from the notebook, offering no additional value to this tutorial. Signed-off-by: Dimitris Poulopoulos <[email protected]>

Dimitris Poulopoulos and others added 15 commits October 13, 2023 17:46

wind-turbine: Improve the Notebook file

27a6d74

Improve the Notebook file by:
- Fixing typos and wording.
- Fixing the code cells to adhere to the 69-character limit of PEP-8.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

feast: Improve the README file

42e8b7d

Improve the README file by:
- Adding a "How it Works" section.
- Fixing typos and wording.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

mlflow: Improve the README file

5050744

Improve the README file by:
- Fixing typos and wording.
- Adding a table of contents.
- Adding a "How it Works" section.
- Adding a "Clean Up" section.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

mlflow: Improve the Notebooks

9c5f049

Improve the Notebooks by:
- Adding a table of contents.
- Fixing typos and wording.
Signed-off-by: Dimitris Poulopoulos <[email protected]>

Move the Bike Sharing example

6f2ca48

Move the Bike Sharing example into the `demos` directory as it integrates more than one tool (i.e., MLflow and KServe). Signed-off-by: Dimitris Poulopoulos <[email protected]>

Updated Fraud Detection to 1.2.0

c63cd77

Added workflow diagram and variables to notebook. Updated README.

27ca9c7

changed README headers

fce3d3a

changed README headers

29b8cd0

attempting to remove bottom border

8165a57

trying another README header

a750ade

Updated README header

69ae752

Removed ToC from README, edited final sections of notebook

1773f94

AlexanderOllman changed the base branch from release-1.2.0 to release/fy24-q1 February 7, 2024 19:32

ask664 requested review from skandtandon, akravacyber, sercanCyberVision and AyushSinha5588 March 12, 2024 08:07

ask664 requested review from umka1332, Sourabh0511, Bhargavjd, mayankgarg-23 and jacob-gordon-hpe March 12, 2024 15:59

Update environment.yaml

9543de5
