The publication is a collection of sample code to show how data from SAP and non-SAP systems can be made available for training in ANY hyperscaler machine learning service via several layers of abstraction from data connection to training using our FedML Python libraries.

SAP-samples/datasphere-fedml

REUSE status

FedML

Description


The SAP Federated ML Python libraries (FedML) apply the data federation architecture of SAP Datasphere to source SAP and non-SAP data for machine learning experiments on any machine learning platform, removing the need to replicate or move data. By abstracting data connection, data loading (for all ML platforms), model training (with support for user-provided training scripts), model deployment, and inferencing (for hyperscaler machine learning platforms), the FedML library offers end-to-end integration with just a few lines of code.
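The data connection and data loading steps described above can be pictured with a minimal sketch. This is not the FedML API: `fetch_view`, `demo_connection`, and the view name `SALES_VIEW` are hypothetical names chosen for illustration, and an in-memory SQLite database stands in for the SAP Datasphere SQL endpoint so that the example runs anywhere. It only shows the general pattern of querying an exposed view in place and handing rows plus column names to a training step.

```python
import sqlite3


def fetch_view(conn, view_name, limit=None):
    """Query an exposed view and return (rows, column_names).

    Hypothetical helper for illustration; works with any DB-API style
    connection that supports cursor(), execute() and description.
    """
    # Identifiers cannot be bound as SQL parameters, so accept only
    # simple alphanumeric/underscore names to avoid SQL injection.
    if not view_name.replace("_", "").isalnum():
        raise ValueError(f"unsafe view name: {view_name!r}")
    sql = f'SELECT * FROM "{view_name}"'
    params = ()
    if limit is not None:
        sql += " LIMIT ?"
        params = (int(limit),)
    cur = conn.cursor()
    cur.execute(sql, params)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    cur.close()
    return rows, columns


def demo_connection():
    """Build an in-memory SQLite stand-in for a federated source (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("EMEA", 120.0), ("APJ", 80.5), ("AMER", 200.0)],
    )
    # The view plays the role of a view exposed for consumption.
    conn.execute("CREATE VIEW SALES_VIEW AS SELECT region, revenue FROM sales")
    return conn


if __name__ == "__main__":
    rows, columns = fetch_view(demo_connection(), "SALES_VIEW")
    print(columns, rows)
```

In a real setup the connection would point at the Datasphere tenant rather than SQLite, and the returned rows would typically be wrapped in a dataframe before training.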

What's New

1. The new version of FedML (available as fedml-dsp on PyPI, V1.0.0):

  • Is machine learning platform-independent and can be used on any machine learning platform.
  • Supports NVIDIA RAPIDS™, cuDF, and CuPy, and can therefore be used to train models in GPU environments.
  • Supports sourcing data from SAP Datasphere models directly into PySpark and CuPy (for GPU) dataframes.
  • Supports SAP AI Core deployment: models trained on any ML platform (and containerized independently) can now be deployed to AI Core in SAP GenAI Hub with a couple of lines of code.
  • Supports writing inference results back to SAP Datasphere.

Solution Architecture

[Figure: solution architecture diagram]

2. FedML (Original, V2.0) for hyperscaler platforms (AWS, GCP, Azure, and Databricks):

  • Is pip-installable from PyPI, with a separate package for each hyperscaler platform.
  • Supports model training and deployment to the hyperscaler environment.
  • Supports deployment to the SAP Business Technology Platform Kyma environment.
  • Supports inferencing with hyperscaler-deployed as well as Kyma-deployed models.
  • Supports writing inference results back to SAP Datasphere.
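Writing inference results back, as mentioned in the lists above, can be sketched as follows. This is an illustration only and does not use the FedML API: `write_predictions`, the table name `PREDICTIONS`, and its two-column layout are assumptions, and SQLite again stands in for the SAP Datasphere target. `INSERT OR REPLACE` is SQLite syntax; other databases use a different upsert statement.

```python
import sqlite3


def write_predictions(conn, table_name, records):
    """Upsert (record_id, prediction) pairs and return the table's row count.

    Hypothetical helper for illustration; not part of FedML.
    """
    # Table names cannot be bound as parameters, so validate them first.
    if not table_name.replace("_", "").isalnum():
        raise ValueError(f"unsafe table name: {table_name!r}")
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table_name}" '
        "(record_id TEXT PRIMARY KEY, prediction REAL)"
    )
    # The connection context manager commits on success and rolls back
    # on error, so a failed batch leaves no partial rows behind.
    with conn:
        conn.executemany(
            f'INSERT OR REPLACE INTO "{table_name}" VALUES (?, ?)',
            list(records),
        )
    return conn.execute(f'SELECT COUNT(*) FROM "{table_name}"').fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(write_predictions(conn, "PREDICTIONS", [("a", 0.1), ("b", 0.9)]))
```

Keying the table on a record identifier means that re-running inference for the same records overwrites earlier predictions instead of duplicating them.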

Requirements

  • An SAP Datasphere tenant instance with connectivity established to the remote data sources and with views exposed for consumption by FedML.

  • Access to the corresponding machine learning platforms with appropriate configurations. See the Configuration section.

Download and Installation

Try out the examples in the samples-notebooks directory of the corresponding library folder.

Configuration

  • For FedML (platform-independent) library-specific prerequisites, configuration, and documentation, refer here
  • For AWS FedML library-specific prerequisites, configuration, and documentation, refer here
  • For GCP FedML library-specific prerequisites, configuration, and documentation, refer here
  • For Azure FedML library-specific prerequisites, configuration, and documentation, refer here
  • For Databricks FedML library-specific prerequisites, configuration, and documentation, refer here

Limitations

None

How to obtain support

This project is provided "as-is" with no expectation for major changes or support.
Create an issue in this repository if you find a bug or have questions about the content.
For additional support, ask a question in SAP Community.

Licensing

Copyright (c) 2021 SAP SE or an SAP affiliate company. All rights reserved. This project is licensed under the Apache Software License, version 2.0 except as noted otherwise in the LICENSE file.