
Prepare packages and dataset for PySpark

For simplicity, export the locations of these jars as environment variables. All examples assume the packages and dataset are placed in the /opt/xgboost directory.
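As a minimal sketch of what exporting these locations could look like from Python, the helper below finds the plugin jar under the base directory and records the paths as environment variables. The variable names `SPARK_XGBOOST_DIR` and `RAPIDS_JAR`, the function names, and the assumption that the jar filename starts with `rapids-4-spark` are illustrative choices, not something this guide specifies.

```python
import os
from pathlib import Path


def package_locations(base_dir="/opt/xgboost", jar_prefix="rapids-4-spark"):
    """Collect package paths under base_dir, keyed by environment-variable name.

    The key names and the jar_prefix default are assumptions for illustration.
    """
    base = Path(base_dir)
    locations = {"SPARK_XGBOOST_DIR": str(base)}
    # If several matching jars are present, take the last one in sorted order.
    jars = sorted(base.glob(f"{jar_prefix}*.jar"))
    if jars:
        locations["RAPIDS_JAR"] = str(jars[-1])
    return locations


def export_locations(locations, environ=None):
    """Write the locations into environ (os.environ by default) and return it."""
    environ = os.environ if environ is None else environ
    environ.update(locations)
    return environ
```

Calling `export_locations(package_locations())` sets the variables for the current Python process and any child processes it starts. It does not change the parent shell, so a shell `export` is still needed if later commands run from that shell.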

Download the jars

Download the RAPIDS Accelerator for Apache Spark plugin jar

RAPIDS Spark Package

Build XGBoost Python Examples

Follow this guide to build samples.zip and main.py, then copy them to /opt/xgboost.
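The copy step can be sketched in Python as follows. The `build_dir` argument is a hypothetical placeholder for wherever the build leaves its output, since that location comes from the linked guide rather than from this page. The helper refuses to copy anything if either file is missing.

```python
import shutil
from pathlib import Path


def copy_examples(build_dir, target_dir="/opt/xgboost",
                  names=("samples.zip", "main.py")):
    """Copy the built example files from build_dir into target_dir.

    build_dir is a placeholder for the build output location (an assumption).
    """
    source = Path(build_dir)
    target = Path(target_dir)
    # Check everything up front so a partial copy never happens.
    missing = [name for name in names if not (source / name).is_file()]
    if missing:
        raise FileNotFoundError(f"not found in {source}: {', '.join(missing)}")
    target.mkdir(parents=True, exist_ok=True)
    # copy2 preserves file metadata such as modification times.
    return [Path(shutil.copy2(source / name, target / name)) for name in names]
```

Writing to /opt/xgboost usually requires suitable permissions on that directory, so the call may need to run as a user who owns it.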

Download dataset

You need to copy the dataset to /opt/xgboost. Use the following links to download the data.