chuanlihao/spark-examples

This repo provides example applications that demonstrate the RAPIDS.ai GPU-accelerated XGBoost-Spark project.

There are three example apps included in this repo: Mortgage, Taxi, and Agaricus.

Build Examples Jar

Our examples rely on cuDF and XGBoost.

Follow these steps to build the jar:

git clone https://github.com/rapidsai/spark-examples.git
cd spark-examples
mvn package -DxgbClassifier=cuda10 # omit xgbClassifier for cuda 9.2
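If the build is scripted, the choice between the two forms of the Maven command above can be made from the CUDA version. The following is only a minimal sketch: `build_command` is a hypothetical helper name, and treating every 10.x version as covered by `cuda10` is an assumption, not something this README states.

```shell
#!/usr/bin/env bash
# Sketch: print the Maven command matching a CUDA version.
# build_command is a hypothetical helper, not part of this repo.
build_command() {
  case "$1" in
    # Assumption: any CUDA 10.x release uses the cuda10 classifier.
    10*)  echo "mvn package -DxgbClassifier=cuda10" ;;
    # Per the build step above, the flag is omitted for CUDA 9.2.
    9.2*) echo "mvn package" ;;
    # Other versions are not mentioned here, so refuse rather than guess.
    *)    echo "unsupported CUDA version: $1" >&2; return 1 ;;
  esac
}

build_command 10.0
```

The function only prints the command; run the printed line from the spark-examples directory to perform the build.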

Getting Started Guides

Try one of the Getting Started guides below. As written, they target the Mortgage dataset, but with a few changes to EXAMPLE_CLASS, trainDataPath, and evalDataPath, they can be easily adapted to the Taxi or Agaricus datasets.
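One way to keep the three settings in sync when switching datasets is to derive them from a single dataset name. This is a rough sketch only: `select_example` is a hypothetical helper, and the class name and data paths are placeholders that must be replaced with the real values from the guide you follow.

```shell
#!/usr/bin/env bash
# Sketch: set the three per-dataset variables from one dataset name.
# select_example, the class naming scheme, and the /data paths are all
# hypothetical placeholders, not values taken from this repo.
select_example() {
  local dataset="$1"
  case "$dataset" in
    mortgage|taxi|agaricus)
      EXAMPLE_CLASS="com.example.${dataset}.Main"   # placeholder class name
      trainDataPath="/data/${dataset}/train"        # placeholder path
      evalDataPath="/data/${dataset}/eval"          # placeholder path
      ;;
    *)
      echo "unknown dataset: ${dataset}" >&2
      return 1
      ;;
  esac
}

select_example taxi
echo "EXAMPLE_CLASS=${EXAMPLE_CLASS}"
echo "trainDataPath=${trainDataPath}"
echo "evalDataPath=${evalDataPath}"
```

Keeping the mapping in one place avoids mixing, for example, a Taxi class with Mortgage data paths.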

A small dataset for each example is available in the datasets folder. These datasets are provided only for convenience. To test performance, please prepare a larger dataset by following Preparing Datasets. We also provide a larger dataset, Mortgage Dataset (1 GB uncompressed), which is used in the guides below.

These examples use default parameters for demo purposes. For a full list of parameters, please see Supported XGBoost Parameters.

Contact Us

Please see the RAPIDS website for contact information.

License

This content is licensed under the Apache License 2.0.
