A journey into the world of machine learning algorithms using Apache Spark.
- Java
- Gradle
- Scala
- [Spark](https://spark.apache.org/)
git clone https://github.com/ramuramaiah/spark-odyssey.git
./gradlew clean jar
The commands to run the Spark jobs are available in the batch files. For example, to run the collocation algorithm, use colloc.bat, which has the following entry:
colloc.bat
%SPARK_HOME%/bin/spark-submit ^
--class spark.odyssey.colloc.Driver ^
--jars file:///C:/mahout/lib/mahout-math-0.13.0.jar ^
--master local ^
--deploy-mode client ^
--driver-memory 4g ^
--executor-memory 2g ^
--executor-cores 1 ^
--queue colloc ^
build/libs/spark-odyssey.jar ^
--algo g_2 ^
-s 1 ^
"./build/resources/main/input_events.csv" ^
"./output"
- Spark - 2.1.0
- Raise an issue on GitHub
- Send me a mail -> [email protected]