Orange3-Spark

A set of widgets for the Orange data mining suite that work with the Apache Spark ML API.

Requirements

Python >= 3.4
Pandas
Orange 3

Please follow the instructions to install Orange 3 first.

The main Orange project is hosted at: https://github.com/biolab/orange3
Download from: http://orange.biolab.si
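As a quick sanity check before installing the add-on, the requirements listed above can be verified from Python. This is a minimal sketch and not part of Orange3-Spark: the helper name `check_requirements` is hypothetical, and it assumes that Pandas and Orange 3 are importable as `pandas` and `Orange`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 4)  # minimum version listed in the requirements
REQUIRED_MODULES = ("pandas", "Orange")  # assumed import names for Pandas and Orange 3


def check_requirements(modules=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d or newer is required" % min_python)
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(name) is None:
            problems.append("missing module: " + name)
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

If the script prints nothing, the interpreter and both packages were found; it does not check package versions.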

Features

A Spark Context.
A Hive Table.
A Dataframe from an SQL Query.
A Dataset Builder, basically a call to VectorAssembler. This is useful before sending data to Estimators.
Transformers from the feature module.
Estimators from the classification module.
Estimators from the regression module.
Estimators from the clustering module.
Evaluation from the evaluator module.
A PySpark script executor + PySpark console.
DataFrame transformers for Pandas and Orange Tables.

... more coming soon!
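The Dataset Builder item above is described as a call to VectorAssembler. In plain PySpark, outside Orange, that step looks roughly like the sketch below. The function names `build_dataset` and `demo`, the column names, and the toy data are all hypothetical, and the code assumes Spark 2.0 or later for `SparkSession`. It illustrates the underlying Spark ML call, not the widget's actual implementation.

```python
def build_dataset(df, feature_columns, output_column="features"):
    """Combine several numeric columns into the single vector column Estimators expect."""
    # Imported lazily so this sketch can be loaded without PySpark installed
    from pyspark.ml.feature import VectorAssembler

    assembler = VectorAssembler(inputCols=list(feature_columns), outputCol=output_column)
    # transform() returns a new DataFrame with the extra vector column appended
    return assembler.transform(df)


def demo():
    """Run build_dataset on a tiny local DataFrame (requires a working Spark install)."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("dataset-builder-sketch").getOrCreate()
    # Hypothetical toy data: two feature columns and a label
    df = spark.createDataFrame(
        [(1.0, 2.0, 0.0), (3.0, 4.0, 1.0)],
        ["x1", "x2", "label"],
    )
    build_dataset(df, ["x1", "x2"]).show()
    spark.stop()
```

Calling `demo()` on a machine with Spark installed should display the original columns plus a `features` column holding the assembled vectors, which is the form Spark ML Estimators take as input.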

Installing

First, you need to have Apache Spark installed. Follow the instructions here: http://spark.apache.org/docs/latest/

Then you can do:

pip install Orange3-spark

or install the add-on from Orange's Options | Add-ons menu. Note that installation from the Add-ons menu may fail if some requirements cannot be satisfied.

If you require ODBC connectivity, you need to install pyodbc. Building it with pip requires the sql.h header, which is provided by the unixodbc-dev package on Linux.

If the installation succeeds, you should see a new section in Orange containing a series of widgets from the Spark ML API.
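Besides looking for the new section in Orange, the pip installation can be confirmed from Python. The sketch below is an illustration only: the helper name `installed_version` is hypothetical, the distribution name is taken from the pip command above, and `importlib.metadata` needs Python 3.8 or newer, which is later than the minimum version this add-on lists.

```python
from importlib import metadata


def installed_version(distribution="Orange3-spark"):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print("Orange3-spark:", version if version else "not installed")
```

This only confirms that the package is registered in the current environment; it does not check that the widgets load inside Orange.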
