Type-safe, high-performance, distributed neural networks in Scala (not Python, finally...).

Low-level (linear algebra) operations are powered by the low-level TensorFlow API (C/C++ bindings via JNI). Scala is used to build computation graphs and compile them into native tensor graphs. Compiled graphs are fully calculated in native code (on CPU, GPU, or TPU), and only the result is returned back via a `DirectBuffer` which points into native memory. The `DirectBuffer` is wrapped in a read-only `Tensor` object which allows slicing and reading the data in a convenient way (just like `Breeze` or `NumPy` do).
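Conceptually, the wrapping can be pictured with plain `java.nio` buffers, as in the sketch below. This is only an illustration of the idea (a zero-copy, read-only, sliceable view over native memory), not the library's actual `Tensor` implementation:

```scala
import java.nio.{ByteBuffer, ByteOrder, FloatBuffer}

// Illustrative stand-in for the library's Tensor: a read-only view over
// native memory that can be sliced without copying the data.
class TensorView(buffer: FloatBuffer, val shape: List[Int]) {
  private def stride: Int = shape.tail.product
  // Narrow the view to the i-th sub-tensor along the first dimension (no copy).
  def slice(i: Int): TensorView = {
    val dup = buffer.duplicate()
    dup.position(i * stride).limit((i + 1) * stride)
    new TensorView(dup.slice(), shape.tail)
  }
  // Data leaves native memory only on an explicit read.
  def toArray: Array[Float] = {
    val out = new Array[Float](buffer.remaining())
    buffer.duplicate().get(out)
    out
  }
}

val native = ByteBuffer.allocateDirect(4 * 6).order(ByteOrder.nativeOrder()).asFloatBuffer()
(0 until 6).foreach(i => native.put(i, i.toFloat))
val tensor = new TensorView(native, List(2, 3))
tensor.slice(1).toArray // Array(3.0, 4.0, 5.0)
```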
The optimizer is built on top of `Spark` and can optimize the model in a distributed/parallel way. The chosen algorithm is data parallelism with synchronous model averaging: the dataset is split between the workers, each epoch runs independently on each data split, and at the end of each epoch the parameters are averaged and broadcast back to each worker.
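Schematically, one such epoch looks like the sketch below (plain Scala; `trainEpoch` is a placeholder for the real per-worker training step, and in the actual implementation the splits live on `Spark` workers):

```scala
// One epoch of data-parallel training with synchronous model averaging.
// Each worker trains a copy of the model on its own split; afterwards the
// weights are averaged element-wise and broadcast back for the next epoch.
def trainEpoch(weights: Array[Float], split: Seq[Array[Float]]): Array[Float] =
  ??? // placeholder: run one epoch of (e.g.) SGD over the split

def averagedEpoch(weights: Array[Float], splits: Seq[Seq[Array[Float]]]): Array[Float] = {
  val perWorker = splits.map(split => trainEpoch(weights, split)) // one result per worker
  perWorker.transpose.map(ws => ws.sum / ws.length).toArray       // synchronous average
}
```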
The input data is expected to be a `Dataset[Array[TensorType]]` which carries the shape of the tensors in its metadata. Usually, `TensorType` is chosen to be `Float` since it performs best on GPU; `Double` can be used as well.
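For illustration, such a dataset could be assembled with vanilla `Spark` roughly as below. Note that the `"shape"` metadata key is an assumption made for this sketch; the library defines its own metadata convention:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.MetadataBuilder

val spark = SparkSession.builder.master("local[*]").getOrCreate()
import spark.implicits._

// Each record is a flattened tensor; the tensor shape travels in column metadata.
val records = Seq(Array.fill(28 * 28)(0.0f), Array.fill(28 * 28)(1.0f))
val shape = new MetadataBuilder().putLongArray("shape", Array(28L, 28L)).build()
val ds = records.toDS()
  .withColumnRenamed("value", "features")
  .withColumn("features", col("features").as("features", shape))
```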
Example of a simple MNIST dataset classifier with a fully connected neural network:

```scala
val (trainingDs, testDs) = MNIST.load(sc, trainingSize = 30000)
val model = Dense(50, Sigmoid) >> Dense(10, Softmax)
val trained = trainingDs.train(model)
  .loss(CategoricalCrossentropy)
  .using(Adam(0.01f))
  .batch(1000)
  .each(1.epochs, RecordLoss(tensorboard = true))
  .each(10.epochs, RecordAccuracy(testDs, tensorboard = true))
  .stopAfter(200.epochs)
  .run()
accuracy(trained, testDs) should be >= 0.95f
```
Here, `loss` and `accuracy` will be logged and added to `TensorBoard` as live trends. To run `TensorBoard`, execute:

```sh
pip install tensorboard
tensorboard --logdir board
```
Same, but with a CNN (convolutional neural network):

```scala
val (trainingDs, testDs) = MNIST()
val model =
  Conv2D(32, activation = ReLU()) >> Pool2D() >>
  Conv2D(64, activation = ReLU()) >> Pool2D() >>
  Flatten >> Dense(10, Softmax)
val trained = trainingDs
  .train(model)
  .loss(CategoricalCrossentropy)
  .using(Adam(0.001f))
  .batch(100)
  .each(1.epochs, RecordLoss(tensorboard = true))
  .each(1.epochs, RecordAccuracy(testDs, tensorboard = true))
  .stopAfter(3.epochs)
  .run()
accuracy(trained, testDs) should be >= 0.98f
```
LSTM layer to forecast sunspots:

```scala
val Array(train, test) = monthlySunspots(12).randomSplit(Array(0.8, 0.2), 1)
val model = LSTM(2) >> Dense(1, Tanh)
val trained = train
  .train(model)
  .loss(MeanSquaredError)
  .using(Adam())
  .batch(10)
  .each(1.epochs, RecordLoss(tensorboard = true))
  .stopAfter(100.epochs)
  .run()
RMSE(trained, test) should be < 0.2f
R2Score(trained, test) should be > 0.8f
```
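For reference, the two metrics asserted above can be computed from predictions and labels as follows (plain Scala definitions, independent of the library's `RMSE`/`R2Score` estimators):

```scala
// Root mean squared error: typical magnitude of the prediction error.
def rmse(predicted: Seq[Float], actual: Seq[Float]): Float = {
  val mse = predicted.zip(actual).map { case (p, a) => (p - a) * (p - a) }.sum / actual.length
  math.sqrt(mse).toFloat
}

// R^2 score: 1 - residual sum of squares / total sum of squares,
// i.e. the fraction of the variance explained by the model.
def r2Score(predicted: Seq[Float], actual: Seq[Float]): Float = {
  val mean  = actual.sum / actual.length
  val ssRes = predicted.zip(actual).map { case (p, a) => (p - a) * (p - a) }.sum
  val ssTot = actual.map(a => (a - mean) * (a - mean)).sum
  1 - ssRes / ssTot
}
```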
- Tensor
- DSL for computation DAG
- TF Session
- Core ops
- Math ops
- Logical ops
- String ops
- TF Functions, Placeholders, Session caching
- TensorBoard basic support
- Spark
- Hyperparameter tuning
- Model Import/Export
- SGD
- AdaGrad
- AdaDelta
- RMSProp
- Adam
- Nadam
- Adamax
- AMSGrad
- Variance/STD
- Covariance/Correlation Matrix
- Lots of other useful algorithms to analyze the dataset
- Linear Regression
- Binary Logistic Regression
- ANN (Multilayer Perceptron NN)
- Kernel regularization
- Convolutional NN
- Recurrent NN (Simple, LSTM)
- Recurrent NN Enhancements
- Add GRU Cell
- Add LSTM GPU implementation, see `tensorflow/python/keras/layers/recurrent_v2.py` line 1655
- Add RNN unroll option, see `tf.while_loop`
- Add state between batches
- Try LSTM weights fusing (4x fewer weights)
- Dropout layer (provide a random generator to layers) (Wanted!)
- Batch Normalization
- others
- Object Localization
- Region Proposals (Selective Search, EdgeBoxes, etc.)
- R-CNN
- Fast R-CNN
- Faster R-CNN
- YOLO (You only look once)
- SSD (Single-Shot MultiBox Detector)
- Sigmoid
- Tanh
- ReLU
- Softmax
- Exp
- SELU
- ELU
- Softplus
- RMSE (Root Mean Squared Error)
- Binary Crossentropy
- Categorical Crossentropy
- Sparse Categorical Crossentropy
- Boston Housing price regression dataset
- MNIST
- Fashion MNIST
- CIFAR-10
- CIFAR-100
- ILSVRC (ImageNet-1000)
- Pascal VOC
- SVD/PCA/Whitening
- Feature scalers
- Feature embedding
- Hashed features
- Crossed features
- R2 score
- Accuracy estimator
- Confusion matrix, precision, recall, F1 score
- Runtime estimation and a new stop condition based on it
- LeNet
- AlexNet
- ZF Net
- VGGNet
- GoogLeNet
- ResNet
- ...
- Create a computation-intensive operation, like `matmul` of large tensors multiple times, and compare with Scala `breeze`, Python `tensorflow`, Python `numpy`
- Compare with existing implementations using local CPU
- Compare with existing implementations using one GPU
- Compare with existing implementations using distributed mode on GCP DataProc
- While training, analyze the weight histograms to make sure the deep NN does not saturate
- Grid/Random hyperparameter search
- Different weight initializers (Xavier)
- Decay learning rate over time (step, exponential, 1/t decay)
- Try using it in an interactive notebook
- Add a graph library so we could plot charts and publish them in `tensorboard` or a notebook (maybe fork and upgrade `vegas` to Scala 2.12, or try `evil-plot`)
- Redefine the way we train a model on a dataset and make a prediction. We should cover 2 cases: big data with `spark`, which can train and predict on large datasets, and single (batch) prediction without the `spark` dependency (to be able to expose the model via an API or use it in real time). For that we need to:
  - separate the project into `core` + `spark` modules
  - implement model weights export/import
  - implement feature preprocessing; for the training use case try using `MLlib`, yet we need to figure out how to transform features via a regular function without `spark` involved
  - integrating with `MLlib` might require redefining the `Dataset[Record[A]]` we have right now; probably better to use any abstract dataset which contains 2 required columns, `features` + `labels`, for training and `features` for prediction
- Add a DSL to build tensor requirements, like `tensor require rank(4)`, `tensor require shape squaredMatrix` (see the first sketch after this list)
- We might need to define a high-level untyped trait `Node` which the `Expr[A]` trait will extend. Such a `Node` will have a defined compiler; to make it an `Expr` we would need to choose an output and assign a type (see the second sketch below).
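A hypothetical sketch of what such a requirements DSL could look like; every name below (`Requirement`, `rank`, `squaredMatrix`, the `require` extension) is made up for illustration and is not the library's API:

```scala
object requirements {
  // A requirement is a predicate over a tensor shape.
  trait Requirement { def check(shape: List[Int]): Boolean }

  case class rank(n: Int) extends Requirement {
    def check(shape: List[Int]): Boolean = shape.length == n
  }

  case object squaredMatrix extends Requirement {
    def check(shape: List[Int]): Boolean =
      shape.length == 2 && shape.head == shape(1)
  }

  // Enables the infix form: someShape require rank(4)
  implicit class RequireOps(shape: List[Int]) {
    def require(r: Requirement): Unit =
      assert(r.check(shape), s"shape $shape failed requirement")
  }
}

import requirements._
List(4, 4) require squaredMatrix // passes; List(4, 2) would fail the assertion
```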
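And a rough sketch of the `Node`/`Expr` idea itself; the signatures here are speculative:

```scala
// Untyped node of the computation DAG; every node knows how to compile itself.
trait Node {
  def inputs: List[Node]
  def compile(): Any
}

// A typed expression is a Node with a chosen output and an assigned type A.
trait Expr[A] extends Node {
  def output: Node
}
```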
If you want to become a contributor, you are welcome!!! You can pick anything from the Road Map or propose your own idea.
Please, contact: