Release-3.0.0
Pre-release
Starting from version 3.0, Angel will maintain two separate version series: 2.X and 3.X.
We have added the hotfix-2, master-2 and develop-2 branches for the 2.X series.
Angel has evolved from a single model training system into a comprehensive computing platform that covers all phases of machine learning: data preprocessing, model training, model serving, automatic hyper-parameter tuning, and automatic feature engineering. Based on the Angel PS service, we built Angel's ecosystem: sona (Spark On Angel) and PyTorch On Angel. Our algorithms cover basic machine learning algorithms, deep learning algorithms, graph algorithms and GNN algorithms. To make the project structure clearer, we split the original project into 8 sub-projects:
angel: Angel's core layer, which provides the parameter server (PS) functionality. It can also be used on its own to train models.
PyTorch-On-Angel: A lightweight, high-performance distributed PyTorch computing platform based on Angel PS. It uses Angel's PS to support high-dimensional models and uses Spark as the PyTorch scheduling platform. It is easy to use because data preprocessing (using Spark) and model training (using PyTorch) can be completed together in one job. As when developing algorithms on PyTorch, users can simply use Python to design new algorithms on the PyTorch On Angel platform. We have implemented a variety of algorithms in PyTorch On Angel: LR, FM, DeepFM, Wide & Deep, xDeepFM, GCN, GraphSage, etc. These exhibit higher performance (5x~10x) than their counterparts on Angel and sona. We strongly recommend the PyTorch On Angel platform if performance is your main concern.
sona: A generic computing platform based on Angel PS and Spark that uses Angel PS to break through Spark's bottleneck in training high-dimensional models. In the new version of sona, we have done a lot of work to integrate feature engineering and model training more closely. We have also reconstructed LINE (as LINE V2) and K-Core in this version, greatly improving their performance and stability.
serving: Angel's model serving platform, which can serve models produced not only by Angel, Spark On Angel and PyTorch On Angel, but also by other platforms such as Spark and XGBoost.
automl: A generic automatic machine learning component that includes automatic tuning and automatic feature engineering.
mlcore: Angel's independently developed lightweight computational graph framework. Users can easily implement new algorithms on it.
math2: Angel's independently developed high-performance math library, which is heavily optimized for large sparse vectors.
format: Angel's model format interface definition. Angel uses an open model format, enabling users to customize the needed format by implementing the model format interface.
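The idea behind the format sub-project, a pluggable model format defined by an interface, can be illustrated with a minimal Python sketch. This is not Angel's actual API: the names `ModelFormat`, `TextColonFormat` and `round_trip` are hypothetical, and the "index:value" text layout is only an assumption chosen to keep the example short.

```python
import io
from abc import ABC, abstractmethod
from typing import BinaryIO, Dict


class ModelFormat(ABC):
    """Hypothetical format interface (illustrative only, not Angel's real API)."""

    @abstractmethod
    def save(self, row: Dict[int, float], out: BinaryIO) -> None:
        """Write one sparse model row to the output stream."""

    @abstractmethod
    def load(self, src: BinaryIO) -> Dict[int, float]:
        """Read one sparse model row back from the input stream."""


class TextColonFormat(ModelFormat):
    """Custom format: each non-zero entry is stored as an 'index:value' text line."""

    def save(self, row: Dict[int, float], out: BinaryIO) -> None:
        for index in sorted(row):
            # repr() keeps enough digits for the float to round-trip exactly.
            out.write(f"{index}:{row[index]!r}\n".encode("utf-8"))

    def load(self, src: BinaryIO) -> Dict[int, float]:
        row: Dict[int, float] = {}
        for line in src.read().decode("utf-8").splitlines():
            index, value = line.split(":", 1)
            row[int(index)] = float(value)
        return row


def round_trip(fmt: ModelFormat, row: Dict[int, float]) -> Dict[int, float]:
    """Save a row with the given format, then load it back from memory."""
    buffer = io.BytesIO()
    fmt.save(row, buffer)
    buffer.seek(0)
    return fmt.load(buffer)
```

In this sketch, code that saves or loads a model depends only on the `ModelFormat` interface, so a different layout (binary, JSON, and so on) could be swapped in by writing another subclass.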
New features
- [ISSUE-348] Support Kubernetes
- [ISSUE-845] PyTorch On Angel
- [ISSUE-678] Support GCN/GraphSage in PyTorch On Angel
- [ISSUE-846] sona reconstruct
- [ISSUE-790] automl
- [ISSUE-835] Add a new LINE implementation (LINE V2) in sona that has better performance and runs more stably
- [ISSUE-836] Reconstruct K-Core algorithm in sona