Release-1.2.1

@paynie paynie released this 26 Oct 12:17
· 155 commits to branch-1.2.1 since this release

Release-1.2.1

Angel 1.2.1 introduces a new model output format with loading and conversion tools, optimizes the algorithm library, and provides a Logistic Regression (LR) implementation with a configurable model format. The Spark on Angel interface has been further refactored and improved, and a Spark on Angel version of the GBDT algorithm has been added.

Angel Core

  • Refactored the model format to reduce the number of output files
  • Models are now loaded and exported concurrently
  • Added new tools for model loading and format conversion
  • Improved the performance of sparse matrix computation

Angel MLlib

  • LR: the model format (dense or sparse) can now be selected with a configuration parameter
  • GBDT: improved performance when the number of trees is large; added psFuncs for two-stage splitting and low-precision compression; fixed indexing and parameter initialization problems in feature sampling
  • LDA: the model is now updated via PSF; improved memory usage; added a variant of the WarpLDA algorithm
  • The GradientDescent/Loss interface is now generic and supports three model formats: dense double, sparse double, and sparse double with longkey

Spark on Angel

  • Improved interfaces
    • Separated PSClient into Initializer, VectorOps and MatrixOps
    • Improved BreezePSVector and CachePSVector
  • Added GBDT

Incompatible Changes

  • IMPORTANT: PSModel no longer takes a generic type parameter in its declaration; the type is now set via setRowType
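
The pattern behind this change can be sketched in a few lines. The block below is only a self-contained illustration, not Angel's actual API: apart from the method name setRowType, every name in it (RowTypeSketch, ModelSketch, the RowType constants, the dense default) is a hypothetical stand-in. It shows a model descriptor whose class carries no type parameter and whose storage format is chosen at runtime through a chained setter.

```java
public class RowTypeSketch {
    // Hypothetical enum mirroring the three model formats named in these notes.
    enum RowType { DENSE_DOUBLE, SPARSE_DOUBLE, SPARSE_DOUBLE_LONGKEY }

    // Hypothetical model descriptor (not Angel's PSModel): the class has no
    // type parameter, and the row type is chosen with a fluent setter.
    static final class ModelSketch {
        private final String name;
        private final int rows;
        private final long cols;
        private RowType rowType = RowType.DENSE_DOUBLE; // assumed default

        ModelSketch(String name, int rows, long cols) {
            this.name = name;
            this.rows = rows;
            this.cols = cols;
        }

        // Returns this so the call can be chained onto the constructor.
        ModelSketch setRowType(RowType rowType) {
            this.rowType = rowType;
            return this;
        }

        RowType getRowType() {
            return rowType;
        }

        String describe() {
            return name + " [" + rows + " x " + cols + ", " + rowType + "]";
        }
    }

    public static void main(String[] args) {
        // The format is picked at runtime rather than fixed in the declaration.
        ModelSketch weight = new ModelSketch("lr_weight", 1, 1000L)
                .setRowType(RowType.SPARSE_DOUBLE);
        System.out.println(weight.describe());
    }
}
```

Under this kind of design, code that previously fixed the vector type in the declaration would instead pick it from configuration at runtime. Consult the Angel documentation for the real PSModel signature and the available row types.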

Documentation

  • Added documentation for the auxiliary tools: metrics usage and model loading/conversion
  • Continued internationalization of the documentation
  • Updated the documentation for Spark on Angel and some of the algorithms

~~~ Acknowledgement ~~~

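The Angel 1.2.1 release again received help from contributors around the world. We thank the following developers for their contributions to this release:

  1. shunanzhang: continued high-quality documentation translation
  2. chris: WarpLDA implementation
  3. cstur4: model loading optimization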

We are also deeply grateful for the helpful feedback and suggestions from the many company users in the Angel QQ group.