Intel® Optimization for Horovod* is a distributed training framework for TensorFlow*. It aims to make distributed deep learning workloads faster and easier to run on Intel GPU devices. It is based on the latest release of public Horovod, v0.28.1.
- Intel® Data Center GPU Max Series, Driver Version: 803
Software | Installation requirement |
---|---|
Intel® oneAPI Base Toolkit | Install Intel® oneAPI Base Toolkit |
TensorFlow | Install tensorflow 2.15.1 |
Intel® Extension for TensorFlow* | Install Intel® Extension for TensorFlow* |
System | Ubuntu 22.04, SUSE Linux Enterprise Server (SLES) 15 SP3/SP4 |
Python | 3.9-3.11 |
Pip | 19.0 or later (requires manylinux2014 support) |
OS | Intel GPU | Install Intel GPU Driver |
---|---|---|
Ubuntu 22.04, RedHat 8.6, SLES 15 SP3/SP4 | Intel® Data Center GPU Max Series | Refer to the Installation Guides for the latest driver installation. To install the verified driver version 803 for Intel® Data Center GPU Max Series/Intel® Data Center GPU Flex Series, append the specific version after each component. |
Intel® Optimization for Horovod* can be installed through the following channels:
PyPI | Source |
---|---|
Install from pip | Build from source |
Intel® Optimization for Horovod* can be installed with different frameworks. You can choose Intel® Extension for TensorFlow* as the dependency.
- Install Intel® Extension for TensorFlow* and Intel® Optimization for Horovod* with the following commands:

      pip install tensorflow==2.15.1
      pip install --upgrade intel-extension-for-tensorflow[xpu]
      pip install intel-optimization-for-horovod
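
After installation, it can help to confirm that pip actually registered the three packages. The snippet below is a minimal sketch that uses only the Python standard library; the helper name `installed_versions` is hypothetical, and the distribution names are taken from the pip commands above.

```python
from importlib import metadata

# Distribution names as they appear in the pip commands above.
PACKAGES = (
    "tensorflow",
    "intel-extension-for-tensorflow",
    "intel-optimization-for-horovod",
)


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

A `NOT INSTALLED` entry suggests that the corresponding pip command failed or ran in a different Python environment.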
The example commands below show how to run distributed training.
- To run on a machine with 2 Intel GPUs, which have 4 tiles in total:
horovodrun -np 4 python train.py
- To run on 4 machines with 2 GPUs (4 tiles) each:
horovodrun -np 16 -H server1:4,server2:4,server3:4,server4:4 python train.py
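
In both commands, `-np` is the total number of processes (one per tile), and each `-H` entry has the form `host:slots`. The sketch below shows that arithmetic by assembling the command line from a host list; the helper `horovodrun_command` is hypothetical and only builds the argument list, it does not launch anything.

```python
import shlex


def horovodrun_command(hosts, tiles_per_host, script="train.py"):
    """Build a horovodrun argument list with one process per tile."""
    if not hosts:
        raise ValueError("at least one host is required")
    if tiles_per_host < 1:
        raise ValueError("tiles_per_host must be positive")
    total_processes = len(hosts) * tiles_per_host
    host_spec = ",".join(f"{host}:{tiles_per_host}" for host in hosts)
    return ["horovodrun", "-np", str(total_processes),
            "-H", host_spec, "python", script]


# 4 machines x 4 tiles each = 16 processes, as in the example above.
cmd = horovodrun_command(["server1", "server2", "server3", "server4"], 4)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` if the launch should happen from Python rather than from a shell.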
It is easy to train models with Intel® Extension for TensorFlow*. Refer to the TensorFlow examples for more details.
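
For orientation, a `train.py` launched by `horovodrun` might look roughly like the sketch below. It is not taken from the official examples: the model, the synthetic data, and the `scaled_learning_rate` helper are hypothetical, and it assumes that Intel® Extension for TensorFlow* exposes each tile as a separate `XPU` device and that the public Horovod Keras API (`horovod.tensorflow.keras`) is available unchanged.

```python
def scaled_learning_rate(base_lr, world_size):
    """Scale the learning rate linearly with the number of workers."""
    return base_lr * world_size


def main():
    try:
        import numpy as np
        import tensorflow as tf
        import horovod.tensorflow.keras as hvd
    except ImportError as exc:
        print(f"Missing dependency, see the installation steps: {exc}")
        return 1

    hvd.init()  # horovodrun starts one such process per tile

    # Assumption: each tile shows up as one "XPU" device, so every
    # process pins itself to the device matching its local rank.
    xpus = tf.config.list_physical_devices("XPU")
    if hvd.local_rank() < len(xpus):
        tf.config.set_visible_devices(xpus[hvd.local_rank()], "XPU")

    # Synthetic data stands in for a real dataset; a real job would
    # shard the input by hvd.rank() and hvd.size().
    x = np.random.rand(1024, 32).astype("float32")
    y = np.random.randint(0, 10, size=(1024,))

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
        tf.keras.layers.Dense(10),
    ])

    opt = tf.keras.optimizers.SGD(scaled_learning_rate(0.01, hvd.size()))
    opt = hvd.DistributedOptimizer(opt)  # averages gradients across workers

    model.compile(
        optimizer=opt,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

    # Broadcast rank 0's initial weights so all workers start identically.
    callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]
    model.fit(x, y, batch_size=64, epochs=1, callbacks=callbacks,
              verbose=1 if hvd.rank() == 0 else 0)
    return 0


if __name__ == "__main__":
    main()
```

Scaling the learning rate with the worker count is a common convention for data-parallel training, not a requirement of Horovod; adjust it to suit the model.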