You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For computing gradient for a circuit with expectation values of a Hamiltonian object, Lightning implements OpenMP parallelized function that distributes Hamiltonian terms to threads:
However, the Util::scaleAndAdd function calls OpenBLAS's cblas_caxpy or cblas_zaxpy when compiled with OpenBLAS, which is the case for the PyPI-provided wheels. As these functions are parallelized internally by OpenBLAS, turning off the internal parallelization of OpenBLAS might be necessary to prevent threads oversubscription (or vice versa).
Edit: Indeed, it's subtle. Locally, I found that turning off OpenBLAS parallelism is better performing, but the opposite in Perlmutter.
The text was updated successfully, but these errors were encountered:
Yea, this was always a tough problem. It depends on CPU model, observable type, and even circuit type.
We could try updating the OpenMP scheduling here in the section, and seeing if it works nicely.
for defaults though, at least until we better understand the path we need, I'd say favouring large-scale CPUs (HPC systems, AWS Braket servers) would be better as the default.
For computing gradient for a circuit with expectation values of a Hamiltonian object, Lightning implements OpenMP parallelized function that distributes Hamiltonian terms to threads:
pennylane-lightning/pennylane_lightning/src/simulator/Observables.hpp
Lines 309 to 347 in 58c9e1c
However, the
Util::scaleAndAdd
function calls OpenBLAS'scblas_caxpy
orcblas_zaxpy
when compiled with OpenBLAS, which is the case for the PyPI-provided wheels. As these functions are parallelized internally by OpenBLAS, turning off the internal parallelization of OpenBLAS might be necessary to prevent threads oversubscription (or vice versa).Edit: Indeed, it's subtle. Locally, I found that turning off OpenBLAS parallelism is better performing, but the opposite in Perlmutter.
The text was updated successfully, but these errors were encountered: