From 250c9076795bedce0cf359d6aaaf049a4d450d4d Mon Sep 17 00:00:00 2001 From: Jinzhe Zeng Date: Sun, 22 Dec 2024 01:46:15 -0500 Subject: [PATCH 1/4] docs: update deepmd-gnn URL (#4482) ## Summary by CodeRabbit - **Documentation** - Updated guidelines for creating and integrating new models in the DeePMD-kit framework. - Added new sections on descriptors, fitting networks, and model requirements. - Enhanced unit testing section with instructions for regression tests. - Updated URL for the DeePMD-GNN plugin to reflect new repository location. Signed-off-by: Jinzhe Zeng --- doc/development/create-a-model-pt.md | 2 +- doc/third-party/out-of-deepmd-kit.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/development/create-a-model-pt.md b/doc/development/create-a-model-pt.md index 08528cc5f6..7eb75b7026 100644 --- a/doc/development/create-a-model-pt.md +++ b/doc/development/create-a-model-pt.md @@ -180,7 +180,7 @@ The arguments here should be consistent with the class arguments of your new com ## Package new codes You may package new codes into a new Python package if you don't want to contribute it to the main DeePMD-kit repository. -A good example is [DeePMD-GNN](https://github.com/njzjz/deepmd-gnn). +A good example is [DeePMD-GNN](https://gitlab.com/RutgersLBSR/deepmd-gnn). It's crucial to add your new component to `project.entry-points."deepmd.pt"` in `pyproject.toml`: ```toml diff --git a/doc/third-party/out-of-deepmd-kit.md b/doc/third-party/out-of-deepmd-kit.md index 12ae5842c7..a04ba9741b 100644 --- a/doc/third-party/out-of-deepmd-kit.md +++ b/doc/third-party/out-of-deepmd-kit.md @@ -6,7 +6,7 @@ The codes of the following interfaces are not a part of the DeePMD-kit package a ### External GNN models (MACE/NequIP) -[DeePMD-GNN](https://github.com/njzjz/deepmd-gnn) is DeePMD-kit plugin for various graph neural network (GNN) models. +[DeePMD-GNN](https://gitlab.com/RutgersLBSR/deepmd-gnn) is DeePMD-kit plugin for various graph neural network (GNN) models. It has interfaced with [MACE](https://github.com/ACEsuit/mace) (PyTorch version) and [NequIP](https://github.com/mir-group/nequip) (PyTorch version). It is also the first example to the DeePMD-kit [plugin mechanism](../development/create-a-model-pt.md#package-new-codes). From deaeec9c9b1f51c9b306724eb6e8d195755ac8dd Mon Sep 17 00:00:00 2001 From: Jinzhe Zeng Date: Sun, 22 Dec 2024 01:46:47 -0500 Subject: [PATCH 2/4] docs: update DPA-2 citation (#4483) ## Summary by CodeRabbit - **New Features** - Updated references in the bibliography for the DPA-2 model to include a new article entry for 2024. - Added a new reference for an attention-based descriptor. - **Bug Fixes** - Corrected reference links in documentation to point to updated DOI links instead of arXiv. - **Documentation** - Revised entries in the credits and model documentation to reflect the latest citations and details. - Enhanced clarity and detail in fine-tuning documentation for TensorFlow and PyTorch implementations. 
--------- Signed-off-by: Jinzhe Zeng --- CITATIONS.bib | 32 +++++++++++++++--------------- deepmd/dpmodel/descriptor/dpa2.py | 7 ++++++- deepmd/pt/model/descriptor/dpa2.py | 7 ++++++- doc/credits.rst | 2 +- doc/model/dpa2.md | 2 +- doc/train/finetuning.md | 2 +- doc/train/multi-task-training.md | 2 +- 7 files changed, 32 insertions(+), 22 deletions(-) diff --git a/CITATIONS.bib b/CITATIONS.bib index d5524a14f6..52c8045bf3 100644 --- a/CITATIONS.bib +++ b/CITATIONS.bib @@ -128,26 +128,26 @@ @article{Zhang_NpjComputMater_2024_v10_p94 doi = {10.1038/s41524-024-01278-7}, } -@misc{Zhang_2023_DPA2, +@article{Zhang_npjComputMater_2024_v10_p293, annote = {DPA-2}, author = { Duo Zhang and Xinzijian Liu and Xiangyu Zhang and Chengqian Zhang and Chun - Cai and Hangrui Bi and Yiming Du and Xuejian Qin and Jiameng Huang and - Bowen Li and Yifan Shan and Jinzhe Zeng and Yuzhi Zhang and Siyuan Liu and - Yifan Li and Junhan Chang and Xinyan Wang and Shuo Zhou and Jianchuan Liu - and Xiaoshan Luo and Zhenyu Wang and Wanrun Jiang and Jing Wu and Yudi Yang - and Jiyuan Yang and Manyi Yang and Fu-Qiang Gong and Linshuang Zhang and - Mengchao Shi and Fu-Zhi Dai and Darrin M. York and Shi Liu and Tong Zhu and - Zhicheng Zhong and Jian Lv and Jun Cheng and Weile Jia and Mohan Chen and - Guolin Ke and Weinan E and Linfeng Zhang and Han Wang + Cai and Hangrui Bi and Yiming Du and Xuejian Qin and Anyang Peng and + Jiameng Huang and Bowen Li and Yifan Shan and Jinzhe Zeng and Yuzhi Zhang + and Siyuan Liu and Yifan Li and Junhan Chang and Xinyan Wang and Shuo Zhou + and Jianchuan Liu and Xiaoshan Luo and Zhenyu Wang and Wanrun Jiang and + Jing Wu and Yudi Yang and Jiyuan Yang and Manyi Yang and Fu-Qiang Gong and + Linshuang Zhang and Mengchao Shi and Fu-Zhi Dai and Darrin M. York and Shi + Liu and Tong Zhu and Zhicheng Zhong and Jian Lv and Jun Cheng and Weile Jia + and Mohan Chen and Guolin Ke and Weinan E and Linfeng Zhang and Han Wang }, - title = { - {DPA-2: Towards a universal large atomic model for molecular and material - simulation} - }, - publisher = {arXiv}, - year = 2023, - doi = {10.48550/arXiv.2312.15492}, + title = {{DPA-2: a large atomic model as a multi-task learner}}, + journal = {npj Comput. Mater}, + year = 2024, + volume = 10, + number = 1, + pages = 293, + doi = {10.1038/s41524-024-01493-2}, } @article{Zhang_PhysPlasmas_2020_v27_p122704, diff --git a/deepmd/dpmodel/descriptor/dpa2.py b/deepmd/dpmodel/descriptor/dpa2.py index e4cadb7b36..55ae331593 100644 --- a/deepmd/dpmodel/descriptor/dpa2.py +++ b/deepmd/dpmodel/descriptor/dpa2.py @@ -387,7 +387,7 @@ def __init__( use_tebd_bias: bool = False, type_map: Optional[list[str]] = None, ) -> None: - r"""The DPA-2 descriptor. see https://arxiv.org/abs/2312.15492. + r"""The DPA-2 descriptor[1]_. Parameters ---------- @@ -434,6 +434,11 @@ def __init__( sw: torch.Tensor The switch function for decaying inverse distance. + References + ---------- + .. [1] Zhang, D., Liu, X., Zhang, X. et al. DPA-2: a + large atomic model as a multi-task learner. npj + Comput Mater 10, 293 (2024). https://doi.org/10.1038/s41524-024-01493-2 """ def init_subclass_params(sub_data, sub_class): diff --git a/deepmd/pt/model/descriptor/dpa2.py b/deepmd/pt/model/descriptor/dpa2.py index c8e430960b..f086a346b6 100644 --- a/deepmd/pt/model/descriptor/dpa2.py +++ b/deepmd/pt/model/descriptor/dpa2.py @@ -100,7 +100,7 @@ def __init__( use_tebd_bias: bool = False, type_map: Optional[list[str]] = None, ) -> None: - r"""The DPA-2 descriptor. see https://arxiv.org/abs/2312.15492. 
+ r"""The DPA-2 descriptor[1]_. Parameters ---------- @@ -147,6 +147,11 @@ def __init__( sw: torch.Tensor The switch function for decaying inverse distance. + References + ---------- + .. [1] Zhang, D., Liu, X., Zhang, X. et al. DPA-2: a + large atomic model as a multi-task learner. npj + Comput Mater 10, 293 (2024). https://doi.org/10.1038/s41524-024-01493-2 """ super().__init__() diff --git a/doc/credits.rst b/doc/credits.rst index 1b39dc1e0e..059746ee0b 100644 --- a/doc/credits.rst +++ b/doc/credits.rst @@ -54,7 +54,7 @@ Cite DeePMD-kit and methods .. bibliography:: :filter: False - Zhang_2023_DPA2 + Zhang_npjComputMater_2024_v10_p293 - If frame-specific parameters (`fparam`, e.g. electronic temperature) is used, diff --git a/doc/model/dpa2.md b/doc/model/dpa2.md index eb641d6b01..300876bf05 100644 --- a/doc/model/dpa2.md +++ b/doc/model/dpa2.md @@ -4,7 +4,7 @@ **Supported backends**: PyTorch {{ pytorch_icon }}, JAX {{ jax_icon }}, DP {{ dpmodel_icon }} ::: -The DPA-2 model implementation. See https://arxiv.org/abs/2312.15492 for more details. +The DPA-2 model implementation. See https://doi.org/10.1038/s41524-024-01493-2 for more details. Training example: `examples/water/dpa2/input_torch_medium.json`, see [README](../../examples/water/dpa2/README.md) for inputs in different levels. diff --git a/doc/train/finetuning.md b/doc/train/finetuning.md index cf2f5fde4f..04d86cfc98 100644 --- a/doc/train/finetuning.md +++ b/doc/train/finetuning.md @@ -94,7 +94,7 @@ The model section will be overwritten (except the `type_map` subsection) by that #### Fine-tuning from a multi-task pre-trained model -Additionally, within the PyTorch implementation and leveraging the flexibility offered by the framework and the multi-task training process proposed in DPA2 [paper](https://arxiv.org/abs/2312.15492), +Additionally, within the PyTorch implementation and leveraging the flexibility offered by the framework and the multi-task training process proposed in DPA2 [paper](https://doi.org/10.1038/s41524-024-01493-2), we also support more general multitask pre-trained models, which includes multiple datasets for pre-training. These pre-training datasets share a common descriptor while maintaining their individual fitting nets, as detailed in the paper above. diff --git a/doc/train/multi-task-training.md b/doc/train/multi-task-training.md index 51dffcc5f5..16f6c0e05c 100644 --- a/doc/train/multi-task-training.md +++ b/doc/train/multi-task-training.md @@ -26,7 +26,7 @@ and the Adam optimizer is executed to minimize $L^{(t)}$ for one step to update In the case of multi-GPU parallel training, different GPUs will independently select their tasks. In the DPA-2 model, this multi-task training framework is adopted.[^1] -[^1]: Duo Zhang, Xinzijian Liu, Xiangyu Zhang, Chengqian Zhang, Chun Cai, Hangrui Bi, Yiming Du, Xuejian Qin, Jiameng Huang, Bowen Li, Yifan Shan, Jinzhe Zeng, Yuzhi Zhang, Siyuan Liu, Yifan Li, Junhan Chang, Xinyan Wang, Shuo Zhou, Jianchuan Liu, Xiaoshan Luo, Zhenyu Wang, Wanrun Jiang, Jing Wu, Yudi Yang, Jiyuan Yang, Manyi Yang, Fu-Qiang Gong, Linshuang Zhang, Mengchao Shi, Fu-Zhi Dai, Darrin M. York, Shi Liu, Tong Zhu, Zhicheng Zhong, Jian Lv, Jun Cheng, Weile Jia, Mohan Chen, Guolin Ke, Weinan E, Linfeng Zhang, Han Wang, [arXiv preprint arXiv:2312.15492 (2023)](https://arxiv.org/abs/2312.15492) licensed under a [Creative Commons Attribution (CC BY) license](http://creativecommons.org/licenses/by/4.0/). 
+[^1]: Duo Zhang, Xinzijian Liu, Xiangyu Zhang, Chengqian Zhang, Chun Cai, Hangrui Bi, Yiming Du, Xuejian Qin, Anyang Peng, Jiameng Huang, Bowen Li, Yifan Shan, Jinzhe Zeng, Yuzhi Zhang, Siyuan Liu, Yifan Li, Junhan Chang, Xinyan Wang, Shuo Zhou, Jianchuan Liu, Xiaoshan Luo, Zhenyu Wang, Wanrun Jiang, Jing Wu, Yudi Yang, Jiyuan Yang, Manyi Yang, Fu-Qiang Gong, Linshuang Zhang, Mengchao Shi, Fu-Zhi Dai, Darrin M. York, Shi Liu, Tong Zhu, Zhicheng Zhong, Jian Lv, Jun Cheng, Weile Jia, Mohan Chen, Guolin Ke, Weinan E, Linfeng Zhang, Han Wang, DPA-2: a large atomic model as a multi-task learner. npj Comput Mater 10, 293 (2024). [DOI: 10.1038/s41524-024-01493-2](https://doi.org/10.1038/s41524-024-01493-2) licensed under a [Creative Commons Attribution (CC BY) license](http://creativecommons.org/licenses/by/4.0/). Compared with the previous TensorFlow implementation, the new support in PyTorch is more flexible and efficient. In particular, it makes multi-GPU parallel training and even tasks beyond DFT possible, From 2525ab2a4ea0097baec842055f713eceddcb01af Mon Sep 17 00:00:00 2001 From: Jinzhe Zeng Date: Sun, 22 Dec 2024 01:47:09 -0500 Subject: [PATCH 3/4] docs: fix a minor typo on the title of `install-from-c-library.md` (#4484) ## Summary by CodeRabbit - **Documentation** - Updated formatting of the installation guide for the pre-compiled C library. - Icons for TensorFlow and JAX are now displayed together in the header. - Retained all installation instructions and compatibility notes. Signed-off-by: Jinzhe Zeng --- doc/install/install-from-c-library.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/install/install-from-c-library.md b/doc/install/install-from-c-library.md index d408fb1b67..806be51ca9 100644 --- a/doc/install/install-from-c-library.md +++ b/doc/install/install-from-c-library.md @@ -1,4 +1,4 @@ -# Install from pre-compiled C library {{ tensorflow_icon }}, JAX {{ jax_icon }} +# Install from pre-compiled C library {{ tensorflow_icon }} {{ jax_icon }} :::{note} **Supported backends**: TensorFlow {{ tensorflow_icon }}, JAX {{ jax_icon }} From cfe17a3e3e2fd198a42d9591d203bd2975c72824 Mon Sep 17 00:00:00 2001 From: Jinzhe Zeng Date: Sun, 22 Dec 2024 01:47:38 -0500 Subject: [PATCH 4/4] fix: print dlerror if dlopen fails (#4485) xref: https://github.com/njzjz/deepmd-gnn/issues/44 ## Summary by CodeRabbit - **New Features** - Enhanced error messages for library loading failures on non-Windows platforms. - Updated thread management environment variable checks for improved compatibility. - Added support for mixed types in tensor input handling, allowing for more flexible configurations. - **Bug Fixes** - Improved error reporting for dynamic library loading issues. --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> --- source/api_cc/src/common.cc | 8 +++++++- source/lib/src/gpu/cudart/cudart_stub.cc | 4 ++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/source/api_cc/src/common.cc b/source/api_cc/src/common.cc index c51ae9a8b4..d3cad083bd 100644 --- a/source/api_cc/src/common.cc +++ b/source/api_cc/src/common.cc @@ -390,7 +390,13 @@ static inline void _load_library_path(std::string dso_path) { if (!dso_handle) { throw deepmd::deepmd_exception( dso_path + - " is not found! You can add the library directory to LD_LIBRARY_PATH"); + " is not found or fails to load! You can add the library directory to " + "LD_LIBRARY_PATH." 
+#ifndef _WIN32 + " Error message: " + + std::string(dlerror()) +#endif + ); } } diff --git a/source/lib/src/gpu/cudart/cudart_stub.cc b/source/lib/src/gpu/cudart/cudart_stub.cc index 8083a0a89d..cfbabd6f5e 100644 --- a/source/lib/src/gpu/cudart/cudart_stub.cc +++ b/source/lib/src/gpu/cudart/cudart_stub.cc @@ -25,6 +25,10 @@ void *DP_cudart_dlopen(char *libname) { #endif if (!dso_handle) { std::cerr << "DeePMD-kit: Cannot find " << libname << std::endl; +#ifndef _WIN32 + std::cerr << "DeePMD-kit: Error message: " << std::string(dlerror()) + << std::endl; +#endif return nullptr; } std::cerr << "DeePMD-kit: Successfully load " << libname << std::endl;
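
The two hunks in PATCH 4/4 follow the same pattern: on non-Windows platforms, append the output of `dlerror()` to the failure message so users see *why* a shared library failed to load, not merely that it did. Below is a minimal standalone sketch of that pattern, not part of the patches themselves: the function name, the `RTLD_NOW | RTLD_LOCAL` flags, and the library name are illustrative assumptions, and `std::runtime_error` stands in for DeePMD-kit's own `deepmd::deepmd_exception`.

```cpp
// Sketch of the "report dlerror() on dlopen failure" pattern from PATCH 4/4.
// Assumptions: illustrative function/library names; std::runtime_error used
// in place of deepmd::deepmd_exception; flags chosen for the example only.
#include <iostream>
#include <stdexcept>
#include <string>

#ifndef _WIN32
#include <dlfcn.h>
#endif

void load_or_report(const std::string &dso_path) {
#ifndef _WIN32
  // Flags are illustrative; the real loader may use different RTLD_* options.
  void *handle = dlopen(dso_path.c_str(), RTLD_NOW | RTLD_LOCAL);
  if (!handle) {
    // dlerror() describes the most recent dlopen/dlsym failure (missing
    // symbols, ABI mismatch, unresolved dependencies), which is far more
    // actionable than a bare "not found" message.
    const char *err = dlerror();
    throw std::runtime_error(dso_path + " is not found or fails to load! " +
                             "Error message: " +
                             (err ? err : "unknown error"));
  }
#else
  // On Windows one would use LoadLibrary/GetLastError instead; omitted here.
  (void)dso_path;
#endif
}

int main() {
  try {
    load_or_report("libdeepmd_hypothetical.so");  // illustrative name only
  } catch (const std::exception &e) {
    std::cerr << e.what() << std::endl;
  }
  return 0;
}
```

Note that `dlerror()` should be read immediately after the failing `dlopen()` call, as in the hunks above: the message is cleared once read and is overwritten by any later failure in the dl* family.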