move modeling.py and modeling_nv.py to transformers #9676

Li-Z-Q · 2024-12-23T11:26:20Z

move modeling.py and modeling_nv.py to transformers

paddle-bot · 2024-12-23T11:26:25Z

Thanks for your contribution!

codecov · 2024-12-23T11:59:28Z

Codecov Report

Attention: Patch coverage is 17.38149% with 366 lines in your changes missing coverage. Please review.

Project coverage is 52.62%. Comparing base (97ae9ad) to head (86a05c3).
Report is 11 commits behind head on develop.

Files with missing lines	Patch %	Lines
paddlenlp/transformers/nv_embed/modeling.py	15.18%	229 Missing ⚠️
paddlenlp/transformers/llm_embed/modeling.py	18.93%	137 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #9676      +/-   ##
===========================================
+ Coverage    52.00%   52.62%   +0.62%     
===========================================
  Files          721      722       +1     
  Lines       116703   112813    -3890     
===========================================
- Hits         60690    59373    -1317     
+ Misses       56013    53440    -2573

DrownFish19 · 2024-12-24T02:31:22Z

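Lint problems can be fixed by installing pre-commit and formatting the code. Steps for reference:

# Install
pip install pre-commit

# Register pre-commit in the project folder; the code is then formatted on every commit
pre-commit install

# Format previously committed files individually
pre-commit run --files XXXX.py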

DrownFish19 · 2024-12-24T02:07:16Z

paddlenlp/transformers/llm_embed/__init__.py

@@ -0,0 +1,13 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.


Is the copyright here correct?

DrownFish19 · 2024-12-24T02:15:58Z

paddlenlp/transformers/llm_embed/__init__.py

+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.


Add the following here: from .modeling import *

DrownFish19 · 2024-12-24T02:26:20Z

paddlenlp/transformers/llm_embed/modeling.py

+    PretrainedModel,
+)
+from paddlenlp.transformers.model_outputs import ModelOutput
+from paddlenlp.utils.log import logger


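The imports here need to be changed to relative imports.
For example:

from paddlenlp.transformers.model_outputs import ModelOutput

becomes

from ..transformers.model_outputs import ModelOutput

Changing it to from ..transformers.model_outputs import ModelOutput raises an error, so I have left it unchanged for now.

It should be from ..model_outputs import ModelOutput. Sorry, I wrote it wrong earlier.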
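Why the first suggestion raised an error can be checked with the standard library alone. This is a minimal sketch, assuming the module sits at paddlenlp/transformers/llm_embed/modeling.py as shown in the path above, so its containing package is paddlenlp.transformers.llm_embed. importlib.util.resolve_name only computes the absolute module name; it does not import anything, so PaddleNLP does not need to be installed.

```python
from importlib.util import resolve_name

# Package that contains llm_embed/modeling.py (taken from the file path above).
PACKAGE = "paddlenlp.transformers.llm_embed"

# One leading dot means "this package"; each extra dot climbs one level.
wrong = resolve_name("..transformers.model_outputs", PACKAGE)
right = resolve_name("..model_outputs", PACKAGE)

print(wrong)  # paddlenlp.transformers.transformers.model_outputs
print(right)  # paddlenlp.transformers.model_outputs
```

The two dots already land on paddlenlp.transformers, so repeating "transformers" in the path points at a module that does not exist.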

DrownFish19 · 2024-12-24T02:26:55Z

paddlenlp/transformers/nv_embed/__init__.py

+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.


Same as above: add from .modeling import * here.

DrownFish19 · 2024-12-24T02:27:20Z

paddlenlp/transformers/nv_embed/modeling_nv.py

@@ -0,0 +1,517 @@
+# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.


This file needs to be renamed to modeling.py.

DrownFish19 · 2024-12-24T02:28:44Z

slm/pipelines/examples/contrastive_training/evaluation/eval_mteb.py

 from mteb import MTEB

 from paddlenlp.peft import LoRAConfig, LoRAModel
 from paddlenlp.transformers import AutoModel, AutoModelForCausalLM, AutoTokenizer
+from paddlenlp.transformers.llm_embed.modeling import BiEncoderModel
+from paddlenlp.transformers.nv_embed.modeling_nv import NVEncodeModel


The imports here can be simplified:

from paddlenlp.transformers import BiEncoderModel, NVEncodeModel
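The shorter import depends on each package's __init__.py re-exporting the level below it, which is what the requested from .modeling import * lines provide. The sketch below illustrates the mechanism with a throwaway package; toy_nlp and its layout are hypothetical, and it assumes the parent transformers/__init__.py also star-imports the new subpackage, which this thread does not show.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical toy layout mirroring paddlenlp/transformers/llm_embed/.
FILES = {
    "toy_nlp/__init__.py": "",
    "toy_nlp/transformers/__init__.py": "from .llm_embed import *\n",
    "toy_nlp/transformers/llm_embed/__init__.py": "from .modeling import *\n",
    "toy_nlp/transformers/llm_embed/modeling.py": (
        "__all__ = ['BiEncoderModel']\n"
        "class BiEncoderModel:\n"
        "    pass\n"
    ),
}

# Write the toy package into a temporary directory and make it importable.
root = Path(tempfile.mkdtemp())
for rel_path, source in FILES.items():
    target = root / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# The short import works because each __init__.py re-exports the level below.
from toy_nlp.transformers import BiEncoderModel

print(BiEncoderModel.__module__)  # toy_nlp.transformers.llm_embed.modeling
```

Declaring __all__ in modeling.py keeps the star import from leaking helper names into the parent namespaces.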

DrownFish19 · 2024-12-24T02:28:58Z

slm/pipelines/examples/contrastive_training/train.py


 from paddlenlp.peft import LoRAConfig, LoRAModel
 from paddlenlp.trainer import PdArgumentParser, Trainer, get_last_checkpoint, set_seed
 from paddlenlp.transformers import AutoTokenizer
+from paddlenlp.transformers.llm_embed.modeling import BiEncoderModel
+from paddlenlp.transformers.nv_embed.modeling_nv import NVEncodeModel


Li-Z-Q · 2024-12-25T11:21:48Z

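Lint problems can be fixed by installing pre-commit and formatting the code. Steps for reference:

# Install
pip install pre-commit

# Register pre-commit in the project folder; the code is then formatted on every commit
pre-commit install

# Format previously committed files individually
pre-commit run --files XXXX.py

I ran pre-commit before committing, following the steps you described.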

DrownFish19 · 2024-12-25T11:32:06Z

paddlenlp/transformers/nv_embed/modeling.py

+            dtype=str(self.latents.weight.dtype).split(".")[-1],
+        )
+        self_latents_weight_T = self.latents(one).T
+        latents = repeat(self_latents_weight_T, "d h -> b d h", b=last_hidden_states.shape[0])


Replace the einops call with Paddle operations.
https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/einsum_cn.html

Changed to:

latents = paddle.tile(self_latents_weight_T, repeat_times=last_hidden_states.shape[0]).reshape(
    self_latents_weight_T.shape[0], last_hidden_states.shape[0], self_latents_weight_T.shape[1]
)
latents = latents.transpose([1, 0, 2])
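That the tile, reshape, transpose sequence reproduces the einops pattern "d h -> b d h" can be checked on small arrays. The sketch below uses NumPy as a stand-in so it runs without Paddle; it assumes paddle.tile, reshape, and transpose follow the same layout rules as their NumPy counterparts, and the sizes are hypothetical.

```python
import numpy as np

b, d, h = 3, 4, 5  # hypothetical batch size and latent shape
x = np.arange(d * h).reshape(d, h)  # stands in for self_latents_weight_T

# Same three steps as the reply: tile along the last axis,
# reshape to (d, b, h), then swap the first two axes.
tiled = np.tile(x, b).reshape(d, b, h).transpose(1, 0, 2)

# What repeat(x, "d h -> b d h", b=b) produces: b stacked copies of x.
expected = np.broadcast_to(x, (b, d, h))

print(tiled.shape)                      # (3, 4, 5)
print(np.array_equal(tiled, expected))  # True
```

As far as the Paddle documentation describes it, repeat_times is a list, tuple, or Tensor, so spelling it as [last_hidden_states.shape[0]] may be the safer form. Adding a leading axis and expanding it to the batch size would presumably give the same result with less reshaping, though that variant is not tested here.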

DrownFish19 · 2024-12-25T11:32:33Z

paddlenlp/transformers/nv_embed/modeling.py

+        k = kv[:, :, : self.config.max_position_embeddings]
+        v = kv[:, :, self.config.max_position_embeddings :]
+
+        q, k, v = map(lambda t: rearrange(t, "b n (h d) -> b n h d", h=self.config.num_key_value_heads), (q, k, v))


Please replace rearrange with Paddle operators.

DrownFish19 · 2024-12-25T11:32:45Z

paddlenlp/transformers/nv_embed/modeling.py

+        # v.stop_gradient = False
+        # out = paddle.nn.functional.scaled_dot_product_attention(q, k, v) # if use this, must set k and v stop_gradient to False
+        out = scaled_dot_product_attention(q, k, v)  # if use this, no need to manually set k and v
+        out = rearrange(out, "b n h d -> b n (h d)", h=self.config.num_key_value_heads)


Same change as above.
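Both rearrange patterns in these two comments only split or merge the last axis, so a plain reshape is enough. A minimal sketch follows; the helper names split_heads and merge_heads are hypothetical, and NumPy arrays stand in for tensors. The helpers rely only on .shape and .reshape with a list argument, which paddle.Tensor is assumed to accept in the same way.

```python
import numpy as np


def split_heads(t, num_heads):
    """Equivalent of rearrange(t, "b n (h d) -> b n h d", h=num_heads)."""
    b, n, hd = t.shape
    return t.reshape([b, n, num_heads, hd // num_heads])


def merge_heads(t):
    """Equivalent of rearrange(t, "b n h d -> b n (h d)")."""
    b, n, h, d = t.shape
    return t.reshape([b, n, h * d])


# NumPy stand-in data; the sizes are hypothetical.
q = np.arange(2 * 3 * 8).reshape(2, 3, 8)
q_heads = split_heads(q, num_heads=4)

print(q_heads.shape)                            # (2, 3, 4, 2)
print(np.array_equal(merge_heads(q_heads), q))  # True
```

Because no axes are reordered, neither direction needs a transpose, and merging after splitting returns the original tensor.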

move modeling.py and modeling_nv.py to transformers

226a298

paddle-bot bot added the contributor label Dec 23, 2024

paddle-bot bot assigned KB-Ding Dec 23, 2024

DrownFish19 added the Beijing Innovation Consortium label Dec 24, 2024

DrownFish19 assigned DrownFish19 and unassigned KB-Ding Dec 24, 2024

DrownFish19 reviewed Dec 24, 2024

Li-Z-Q added 2 commits December 24, 2024 16:50

move modeling.py and modeling_nv.py to transformers

bfc29f2

move modeling.py and modeling_nv.py to transformers

bce5e1c

DrownFish19 reviewed Dec 25, 2024

Li-Z-Q added 2 commits December 26, 2024 19:23

move modeling.py and modeling_nv.py to transformers and replace einops

460fb56

move modeling.py and modeling_nv.py to transformers and simplify import path

86a05c3
