
[LLM] Add deepseekv2 #9061

Open · wants to merge 17 commits into develop

Conversation

DrownFish19 (Collaborator)

PR types

New features

PR changes

Models

Description

Add DeepSeekV2.
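
For reviewers' context, a minimal usage sketch of the new model through PaddleNLP's Auto classes. The checkpoint name `deepseek-ai/DeepSeek-V2-Lite` and the generation arguments are illustrative assumptions, not something this PR guarantees:

```python
from paddlenlp.transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint id; the weights actually registered by this PR may differ.
model_name = "deepseek-ai/DeepSeek-V2-Lite"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, dtype="bfloat16")

# "pd" returns Paddle tensors; generation argument names may vary across PaddleNLP versions.
inputs = tokenizer("Hello, DeepSeek-V2!", return_tensors="pd")
output_ids, _ = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
```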


paddle-bot bot commented Aug 30, 2024

Thanks for your contribution!


codecov bot commented Aug 30, 2024

Codecov Report

Attention: Patch coverage is 14.64435% with 816 lines in your changes missing coverage. Please review.

Project coverage is 52.91%. Comparing base (8212b53) to head (c33429e).

Files with missing lines                               Patch %   Missing lines
paddlenlp/transformers/deepseek_v2/modeling.py          14.15%       764 ⚠️
paddlenlp/transformers/deepseek_v2/configuration.py     13.04%        40 ⚠️
paddlenlp/transformers/deepseek_v2/tokenizer_fast.py    29.41%        12 ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9061      +/-   ##
===========================================
- Coverage    53.26%   52.91%   -0.35%     
===========================================
  Files          652      655       +3     
  Lines       105615   106571     +956     
===========================================
+ Hits         56254    56394     +140     
- Misses       49361    50177     +816     

☔ View full report in Codecov by Sentry.

Comment on lines 161 to 171
if version != "0.0.0" and version <= "2.5.2":
    attn_output, attn_weights = flash_attention(
        query_states,
        key_states,
        value_states,
        causal=True,
        return_softmax=output_attentions,
    )
    attn_output *= (head_dim ** (0.5)) * softmax_scale
    attn_weights *= (head_dim ** (0.5)) * softmax_scale
else:

Let's drop this branch; it should no longer be needed. It can be cleaned up together with the other legacy-version checks.
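
A hedged sketch of what the cleaned-up path could look like once the `<= 2.5.2` compatibility branch is removed: the custom `softmax_scale` is folded into `query_states` up front instead of rescaling the outputs afterwards. The helper name `mla_flash_attention` and this scaling approach are illustrative assumptions, not code from this PR:

```python
from paddle.nn.functional.flash_attention import flash_attention


def mla_flash_attention(query_states, key_states, value_states,
                        softmax_scale, head_dim, output_attentions=False):
    # flash_attention applies the default 1/sqrt(head_dim) scaling internally,
    # so a custom softmax_scale can be absorbed into the queries before the call
    # (assumption) rather than rescaling attn_output / attn_weights afterwards.
    query_states = query_states * (head_dim ** 0.5) * softmax_scale
    attn_output, attn_weights = flash_attention(
        query_states,
        key_states,
        value_states,
        causal=True,
        return_softmax=output_attentions,
    )
    return attn_output, attn_weights
```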

        if config.sequence_parallel and use_sequence_parallel:
            mark_as_sequence_parallel_parameter(self.weight)

    def forward(self, hidden_states):

Consider adding support for fuse_rms_norm here.
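
A minimal sketch of the norm layer this comment refers to, written as plain-Paddle RMSNorm with a note on where a fused kernel could be dispatched. The class name and the idea of gating the fused path behind a config flag are assumptions for illustration, not necessarily this PR's API:

```python
import paddle
import paddle.nn as nn


class DeepseekV2RMSNorm(nn.Layer):
    """Reference (unfused) RMSNorm; a fused rms-norm kernel could be dispatched
    in forward() when a config flag such as use_fused_rms_norm (hypothetical
    name) is enabled, per the review suggestion."""

    def __init__(self, hidden_size, eps=1e-6):
        super().__init__()
        self.variance_epsilon = eps
        self.weight = paddle.create_parameter(
            shape=[hidden_size],
            dtype=paddle.get_default_dtype(),
            default_initializer=nn.initializer.Constant(1.0),
        )

    def forward(self, hidden_states):
        # A fused kernel would replace the math below; this is the fallback path.
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.astype("float32")  # compute in fp32 for stability
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * paddle.rsqrt(variance + self.variance_epsilon)
        return (self.weight * hidden_states).astype(input_dtype)
```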


CLAassistant commented Sep 19, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ DrownFish19
❌ Mangodadada
You have signed the CLA already but the status is still pending? Let us recheck it.

Labels: None yet
Projects: None yet
4 participants