Extremely long time taken for comparison of codes #10

Open
jackswl opened this issue Aug 30, 2024 · 2 comments

Comments


jackswl commented Aug 30, 2024

Hi all,

Thanks for the wonderful work.

I am currently running code_bert_score to evaluate the similarity between generated code and the 'correct' code. However, it takes way too long to run locally. Is there a way to speed it up (e.g. using a GPU) on macOS? Can you let me know where I can optimize the code, or whether there are specific settings I need to change for it to run faster? Thanks!

import code_bert_score
import pandas as pd

rp_values = [1]

for rp in rp_values:
    CSV_PATH = f'xxx'
    codebertdf = pd.read_csv(CSV_PATH)

    codebertdf['generated_output'] = codebertdf['generated_output'].str.strip()

    predictions = codebertdf['generated_output'].tolist()
    refs = codebertdf['actual_output'].tolist()
    
    # Calculate BERT scores
    P, R, F1, F3 = code_bert_score.score(cands=predictions, refs=refs, lang='python')
    
    # Add scores to DataFrame
    codebertdf['P'] = P
    codebertdf['R'] = R
    codebertdf['F3'] = F3
    codebertdf['F1'] = F1
    
    # Export DataFrame
    codebertdf.to_csv(f'/xxx', index=False)
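One option worth trying before the scoring call above is to detect the best available PyTorch backend at runtime and pass it along. This is a minimal sketch: `pick_device` and `detect_device` are hypothetical helper names, and it assumes a PyTorch build recent enough to expose `torch.backends.mps.is_available()` and that `code_bert_score.score` forwards a `device` keyword to the underlying bert_score machinery (worth verifying against the installed version).

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a PyTorch device string, preferring CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; the import is local so the
    pure selection logic above can be used without torch installed."""
    import torch
    return pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
```

Usage would then look like `code_bert_score.score(cands=predictions, refs=refs, lang='python', device=detect_device())` — again assuming the `device` keyword is accepted; if MPS is reported unavailable, only the CPU path is possible on that machine.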
@jackswl jackswl changed the title Comparison of generated codes and ground truth codes take way too long Extremely long time taken for comparison of codes Aug 30, 2024
@urialon
Collaborator

urialon commented Aug 30, 2024 via email

@jackswl
Author

jackswl commented Aug 30, 2024

Hi @urialon, great work again.

Yeah, using Google Colab (with a CUDA GPU) will speed it up. However, I intend to run it locally on my MacBook.

Are there any settings I should change in my code above for code_bert_score to tap into my MacBook's M3 chip? The time taken to compute under 'mps' and 'cpu' is the same. Or is running the comparison not possible on Apple silicon?

Thanks!
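If the 'mps' and 'cpu' timings are identical, one way to check whether the device argument is actually taking effect is to time the scoring call on a small batch under each device. The sketch below is illustrative: `time_call` is a hypothetical helper, and the idea is to pass `code_bert_score.score` (with a small slice of the predictions/refs and a `device=` keyword, assuming that keyword is honored) as the callable.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Return the best wall-clock time, in seconds, over `repeats`
    runs of fn(*args, **kwargs). Best-of-N reduces noise from
    one-off warm-up costs such as model loading."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best
```

For example, comparing `time_call(code_bert_score.score, cands=predictions[:16], refs=refs[:16], lang='python', device='cpu')` against the same call with `device='mps'` should show a clear gap if MPS is really engaged; identical numbers would suggest the scorer is silently falling back to CPU.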
