Skip to content

Latest commit

 

History

History
171 lines (159 loc) · 5.12 KB

File metadata and controls

171 lines (159 loc) · 5.12 KB

Results of Code LLama 34B on HumanEval/133

Date: 01 September, 2023
Model: codellama-34b-instruct
Temperature: 0.2
Passing runs: 0
Total runs: 10
Success rate: 0%

Prompt

def sum_squares(lst):
    """You are given a list of numbers.
    You need to return the sum of squared numbers in the given list,
    round each element in the list to the upper int(Ceiling) first.
    Examples:
    For lst = [1,2,3] the output should be 14
    For lst = [1,4,9] the output should be 98
    For lst = [1,3,5,7] the output should be 84
    For lst = [1.4,4.2,0] the output should be 29
    For lst = [-2.4,1,1] the output should be 6
    

    """

The above is an incomplete Python code fragment. Return the complete and correct code with no additional text.

Correct solution

def sum_squares(lst):
    """You are given a list of numbers.
    You need to return the sum of squared numbers in the given list,
    round each element in the list to the upper int(Ceiling) first.
    Examples:
    For lst = [1,2,3] the output should be 14
    For lst = [1,4,9] the output should be 98
    For lst = [1,3,5,7] the output should be 84
    For lst = [1.4,4.2,0] the output should be 29
    For lst = [-2.4,1,1] the output should be 6
    

    """import math
    squared = 0
    for i in lst:
        squared += math.ceil(i)**2
    return squared

Unit tests

def check(candidate):

    # Check some simple cases
    assert candidate([1,2,3])==14, "This prints if this assert fails 1 (good for debugging!)"
    assert candidate([1.0,2,3])==14, "This prints if this assert fails 1 (good for debugging!)"
    assert candidate([1,3,5,7])==84, "This prints if this assert fails 1 (good for debugging!)"
    assert candidate([1.4,4.2,0])==29, "This prints if this assert fails 1 (good for debugging!)"
    assert candidate([-2.4,1,1])==6, "This prints if this assert fails 1 (good for debugging!)"

    assert candidate([100,1,15,2])==10230, "This prints if this assert fails 1 (good for debugging!)"
    assert candidate([10000,10000])==200000000, "This prints if this assert fails 1 (good for debugging!)"
    assert candidate([-1.4,4.6,6.3])==75, "This prints if this assert fails 1 (good for debugging!)"
    assert candidate([-1.4,17.9,18.9,19.9])==1086, "This prints if this assert fails 1 (good for debugging!)"


    # Check some edge cases that are easy to work out by hand.
    assert candidate([0])==0, "This prints if this assert fails 2 (also good for debugging!)"
    assert candidate([-1])==1, "This prints if this assert fails 2 (also good for debugging!)"
    assert candidate([-1,1,0])==2, "This prints if this assert fails 2 (also good for debugging!)"

Runs

❌ Run 1

Start time: 2023-09-01 02:46:26
Duration: 2.75s
Result: Fail

def sum_squares(lst):
    return sum(int(x)**2 for x in lst)

❌ Run 2

Start time: 2023-09-01 02:49:23
Duration: 2.2s
Result: Fail

def sum_squares(lst):
    return sum(int(x)**2 for x in lst)

❌ Run 3

Start time: 2023-09-01 02:52:22
Duration: 4.35s
Result: Fail

def sum_squares(lst):
    return sum(int(x)**2 for x in lst)

❌ Run 4

Start time: 2023-09-01 02:55:13
Duration: 11.91s
Result: Fail

def sum_squares(lst):
    return sum(int(x)**2 for x in lst)

❌ Run 5

Start time: 2023-09-01 02:58:12
Duration: 4.98s
Result: Fail

def sum_squares(lst):
    return sum(int(x)**2 for x in lst)

❌ Run 6

Start time: 2023-09-01 03:01:17
Duration: 9.1s
Result: Fail

def sum_squares(lst):
    """You are given a list of numbers.
    You need to return the sum of squared numbers in the given list,
    round each element in the list to the upper int(Ceiling) first.
    Examples:
    For lst = [1,2,3] the output should be 14
    For lst = [1,4,9] the output should be 98
    For lst = [1,3,5,7] the output should be 84
    For lst = [1.4,4.2,0] the output should be 29
    For lst = [-2.4,1,1] the output should be 6
    

    """
    return sum(int(x)**2 for x in lst)

❌ Run 7

Start time: 2023-09-01 03:04:14
Duration: 14.3s
Result: Fail

def sum_squares(lst):
    """You are given a list of numbers.
    You need to return the sum of squared numbers in the given list,
    round each element in the list to the upper int(Ceiling) first.
    Examples:
    For lst = [1,2,3] the output should be 14
    For lst = [1,4,9] the output should be 98
    For lst = [1,3,5,7] the output should be 84
    For lst = [1.4,4.2,0] the output should be 29
    For lst = [-2.4,1,1] the output should be 6
    

    """
    return sum(int(x)**2 for x in lst)

❌ Run 8

Start time: 2023-09-01 03:07:18
Duration: 2.95s
Result: Fail

def sum_squares(lst):
    return sum(int(x)**2 for x in lst)

❌ Run 9

Start time: 2023-09-01 03:10:20
Duration: 2.83s
Result: Fail

def sum_squares(lst):
    return sum(int(x)**2 for x in lst)

❌ Run 10

Start time: 2023-09-01 03:13:19
Duration: 4.18s
Result: Fail

def sum_squares(lst):
    return sum(int(x)**2 for x in lst)