hugo-quantmetry/pandas-timeout

pdtimeout

Generate TimeOut errors with pandas.apply

Basic usage: raise a TimeOut error

  • Define a pandas DataFrame
import pandas as pd
df = pd.DataFrame({'number': [1, 0.5, 0.2, 2, 0.3]}, index=[0, 2, 4, 6, 8])
index  number
0      1
2      0.5
4      0.2
6      2
8      0.3
  • Define a function to apply on the DataFrame and set the timeout value
import time

@timeout(4)
def sleep_and_triple(row):
    number_ = row['number']

    time.sleep(number_)

    return number_ * 3

This function sleeps for number seconds, then returns three times the input value. Since the largest number is 2, a timeout of 4 seconds does not trigger a TimeOut error.

  • Apply function on DataFrame
df['result'] = df.apply(sleep_and_triple, axis=1)
print(df)
index  number  result
0      1       3
2      0.5     1.5
4      0.2     0.6
6      2       6
8      0.3     0.9
  • Change the timeout value to 1.7 seconds and re-apply function on DataFrame
@timeout(1.7)
def sleep_and_triple(row):
    number_ = row['number']

    time.sleep(number_)

    return number_ * 3

df['result'] = df.apply(sleep_and_triple, axis=1)
print(df)

The following TimeOut error is triggered:

>>> "TimeoutError: ('Time expired', 'occurred at index 6')"

The error message reports the index label (as used with pandas .loc) of the row that triggered the TimeOut error. Here the row with index 6 sleeps for 2 seconds, which exceeds the 1.7-second timeout, so the error is raised.
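To make the mechanism concrete, here is a minimal sketch of how a timeout decorator of this kind could work. It is not the pdtimeout implementation: the name timeout_sketch is hypothetical, the thread-based approach is an assumption, and rows are plain dicts so the example runs without pandas.

```python
import concurrent.futures
import functools
import time


def timeout_sketch(seconds):
    """Hypothetical decorator: raise TimeoutError if a call exceeds `seconds`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Run the call in a worker thread so the caller can stop waiting.
            executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
            future = executor.submit(func, *args, **kwargs)
            try:
                return future.result(timeout=seconds)
            except concurrent.futures.TimeoutError:
                raise TimeoutError("Time expired") from None
            finally:
                # Do not block on the worker; it finishes in the background.
                executor.shutdown(wait=False)
        return wrapper
    return decorator


@timeout_sketch(0.3)
def sleep_and_triple(row):
    number_ = row["number"]
    time.sleep(number_)
    return number_ * 3


fast = sleep_and_triple({"number": 0.01})  # finishes within the limit

try:
    sleep_and_triple({"number": 0.8})      # exceeds the limit
    timed_out = False
except TimeoutError:
    timed_out = True
```

One limitation of this sketch is that the worker thread is abandoned rather than interrupted, so a slow call keeps running in the background after the error is raised.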

Return a default value in case of timeout

@timeout(1, replace_value='TimeOut')
def sleep_and_triple(row):
    number_ = row['number']

    time.sleep(number_)

    return number_ * 3

df['result'] = df.apply(sleep_and_triple, axis=1)
print(df)
index  number  result
0      1       TimeOut
2      0.5     1.5
4      0.2     0.6
6      2       TimeOut
8      0.3     0.9
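The fallback behaviour shown above can be pictured as catching the timeout error and substituting a value. The sketch below is an illustration only, not the library's code: with_fallback and triple_or_time_out are hypothetical names, and the timeout is simulated by raising TimeoutError directly for rows of 1 or more.

```python
import functools


def with_fallback(replace_value):
    """Hypothetical decorator: return `replace_value` when a call times out."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except TimeoutError:
                return replace_value
        return wrapper
    return decorator


@with_fallback("TimeOut")
def triple_or_time_out(row):
    # Stand-in for a timed call: pretend rows of 1 or more exceed the limit.
    if row["number"] >= 1:
        raise TimeoutError("Time expired")
    return row["number"] * 3


rows = [{"number": n} for n in (1, 0.5, 0.2, 2, 0.3)]
results = [triple_or_time_out(row) for row in rows]
```

Note that a string replacement value mixed with numeric results gives the pandas column the object dtype; a numeric sentinel such as float('nan') keeps the column numeric.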

Return the execution time for each row

The time_apply decorator can be used to measure the execution time of each row, as follows:

@time_apply()
def sleep_and_triple(row):
    number_ = row['number']

    time.sleep(number_)

    return number_ * 3

df['result'] = df.apply(sleep_and_triple, axis=1)
print(df)
index  number  result
0      1       1.000667
2      0.5     0.500193
4      0.2     0.205290
6      2       2.005164
8      0.3     0.301278

The function's return value (number * 3) is replaced by the execution time, in seconds, of the associated row.
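A timing decorator with this behaviour can be sketched in a few lines. This is an assumption about the general technique, not the pdtimeout source: time_apply_sketch is a hypothetical name, and the row is a plain dict so the example runs without pandas.

```python
import functools
import time


def time_apply_sketch():
    """Hypothetical decorator: return the elapsed time instead of the result."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            func(*args, **kwargs)  # the result is discarded on purpose
            return time.perf_counter() - start
        return wrapper
    return decorator


@time_apply_sketch()
def sleep_and_triple(row):
    number_ = row["number"]
    time.sleep(number_)
    return number_ * 3


elapsed = sleep_and_triple({"number": 0.05})  # seconds spent in the call
```

The measured time is slightly larger than the sleep duration because it also includes call overhead, which matches the small excess visible in the table above.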

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
