Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better error handling for calls to Lua functions from Python #189

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

guidanoli
Copy link
Contributor

@guidanoli guidanoli commented Jul 22, 2021

Problem 1: error objects and stack tracebacks

Currently, when you call a Lua function from Python, it uses the debug.traceback message handler, which adds the stack traceback to the error message string. This pollutes the original error message. In these circumstances, if you are handling Lua errors from Python, you need to search for a substring instead of a cleaner equality check. So, we need to keep the error object intact. Well, how are we going to add the Lua traceback to the LuaError exception? Well, we can convert the Lua traceback (which is obtainable via the Lua debug library) into Python traceback objects and link them nicely.

Solution: add a message handler that creates a Python exception according to the error object and adds a Python stack traceback extracted from the Lua debug library (lua_getstack and lua_getinfo). When calling Python functions from Lua, the exception information (obtained from sys.exc_info) is stored inside an instance of the (new) _PyException class, which is wrapped in a Lua userdatum.

>>> lua.execute('error("spam")')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>                                  # <<< Python traceback
  File "lupa/_lupa.pyx", line 335, in lupa._lupa.LuaRuntime.execute
  File "lupa/_lupa.pyx", line 1669, in lupa._lupa.run_lua
  File "lupa/_lupa.pyx", line 1683, in lupa._lupa.call_lua
  File "lupa/_lupa.pyx", line 1708, in lupa._lupa.execute_lua_call
  File "lupa/_lupa.pyx", line 1651, in lupa._lupa.py_from_lua_error
  File "[C]", line 1, in <module>                                      # <<< Lua traceback
  File "[string "<python>"]", line 1, in <module>
lupa._lupa.LuaError: [string "<python>"]:1: spam

Problem 2: error re-raising is not re-entrant

Currently, when you call a Python function from Lua, and it raises a Python exception, it is converted to a Lua error and stored in _raised_exception inside the LuaRuntime instance. It is easy to see that this solution is not re-entrant, that is, it doesn't work for arbitrarily recursive calls between Lua and Python. So, instead of storing exception information (which includes the stack traceback) in the LuaRuntime instance, we need to propagate exception information via the error object, which is unique to each protected call in Lua.

Solution: Handle _PyException and BaseException instances raised from protected calls to Lua functions from Python.

#   1         2            3          4
>>> lua.eval('python.eval("lua.eval(\'python.eval([[0/0]])\')")')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lupa/_lupa.pyx", line 327, in lupa._lupa.LuaRuntime.eval
    return run_lua(self, b'return ' + lua_code, args)
  File "lupa/_lupa.pyx", line 1669, in lupa._lupa.run_lua
    return call_lua(runtime, L, args)                                # <<< Lua call (1)
  File "lupa/_lupa.pyx", line 1683, in lupa._lupa.call_lua
    return execute_lua_call(runtime, L, len(args))
  File "lupa/_lupa.pyx", line 1708, in lupa._lupa.execute_lua_call
    py_from_lua_error(runtime, L, result_status)
  File "lupa/_lupa.pyx", line 1651, in lupa._lupa.py_from_lua_error
    raise pyexc.etype, pyexc.value, pyexc.traceback
  File "lupa/_lupa.pyx", line 1879, in lupa._lupa.py_call_with_gil
    return call_python(runtime, L, py_obj)                             # <<< Python call (2)
  File "lupa/_lupa.pyx", line 1866, in lupa._lupa.call_python
    result = f(*args, **kwargs)
  File "<string>", line 1, in <module>
  File "lupa/_lupa.pyx", line 327, in lupa._lupa.LuaRuntime.eval
    return run_lua(self, b'return ' + lua_code, args)
  File "lupa/_lupa.pyx", line 1669, in lupa._lupa.run_lua
    return call_lua(runtime, L, args)                                # <<< Lua call (3)
  File "lupa/_lupa.pyx", line 1683, in lupa._lupa.call_lua
    return execute_lua_call(runtime, L, len(args))
  File "lupa/_lupa.pyx", line 1708, in lupa._lupa.execute_lua_call
    py_from_lua_error(runtime, L, result_status)
  File "lupa/_lupa.pyx", line 1651, in lupa._lupa.py_from_lua_error
    raise pyexc.etype, pyexc.value, pyexc.traceback
  File "lupa/_lupa.pyx", line 1879, in lupa._lupa.py_call_with_gil
    return call_python(runtime, L, py_obj)                             # <<< Python call (4)
  File "lupa/_lupa.pyx", line 1866, in lupa._lupa.call_python
    result = f(*args, **kwargs)
  File "<string>", line 1, in <module>
ZeroDivisionError: division by zero

Problem 3: clearing the stack

I never understood why Lupa clears the stack before it calls a Lua function from Python or vice versa. The Lua stack can be indexed either from the bottom (positive) and from the top (negative), which makes manipulating only the top n values very easy.

Solution: Use negative indices to navigate through the top-most values in the Lua stack.

Problem 4: type checking Python objects from Lua

Thanks to python.builtins.type the user is able to check the type of Python objects from Lua. However, this does not tell whether the object is a wrapped Python object or not. Ergo, python.builtins.type(nil) and python.builtins.type(python.none) output the same type, NoneType.

Solution: Add python.is_object for checking if a Lua value is a wrapped Python object or not

>>> lua.eval('python.is_object(nil)')
False
>>> lua.eval('python.is_object(python.none)')
True

Additional changes

  • Add python.is_error for checking if a Lua value is a wrapped _PyException instance
>>> lua.execute('''
... local ok, err = pcall(python.eval, '0/0')
... assert(not ok, "raises an error")
... assert(python.is_error(err), "raises Python error")
... return err.etype, err.value, err.traceback''')
(<class 'ZeroDivisionError'>, ZeroDivisionError('division by zero'), <traceback object at 0x7f5eba484e80>)
  • Restore the original lock_runtime (before Safer handling of the Lua stack #188) and add try_lock_runtime (returns boolean)
  • Simplify _LuaObject.__dealloc__ and _LuaIter.__dealloc__ (it's OK to call luaL_unref with LUA_NOREF or LUA_REFNIL)
  • Check the stack of the main Lua thread before calling lua_xmove in resume_lua_thread
  • Add tests for new features and adjust old tests for new behaviour of error handling
  • Add documentation for error handling in README

@guidanoli
Copy link
Contributor Author

guidanoli commented Jul 22, 2021

@scoder I think these are really important changes, could you review it?

* Added section in README about handling errors
* LuaError raised in Lua is not "unpacked", but kept as _PyException to
preserve traceback and original value
Copy link
Owner

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice.

One issue is that it seems to become less clear when exceptions and Python frames (which can hold on to arbitrarily large amounts of data) are getting cleaned up – and if at all. Cannot say if that's a real problem in practice.

lupa/_lupa.pyx Outdated Show resolved Hide resolved
Comment on lines +1639 to +1640
elif isinstance(err, BaseException):
raise err
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behaviour wasn't there before, right? Why should there be a Python exception on the stack? And why should we special-case it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we only reach here if result is not 0, that means that an error occurred and that the error object is on top of the stack. In this case, the error object is a wrapped Python object. We check if it is a base exception so we can simply re-raise it. This accounts for the following use case:

-- in Lua
function raiseKeyError(s) error(python.builtins.KeyError(s)) end

... from Python ...

lua.globals().raiseKeyError("something")  # raises KeyError("something")

lupa/_lupa.pyx Outdated Show resolved Hide resolved
"__file__": filename,
"__lupa_exception__": exc,
}
code = compile("\n" * (lineno - 1) + "raise __lupa_exception__", filename, "exec")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a fairly heavy operation, especially when done multiple times. Can't we reuse the result somehow?

Cython creates its own code+frame objects as well, using the C-API. I think we can do it similarly. Note that only the frame really needs to know the line number in the end, less so the code object.

Copy link
Contributor Author

@guidanoli guidanoli Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, the __Pyx_AddTraceback function does something with Frame and Code objects. I will look into that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I took a look at the code generated by Cython, and it's very messy. I think this solution in pure Python is more portable (although not optimal). Could we leave this improvement for a future PR? It's not a big performance bottleneck because, unless someone is constantly throwing and catching errors in a tight loop, code objects and frames will be built on rare occasions.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a related discussion going on on python-dev, so I asked and there seems to be general interest in improving the C-API for this. Not something for right now, but once that's available and used (and backported) by Cython, Lupa (and probably several other projects that handle foreign languages, DSLs, templates, …) could benefit from it.

https://mail.python.org/archives/list/[email protected]/thread/ZWTBR5ESYR26BUIVMXOKPFRLGGYDJSFC/#XYNNMH57O7CYWHYKTD3ELZTM3B4M53HL

lupa/tests/test.py Outdated Show resolved Hide resolved
@guidanoli
Copy link
Contributor Author

One issue is that it seems to become less clear when exceptions and Python frames (which can hold on to arbitrarily large amounts of data) are getting cleaned up – and if at all. Cannot say if that's a real problem in practice.

So, I've tested the garbage collector of Lua and Python submitted to many errors with the following script.
It throws 1000 Lua errors, convert them to Python (with frames and code objects). Then it displays a graph of the number of Python objects and allocated space by Lua.

import gc
import lupa

pygcdata = []
luagcdata = []

def checkgc():
    luagccount = checkluagc()
    while gc.collect() != 0:
        # collect garbage from Python and Lua
        luagccount = checkluagc()
    # print Python object count
    pygcdata.append(len(gc.get_objects()))
    luagcdata.append(luagccount*1024)

def showgcdata():
    import matplotlib.pyplot as plt
    plt.title('GC utilization by Lua and Python')
    plt.plot(pygcdata, label='Python (#objects)')
    plt.plot(luagcdata, label='Lua (bytes)')
    plt.legend()
    plt.show()

lua = lupa.LuaRuntime()

checkluagc = lua.eval('''function()
    while true do
        before = collectgarbage('count')
        collectgarbage()
        after = collectgarbage('count')
        if before == after then
            return after
        end
    end
end''')

s = 'error()'

checkgc()

for i in range(1000):
    try: lua.eval(s)
    except: pass
    finally: checkgc()

showgcdata()

Here are the results: Imgur
We have detected no memory leakage whatsoever.

@guidanoli
Copy link
Contributor Author

guidanoli commented Sep 3, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants