Can it support Inspector protocol? #63

Open
Taiung opened this issue May 15, 2024 · 8 comments

Comments

@Taiung

Taiung commented May 15, 2024

I want to debug JavaScript code while Python executes it. I looked through some of the V8 documentation, but I am not familiar with C++ and cannot implement this myself. Could this feature be added in a future release?

@bpcreech
Owner

bpcreech commented May 25, 2024

Hey, interesting idea!

The v8 inspector protocol is pretty extensive! I wonder if you could describe what kinds of things you want to do with it? (E.g., print values, change values, pause execution, profile memory, ...) Also, what kind of interface were you thinking of (E.g., just expose a JSON sendInspectorMessage and onInspectorMessage, or something more user-friendly?)
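To make the "raw JSON" option concrete, here is a minimal sketch of what such an interface could look like from the Python side. The `InspectorSession` class and the `send_raw` callable are hypothetical and not part of PyMiniRacer; they stand in for a possible sendInspectorMessage / onInspectorMessage pair. Only the message shapes (requests carry `id`, `method`, and `params`; responses echo the `id`; events have no `id`) follow the inspector protocol.

```python
import itertools
import json
from typing import Any, Callable, Dict, List, Optional


class InspectorSession:
    """Hypothetical helper; PyMiniRacer does not provide this class."""

    def __init__(self, send_raw: Callable[[str], None]) -> None:
        # send_raw stands in for a hypothetical sendInspectorMessage(json_text).
        self._send_raw = send_raw
        self._ids = itertools.count(1)
        self._pending: Dict[int, Callable[[Dict[str, Any]], None]] = {}
        self.events: List[Dict[str, Any]] = []

    def send(
        self,
        method: str,
        params: Optional[Dict[str, Any]] = None,
        on_result: Optional[Callable[[Dict[str, Any]], None]] = None,
    ) -> int:
        """Serialize one request and remember who wants the response."""
        msg_id = next(self._ids)
        if on_result is not None:
            self._pending[msg_id] = on_result
        message = {"id": msg_id, "method": method, "params": params or {}}
        self._send_raw(json.dumps(message))
        return msg_id

    def on_message(self, raw: str) -> None:
        """Stand-in for a hypothetical onInspectorMessage(json_text) hook."""
        message = json.loads(raw)
        if "id" in message:
            # A response: route it to the callback registered for that id.
            callback = self._pending.pop(message["id"], None)
            if callback is not None:
                callback(message.get("result", {}))
        else:
            # An event such as Debugger.paused: just record it here.
            self.events.append(message)
```

A caller would send `Debugger.enable` first and then react to `Debugger.paused` events; a friendlier API could be layered on top of this later.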

It might be noteworthy that as of PyMiniRacer v0.12.0, you can now make callbacks from JavaScript to Python, which enables crude "print debugging", like this:

$ python
>>> from py_mini_racer import MiniRacer
>>> ctx = MiniRacer()
>>> async def log(s):
...   print(s)
...
>>> async def run_my_code():
...   async with ctx.wrap_py_function(log) as log_js:
...     ctx.eval('this')['log'] = log_js
...     ctx.eval('for (let i = 0; i < 10; i++) { log(i); }')
...
>>> import asyncio
>>> asyncio.run(run_my_code())
0
1
2
3
4
5
...

@Taiung
Author

Taiung commented Jul 12, 2024

I'm very sorry for the late reply. What I'm hoping for is to debug the JS code with Chrome DevTools and pause the program where needed, similar to PyCharm's debug mode.

I'm not sure whether my description is clear. This feature would be great for developers, and I hope you will consider it.

@christian2022

@bpcreech I have looked at your example, and my problem is that the wrapped function is not called immediately, but only when the async context manager exits:

$ python
>>> from py_mini_racer import MiniRacer
>>> ctx = MiniRacer()
>>> async def log(s):
...   print(s)
...
>>> async def run_my_code():
...   async with ctx.wrap_py_function(log) as log_js:
...     ctx.eval('this')['log'] = log_js
...     print('before loop')
...     ctx.eval('for (let i = 0; i < 10; i++) { log(i); }')
...     print('after loop')
...   print('after async with')
...
>>> import asyncio
>>> asyncio.run(run_my_code())
before loop
after loop
0
1
2
...
9
after async with

I would expect after loop to be printed after 9, not before 0. I assume these callbacks are somehow stuck in a loop and only get executed when the context is cleaned up.
Is that a bug, or am I overlooking something?

@bpcreech
Owner

Ah, hmm, I think that's either a bug or an expectation gap in the code. Because the JS code doesn't await the return value of log, whether the callbacks complete before the context manager exits is basically a race. If you add await asyncio.sleep(0.5) before the async context manager exits, it does print in the expected order.

If we wanted more deterministic logging we'd need to do something like:

import asyncio
from py_mini_racer import MiniRacer

ctx = MiniRacer()

async def log(s):
  print(s)

async def run_my_code():
  async with ctx.wrap_py_function(log) as log_js:
    ctx.eval('this')['log'] = log_js
    print('before loop')
    # Awaiting the combined promise ensures every log() call has completed.
    # Spreading keys() into an array keeps .map available on engines
    # that lack iterator helpers.
    await ctx.eval('Promise.all([...Array(10).keys()].map(i => log(i)))')
    print('after loop')
  print('after async with')

asyncio.run(run_my_code())
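
The ordering effect discussed above can be reproduced with plain asyncio, without PyMiniRacer. The sketch below is only an analogy: it assumes the wrapped callbacks behave like tasks that are scheduled on the event loop but not awaited, which is an assumption and not taken from the PyMiniRacer source. Output is collected in a list instead of printed so the order is easy to inspect.

```python
import asyncio


async def log(out, value):
    # Yield to the event loop once, as a real async callback typically would.
    await asyncio.sleep(0)
    out.append(value)


async def fire_and_forget():
    """Schedule the callbacks without awaiting them (like the JS for-loop)."""
    out = []
    tasks = [asyncio.create_task(log(out, i)) for i in range(3)]
    out.append("after loop")  # runs before any task has had a chance to start
    await asyncio.gather(*tasks)  # cleanup, comparable to leaving the context
    return out


async def awaited():
    """Wait for every callback first (like the Promise.all variant)."""
    out = []
    await asyncio.gather(*(log(out, i) for i in range(3)))
    out.append("after loop")
    return out
```

`asyncio.run(fire_and_forget())` yields "after loop" first and the numbers afterwards, while `asyncio.run(awaited())` yields the numbers first.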

@christian2022

I'm not sure Promise.all even guarantees processing in the proper sequence. Anyway, if you want to bind e.g. console.log (or any other sync function) to Python, you cannot use an async function, as sync-over-async is bound to fail. To make that work properly, I think two variants of wrap_py_function are needed: one for async and one for sync.

@bpcreech
Owner

bpcreech commented Jul 24, 2024

The sequence isn't relevant in this example since the values are ignored. The purpose here is simply to ensure the function is completed.

You are right that this is not a drop-in replacement for console.log. It is, however, a functioning way to get logs!

The trouble with a sync wrap_py_function is that it would very quickly deadlock when the wrapped function attempted to call back into V8 for anything, including even unpacking an object or array.
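
The deadlock hazard can be illustrated with a loose analogy in plain Python. The sketch assumes the engine handles all work on a single thread, modeled here by a one-worker executor; that model is an assumption for illustration and not a description of PyMiniRacer's internals. A synchronous callback running on that thread asks the same thread for more work and waits for it, which can never finish while the callback is still running. A timeout is used so the example terminates instead of hanging.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def demo_sync_callback_deadlock():
    # One worker stands in for a single engine thread (an assumption).
    engine = ThreadPoolExecutor(max_workers=1)

    def unpack_object():
        # Stand-in for "call back into the engine", e.g. to read an object.
        return {"x": 1}

    def sync_callback():
        # Runs ON the engine thread and waits for more engine work,
        # which cannot start until this function returns.
        inner = engine.submit(unpack_object)
        try:
            return inner.result(timeout=0.2)
        except FutureTimeout:
            return "would deadlock"

    outcome = engine.submit(sync_callback).result()
    engine.shutdown()  # lets the queued inner task finish before returning
    return outcome
```

Without the timeout, `sync_callback` would wait forever. An async callback avoids this because it returns a promise right away and frees the engine thread.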

@christian2022

I came here because I was trying to intercept document.write. I'm parsing web pages with BeautifulSoup and executing the script nodes with MiniRacer. As soon as one of the scripts uses document.write, I have to update the document object in Python and execute any added script nodes. These need to be executed before the next script in the original document, so I cannot wait until the async context manager exits its scope.
Any idea?

@bpcreech
Owner

Neat use case! You are sort of building a headless web browser. :)

So we want to serialize operations here while avoiding the world of multithreaded recursion... IIUC we want our Python outer loop (which is running BeautifulSoup) to ensure any calls to document.write have completed before it continues to the next part of the document. Is it possible to create a JS function which calls the Python document writer and stores the promise in a JS global, and use that function as document.write? Then the Python outer loop can find that global and await all the promises in it before proceeding to processing the next HTML fragment.
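
A rough, untested sketch of that idea follows. It uses only the PyMiniRacer calls that already appear in this thread (wrap_py_function, assigning to ctx.eval('this'), and awaiting a promise returned by ctx.eval). The names pendingWrites, pyDocumentWrite, py_document_write, and run_scripts are hypothetical, and the Python-side handler just records the HTML where a real one would update the BeautifulSoup document.

```python
import asyncio

# JS shim: document.write forwards to Python and keeps the returned promise.
# pendingWrites and pyDocumentWrite are hypothetical names.
DOCUMENT_WRITE_SHIM = """
var pendingWrites = [];
var document = {
  write: function (html) { pendingWrites.push(pyDocumentWrite(html)); }
};
"""

written = []  # stand-in for mutating the real document in Python


async def py_document_write(html):
    # A real handler would update the parsed document here.
    written.append(html)


async def run_scripts(scripts):
    # Imported lazily so the rest of this sketch loads without PyMiniRacer.
    from py_mini_racer import MiniRacer

    ctx = MiniRacer()
    async with ctx.wrap_py_function(py_document_write) as write_js:
        ctx.eval('this')['pyDocumentWrite'] = write_js
        ctx.eval(DOCUMENT_WRITE_SHIM)
        for script in scripts:
            ctx.eval(script)
            # splice(0) empties the array and returns its former contents,
            # so each script's writes are awaited before the next script runs.
            await ctx.eval('Promise.all(pendingWrites.splice(0))')
```

It would be driven with something like `asyncio.run(run_scripts(["document.write('<p>hi</p>')"]))`. Script nodes added by a write would still need to be queued for execution by the Python loop.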


3 participants