
spectool: Execute JSON test cases #245

Open · wants to merge 7 commits into master

Conversation

@tomusdrw commented Jan 2, 2025

This PR attempts to implement the main_test function from spectool.

  1. Most of the content of main.rs was moved to a new generate.rs file to avoid cluttering the main file.
  2. A test.rs module was created to implement execution of the JSON test vectors.

Some open issues:

  1. While running a bunch of generated test vectors from here, I've noticed that the interpreter may hang instead of going OOG.
  2. I'm not sure whether the way I'm handling the memory input is correct (see the setup_memory function) - I didn't find a more generic API for this.

Example program that hangs: https://pvm-debugger.netlify.app/#/load?program=0x0000015101 (BRANCH_EQ_IMM).
Edit: Apologies - the program has an extremely large gas value and hits an edge case in parameter parsing. Our implementations were rejecting it, while polkavm correctly attempts to execute it.
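For context, here is a minimal sketch of what the memory-input handling discussed above might look like. The `MemoryChunk` shape and the `setup_memory` signature are hypothetical illustrations, not the actual PolkaVM or spectool API, which works with page-based memory:

```rust
/// Hypothetical description of a pre-initialized memory region,
/// roughly as a JSON test vector might specify it.
struct MemoryChunk {
    address: u32,
    contents: Vec<u8>,
}

/// Sketch: copy test-vector chunks into a flat memory image,
/// rejecting chunks that fall outside the available memory.
fn setup_memory(memory: &mut [u8], chunks: &[MemoryChunk]) -> Result<(), String> {
    for chunk in chunks {
        let start = chunk.address as usize;
        let end = start
            .checked_add(chunk.contents.len())
            .ok_or("address overflow")?;
        if end > memory.len() {
            return Err(format!("chunk at {:#x} is out of bounds", chunk.address));
        }
        memory[start..end].copy_from_slice(&chunk.contents);
    }
    Ok(())
}

fn main() {
    let mut memory = vec![0u8; 16];
    let chunks = [MemoryChunk { address: 4, contents: vec![1, 2, 3] }];
    setup_memory(&mut memory, &chunks).unwrap();
    assert_eq!(&memory[4..7], &[1u8, 2, 3]);
    println!("memory initialized: {:?}", memory);
}
```

The real implementation has to map such chunks onto whatever memory API the VM exposes, which is exactly the open question raised in the PR description.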

@koute (Collaborator) commented Jan 6, 2025

Thanks for the PR!

Hmm... I would want to at least wait for #238 to get in before merging this, although, that said, it might be better to not actually merge this here and instead finally start on w3f/jamtestvectors#7?

@tomusdrw (Author) commented Jan 6, 2025

I think waiting for #238 makes a lot of sense; that's actually the only context in which I'm interested in the runner.

A common test harness definitely makes sense; however, I still think that a JSON-based test case specification is and will be valuable. I imagine the following workflow (correct me if I'm missing something):

  1. Detect some mismatching behaviour using a more complex fuzzing / multi-PVM test harness.
  2. Prepare a JSON test case for this as an easily shareable "test case specification".
  3. Use the (already existing) PVM-A and PVM-B JSON test runners to debug the JSON test case.

Hence, being able to run that single JSON case on PolkaVM is what this PR enables - contrast that with spinning up presumably more complex test harness machinery just to debug a single test case.
However, I might be biased towards solutions that already exist, and long term this might not be needed at all, since we will figure out a better spec that can be used to reproduce the test.

@koute (Collaborator) commented Jan 6, 2025

Common test harness definitely makes sense, however I still think that JSON-based test case specification is and will be valuable.

Well, isn't this orthogonal to having a test harness? (: The harness can still take the JSON test cases, just potentially run them on any PVM, not only on PolkaVM. In other words - I don't see why we should limit this to only PolkaVM; anyone interested should be able to use exactly the same code to load up the tests and run them on their PVM if they so desire. This wouldn't necessarily make it much more complex if we wrapped the PolkaVM API in a trait (which could have the same or a very similar API to PolkaVM).
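To illustrate the trait idea in the comment above: the harness would program against a small trait rather than PolkaVM's concrete types, so any PVM can be plugged in. All names below (`Pvm`, `Outcome`, `ToyPvm`, `run_case`) are hypothetical sketches, not the actual PolkaVM API, and real PVMs report much richer execution status:

```rust
/// Hypothetical execution outcome; a real PVM distinguishes
/// halt, trap, out-of-gas, page fault, host call, etc.
#[derive(Debug, PartialEq)]
enum Outcome {
    Halt,
    OutOfGas,
}

/// Hypothetical minimal interface the harness would target;
/// the real PolkaVM API would be wrapped in something like this.
trait Pvm {
    fn load_program(&mut self, code: &[u8]);
    fn run(&mut self, gas: u64) -> Outcome;
}

/// Toy implementation: "executes" one byte of code per unit of gas.
struct ToyPvm {
    code_len: u64,
}

impl Pvm for ToyPvm {
    fn load_program(&mut self, code: &[u8]) {
        self.code_len = code.len() as u64;
    }
    fn run(&mut self, gas: u64) -> Outcome {
        if gas >= self.code_len { Outcome::Halt } else { Outcome::OutOfGas }
    }
}

/// The harness only sees the trait, so swapping in another PVM
/// implementation requires no changes to the test-running code.
fn run_case(pvm: &mut dyn Pvm, code: &[u8], gas: u64) -> Outcome {
    pvm.load_program(code);
    pvm.run(gas)
}

fn main() {
    let mut pvm = ToyPvm { code_len: 0 };
    assert_eq!(run_case(&mut pvm, &[0x00, 0x01, 0x51], 100), Outcome::Halt);
    assert_eq!(run_case(&mut pvm, &[0x00, 0x01, 0x51], 2), Outcome::OutOfGas);
    println!("trait-based harness sketch ok");
}
```

With this shape, the JSON loader and the comparison logic live in the harness once, and each PVM only supplies a trait implementation.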

@tomusdrw (Author) commented Jan 6, 2025

Well, isn't this orthogonal to having a test harness?

Yes! :)

The harness can still take the JSON test cases, just potentially run them on any PVM, not only on PolkaVM.

That's what I meant by "presumably more complex test harness machinery" :) But I agree it doesn't necessarily have to be more complex.

My assumption is that most PVM implementers already have a setup to load JSON test cases for their PVM, and my point is that it's currently missing for PolkaVM. As mentioned, I'm probably biased towards grabbing the low-hanging fruit, since all the other parts already exist, but if you'd rather focus on the test harness development, I'll humbly follow too :) I'll add some discussion points to the harness issue to push the ball a bit.

@koute (Collaborator) commented Jan 6, 2025

My assumption is that most PVM implementers already have a setup to load JSON test cases for their PVM, and my point is that it's currently missing for PolkaVM.

Hm, I guess yeah, this is a good point. (:

But since PolkaVM itself is used to generate those test vectors, having such a loader is not as useful for it as it is for other PVMs. Of course it's still useful to guard against regressions, but it's probably more sensible to just go all the way and start on the harness, since we do want to have one anyway, and we will want to integrate PolkaVM into it natively for fast differential fuzzing.

@tomusdrw (Author) commented

@koute let me know if/when you'd like me to update this to the latest master, or rather close it in favour of w3f/jamtestvectors#7.
In case you missed my response in the other thread, I've put together https://github.com/FluffyLabs/pvm-test-harness and I'm also curious if you think it's a step in the right direction.

@koute (Collaborator) commented Jan 22, 2025

In case you missed my response in the other thread, I've put together https://github.com/FluffyLabs/pvm-test-harness and I'm also curious if you think it's a step in the right direction.

Yes, I'm most likely going to contribute there.

Could you hook up your PVM in there too?
