A problem with testing failures. Proposed solution: change the `fail` keyword so it can be used with expressions.
#961
Replies: 9 comments
-
If the described problem can be solved by using `expect`, then in turn I will give a counterexample that demonstrates that `expect` cannot check the expression that causes the exception. The first example works, because the validator doesn't cause an exception but simply results in `False`:

```aiken
fn validator1(value: Int) {
  value == 0
}

test must_fail_1() {
  let one = 2 - 1
  expect False = validator1(one)
  True
}
```

The second example has problems (the test fails), which shows that `expect` doesn't work on failing expressions:

```aiken
fn validator2(denominator: Int) {
  let x = 1 / denominator
  x == 1
}

test must_fail_2() {
  let denominator = 10 - 10
  expect False = validator2(denominator)
  True
}
```

Results:
I kindly encourage you to carefully read the "Why it is a problem?" section from above to see the problem.
-
@perseus-algol I do not understand either; what problem are you trying to highlight here? As far as I can tell, the behavior is expected in both cases. Maybe you are confused by the semantics at play here; your reading of them is not quite accurate. More specifically, to be considered successful a test must first and foremost return `True` (a valid expression). An error (a.k.a. a failure) is not a valid result.
-
If I go back to your original description, it seems that what you want is to ensure separation of fixture code from test logic. That is possible through the use of zero-arg functions in Aiken, which are evaluated during compilation. So you could write the preparation as a zero-arg function; a failing preparation would then result in a compiler panic, because it is evaluated at compile time.

It's not as pretty as it could be, but it would likely solve your problem?
-
@KtorZ, thank you for elaborating on this topic. In your example the preparation causes a compiler panic. So in order to check a failing case, we actually need to follow this pattern:

```aiken
fn validator_fn(denominator: Int) {
  let x = 1 / denominator
  x == 1
}

fn calc(a: Int, b: Int) {
  a / b
}

fn test1_fixtures() {
  calc(1, 0)
}

test test1() fail {
  let f = test1_fixtures()
  validator_fn(f)
}
```

Therefore, to test a case where we expect a failure, we must extract all preparation into a zero-arg function (so it will be evaluated at compile time), then make a test function marked with `fail` that calls the fixture and the validator.

For me that pattern doesn't make for a convenient dev experience, honestly. Something similar to try/catch (for tests only, of course) would be better, IMO (see the proposal in the 1st message).
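The pattern can be sketched in Python for comparison (a hypothetical `run_must_fail_test` helper, illustrative only): the fixture is evaluated first, standing in for Aiken's compile-time evaluation of zero-arg functions, so a broken fixture surfaces as its own error instead of silently making the must-fail test pass.

```python
def validator_fn(denominator: int) -> bool:
    x = 1 // denominator
    return x == 1

def calc(a: int, b: int) -> int:
    return a // b

def run_must_fail_test(fixture, body) -> str:
    """Hypothetical helper mirroring the Aiken pattern: the fixture is
    evaluated first (in Aiken, at compile time, where a broken fixture
    panics the build); only an error in the body counts as a pass."""
    f = fixture()            # a broken fixture raises here, before the test body runs
    try:
        body(f)
    except ArithmeticError:
        return "pass"        # the body failed, as the must-fail test expects
    return "fail"            # the body did not fail

# Fixture deliberately broken (1 / 0): the error surfaces during preparation,
# not as a spurious "pass" of the must-fail test.
try:
    outcome = run_must_fail_test(lambda: calc(1, 0), validator_fn)
except ZeroDivisionError:
    outcome = "fixture error"

print(outcome)  # → fixture error
```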
-
Honestly, I am not sure what problem you have exactly with fixture code causing a test to fail. If your fixture is not working as expected, then the test as a whole is not working as expected. So I don't see any problem with the current behavior, especially in the example you gave: if you're dividing by 0, then surely your test fails, although I agree, maybe not for the reasons you originally expected. But this is why traces are useful; primitives like `expect` will leave a trace that lets you know whether your assertion failed or not.

Unfortunately, try/catch or something of that nature is not possible on the Plutus Virtual Machine. Errors cannot exist at the value level. There are various CIPs on the topic if you're interested in digging deeper. So that's not a problem we can solve at the Aiken level, other than in the way I pointed out above, if you really want to separate preparation code from your test logic.
-
While we found a way to test failing cases, the DevX and convenience questions remain. In other languages, among the assertions there is `assertThrows`, which ensures that a passed function throws an exception. This is how we could write the test from the above example if we had the same feature, using the proposed `fail` expression:

```aiken
test test1() {
  let arg = calc(0, 1)
  fail validator_fn(arg)
}
```

But since we don't have such a keyword/feature, we must extract fixtures into a zero-arg function like so:

```aiken
fn test1_fixtures() {
  calc(0, 1)
}

test test1() fail {
  let f = test1_fixtures()
  validator_fn(f)
}
```

So you can see the difference: the first example, with the proposed `fail` expression, is more direct. Currently, developers must always extract the fixture into a zero-arg function for every failing test. Because this pattern is not documented and not too obvious, developers will most likely write it in the seemingly natural way:

```aiken
test test1() fail {
  // Here we get an exception during fixture preparation, which will make the test PASS.
  // That is a mistake, because the validator wasn't even called and the error occurred before checking it.
  let arg = calc(1, 0)
  validator_fn(arg)
}
```

And that will be a mistake (which could cost a lot in the world of crypto). A programming language should help you write correct code. In the case of testing failures, it is currently easy to write incorrect code. And this can be improved in the way I described in the first message.
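The false positive described above can be demonstrated with a small Python model (a hypothetical `run_fail_marked_test` runner, illustrative only) of the current `fail` semantics, where an error anywhere in the test body counts as success:

```python
def validator_fn(denominator: int) -> bool:
    x = 1 // denominator
    return x == 1

def calc(a: int, b: int) -> int:
    return a // b

def run_fail_marked_test(test_body) -> str:
    """Models the current whole-test `fail` semantics: the test passes
    if ANY expression in the body raises."""
    try:
        test_body()
    except Exception:
        return "pass"   # an error anywhere in the body counts as success
    return "fail"

def natural_but_wrong():
    arg = calc(1, 0)          # raises during preparation...
    return validator_fn(arg)  # ...so the validator is never called

# The test "passes" even though the validator was never exercised.
print(run_fail_marked_test(natural_but_wrong))  # → pass
```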
I think that this is fine. My proposal does not affect the Plutus Virtual Machine; it could be implemented at the Aiken compiler level. Suppose that we have the following test with two `fail` expressions:

```aiken
test test1() {
  let arg = calc(0, 1)
  fail validator_fn_1(arg)
  fail validator_fn_2(arg)
  arg == 0 // fail doesn't have to be the last expression; we can write more expressions after it
}
```

Then it should be compiled into:

```aiken
fn test1_fixtures() {
  calc(0, 1)
}

test test1_1() fail {
  let arg = test1_fixtures()
  validator_fn_1(arg)
}

test test1_2() fail {
  let arg = test1_fixtures()
  validator_fn_2(arg)
}

test test1_3() {
  let arg = test1_fixtures()
  arg == 0
}
```

I think this is possible to achieve (without touching the Plutus VM), which would make Aiken more convenient for developers and less error-prone.
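The intended desugaring semantics can be sketched in Python (illustrative only; `expect_raises` is a hypothetical helper standing in for a derived `fail`-marked test): each `fail` expression becomes its own must-fail test, and all derived tests share the extracted fixture.

```python
def calc(a: int, b: int) -> int:
    return a // b

def validator_fn_1(arg: int) -> bool:
    return 1 // arg == 1   # raises for arg == 0

def validator_fn_2(arg: int) -> bool:
    return 2 // arg == 2   # raises for arg == 0

def test1_fixtures() -> int:
    # Shared fixture, re-evaluated by each derived test
    # (in Aiken, a zero-arg function evaluated at compile time).
    return calc(0, 1)

def expect_raises(thunk) -> bool:
    # A derived must-fail test: passes only if the thunk raises.
    try:
        thunk()
    except Exception:
        return True
    return False

# The single proposed test with two `fail` expressions desugars into
# three independent tests:
results = {
    "test1_1": expect_raises(lambda: validator_fn_1(test1_fixtures())),
    "test1_2": expect_raises(lambda: validator_fn_2(test1_fixtures())),
    "test1_3": test1_fixtures() == 0,
}
print(results)  # → {'test1_1': True, 'test1_2': True, 'test1_3': True}
```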
-
I converted this item into a discussion. I understand where you're coming from, but the complexity that this introduces in the parser, the test framework, and everything down the line cannot be considered lightly. I am not convinced that the use case you highlight is sufficient to justify that complexity. I agree that it could be more convenient, but there are more pressing issues we have to address.

I'd also point out that it is generally seen as good practice (TDD) to see a test fail first. This would allow catching wrong preparation code that throws, instead of considering the test as failing. Also, a good unit test supposedly tests one and only one thing, so cases like the one shown above, where multiple validator functions are part of the same test, are in a way asking for trouble.

So, all in all, I'd like to leave the discussion open for anyone to comment and share experiences / use cases, and come back to it later if we agree to go down that path.
-
In my examples, incorrect preparation will fail.

Multiple validator calls were provided only to show the compilation logic; I'm not suggesting that as a good practice. So there is nothing wrong with my examples.

Understood.
-
Aiken's test framework is designed in a way that it has no `assert*` methods. Instead, a test is considered successful if it returns `True`, which is convenient for tests that are expected to succeed. For tests that must fail, Aiken provides the `fail` keyword.

**The problem**

The problem here is that the `fail` keyword is designed to mark the whole test. It is not possible to check one specific expression for failing.

**Why it is a problem?**

Let's consider a test that must fail. We mark it with the `fail` keyword. The test body could contain some preparation code and then the target expression itself. Our goal is to ensure that the target expression (and only it) fails; the preparation phase is expected to execute successfully. The problem with the `fail` keyword is that the whole test counts as failing if ANY expression fails, for example an `expect ...` expression in the preparation phase, which will mark the test as succeeded. And this is not what the developer expects. In the above example the test will be marked as successful while the test itself is damaged and the target expression was never even called (meaning the test is simply skipped).

Example of a damaged test, which is intended to test a failing case:

**Proposed solution**

Change the purpose and scope of the `fail` keyword:

- The `fail` keyword should check the expression following it and return true only if that expression fails.
- … `fail` keyword.

For example, it should be possible to write the following:

**Why this problem is important?**

For on-chain validators it is more important to test failing cases than succeeding ones, because we need to ensure that every possible non-standard or abnormal usage leads to a failure and prohibits spending.
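For comparison, the behaviour this proposal asks of `fail` is what xUnit-style frameworks provide via `assertThrows`/`assertRaises`: a preparation error fails (errors) the test, and only the explicitly wrapped expression is allowed to raise. A minimal sketch using Python's standard `unittest`:

```python
import unittest

def calc(a: int, b: int) -> int:
    return a // b

class FailKeywordAnalogy(unittest.TestCase):
    def test_target_expression_only(self):
        arg = calc(0, 1)      # preparation: an exception HERE would error the test...
        with self.assertRaises(ZeroDivisionError):
            calc(1, arg)      # ...only THIS expression is expected to raise

# Running the method directly: the wrapped expression raises, so the check passes.
FailKeywordAnalogy("test_target_expression_only").test_target_expression_only()
```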