API Suggestion: Arbitraries, Generators and Transformations #1

Open
jlink opened this issue Aug 9, 2024 · 4 comments

@jlink
Contributor

jlink commented Aug 9, 2024

Arbitraries and generators are fundamental concepts; so is transforming them into other arbitraries.
Jqwik 1 already has an API for that which works well, so I suggest keeping the API similar where appropriate
and making it simpler where possible.

Arbitrary and Generator

Let's start with the two fundamental interfaces:

public interface Arbitrary<T> {
    Generator<T> generator();
}

public interface Generator<T> {
    T generate(GenSource source);
}

Differences to Jqwik 1

  • Arbitrary.generator() no longer takes a genSize: This parameter turned out to be of little use in most cases and made caching of generators much more difficult.

  • Generator (formerly called RandomGenerator) can now generate values directly through the generate(GenSource source) method. The detour via a Shrinkable type is no longer necessary.

  • The concept of GenSource, which is new, will be discussed in other issues.
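To make the two interfaces a bit more tangible, here is a minimal sketch of how they might be used together. Since GenSource is not specified in this issue, the single-method GenSource below is purely a hypothetical stand-in, and the integers() and seeded() helpers are made up for illustration only.

```java
import java.util.Random;

// Hypothetical stand-in: the real GenSource is not defined in this issue.
interface GenSource {
    int nextInt(int bound); // assumed to return a value in [0, bound)
}

interface Generator<T> {
    T generate(GenSource source);
}

interface Arbitrary<T> {
    Generator<T> generator();
}

final class InterfacesSketch {

    // Hypothetical helper: an arbitrary for ints in [min, max] (small ranges only).
    static Arbitrary<Integer> integers(int min, int max) {
        Generator<Integer> gen = source -> min + source.nextInt(max - min + 1);
        // Without a genSize parameter the same generator instance can be handed out every time.
        return () -> gen;
    }

    // Hypothetical GenSource backed by a seeded java.util.Random.
    static GenSource seeded(long seed) {
        Random random = new Random(seed);
        return bound -> random.nextInt(bound);
    }

    public static void main(String[] args) {
        Arbitrary<Integer> dice = integers(1, 6);
        GenSource source = seeded(42L);
        for (int i = 0; i < 5; i++) {
            // Values come straight out of generate(); no Shrinkable wrapper in between.
            System.out.println(dice.generator().generate(source));
        }
    }
}
```

Because generator() takes no arguments in this sketch, returning the same instance on every call is trivial, which is the caching benefit mentioned in the first bullet.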

Standard Transformations

Mapping, filtering, flat-mapping etc. should work as before. The same is true for creating lists, sets etc.
Thus, there'll be a couple of default methods in Arbitrary:

public interface Arbitrary<T> {

	Generator<T> generator();

	default <R> Arbitrary<R> map(Function<T, R> mapper) { ... }

	default Arbitrary<T> filter(Predicate<T> filter) { ... }

	default <R> Arbitrary<R> flatMap(Function<T, Arbitrary<R>> mapper) { ... }

	default ListArbitrary<T> list() { ... }

	default SetArbitrary<T> set() { ... }
}
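As a rough illustration of how such default methods could be built on top of generator(), here is a sketch of map, filter and flatMap. It assumes a hypothetical one-method GenSource, a naive retry loop with an arbitrary limit of 10,000 misses for filter, and a made-up integers() helper; none of this is meant as the actual implementation.

```java
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical stand-in: the real GenSource is not defined in this issue.
interface GenSource {
    int nextInt(int bound); // assumed to return a value in [0, bound)
}

interface Generator<T> {
    T generate(GenSource source);
}

interface Arbitrary<T> {

    Generator<T> generator();

    default <R> Arbitrary<R> map(Function<T, R> mapper) {
        Generator<T> base = generator();
        return () -> source -> mapper.apply(base.generate(source));
    }

    default Arbitrary<T> filter(Predicate<T> filter) {
        Generator<T> base = generator();
        return () -> source -> {
            // Assumption: plain rejection sampling with a fixed miss limit.
            for (int tries = 0; tries < 10_000; tries++) {
                T candidate = base.generate(source);
                if (filter.test(candidate)) {
                    return candidate;
                }
            }
            throw new IllegalStateException("filter rejected too many values");
        };
    }

    default <R> Arbitrary<R> flatMap(Function<T, Arbitrary<R>> mapper) {
        Generator<T> base = generator();
        // Generate an outer value first, then generate from the arbitrary it maps to.
        return () -> source -> mapper.apply(base.generate(source)).generator().generate(source);
    }
}

final class TransformSketch {
    // Hypothetical helper: an arbitrary for ints in [min, max] (small ranges only).
    static Arbitrary<Integer> integers(int min, int max) {
        return () -> source -> min + source.nextInt(max - min + 1);
    }
}
```

With these in place, something like `TransformSketch.integers(0, 9).filter(n -> n % 2 == 0).map(n -> "n=" + n)` would compose as in Jqwik 1. Note that filter keeps the element type T and therefore needs no type parameter of its own.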

Filter or Include / Exclude?

Since the term filter is somewhat ambiguous, a few libraries have switched to providing include, exclude, filterIn, filterOut, or similar clarifying terms. Is this a clarification or does it go against expectations?
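To show what the more explicit naming could look like, here is a small sketch in which include keeps matching values and exclude drops them. The method names are taken from the question above; the GenSource stand-in, the retry limit and the implementation are assumptions for illustration only.

```java
import java.util.function.Predicate;

// Hypothetical stand-in: the real GenSource is not defined in this issue.
interface GenSource {
    int nextInt(int bound); // assumed to return a value in [0, bound)
}

interface Generator<T> {
    T generate(GenSource source);
}

interface Arbitrary<T> {

    Generator<T> generator();

    // Keeps only values for which the predicate holds.
    default Arbitrary<T> include(Predicate<T> predicate) {
        Generator<T> base = generator();
        return () -> source -> {
            // Assumption: plain rejection sampling with a fixed miss limit.
            for (int tries = 0; tries < 10_000; tries++) {
                T candidate = base.generate(source);
                if (predicate.test(candidate)) {
                    return candidate;
                }
            }
            throw new IllegalStateException("predicate rejected too many values");
        };
    }

    // Drops all values for which the predicate holds.
    default Arbitrary<T> exclude(Predicate<T> predicate) {
        return include(predicate.negate());
    }
}
```

With this pair, `digits.include(isEven)` and `digits.exclude(isEven)` read unambiguously, whereas `digits.filter(isEven)` leaves open whether even numbers are kept or removed.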

@jlink jlink changed the title from "API Suggestion: Creating and Combining Arbitraries" to "API Suggestion: Creating and Transforming Arbitraries" on Aug 9, 2024
@jlink jlink changed the title from "API Suggestion: Creating and Transforming Arbitraries" to "API Suggestion: Arbitraries, Generators and Transformations" on Aug 9, 2024
@duponter

The term filter is indeed ambiguous and should be replaced by a more distinct name that implies its intent more clearly.
The include - exclude pair is a good candidate, but I also came across select - reject which works for me.

@SimY4

SimY4 commented Aug 12, 2024

A thought-provoking message:

Why do you think arbitrary and generator should exist as two separate interfaces? They map one to one: one arbitrary summons one generator, and a generator can always be wrapped in an arbitrary.

This anachronism is being copied from Haskell over and over, but so far no value outside of Haskell has been demonstrated. In Haskell, Arbitrary is a type class and Gen is a function for producing new values. Arbitrary exists at the type level so that the compiler can summon the right generator.

In Java or Kotlin there are no type classes. Arbitrary is just an interface that adds one more call to get what you want, which is a Generator.
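The one-to-one mapping described here can be sketched with the two interfaces from the proposal. The GenSource stand-in and the wrap/unwrap helper names are hypothetical and only serve to illustrate the point.

```java
// Hypothetical stand-in: the real GenSource is not defined in this issue.
interface GenSource {
    int nextInt(int bound);
}

interface Generator<T> {
    T generate(GenSource source);
}

interface Arbitrary<T> {
    Generator<T> generator();
}

final class RoundTripSketch {

    // Any generator can be wrapped in an arbitrary with a one-line lambda...
    static <T> Arbitrary<T> wrap(Generator<T> generator) {
        return () -> generator;
    }

    // ...and getting it back is the single extra call mentioned above.
    static <T> Generator<T> unwrap(Arbitrary<T> arbitrary) {
        return arbitrary.generator();
    }
}
```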

@jlink
Contributor Author

jlink commented Aug 13, 2024

For me Arbitrary is the abstraction that can be transformed and combined. The actual generation of values is done by the framework within a property's validation lifecycle. So Generator could be hidden from the normal user; it's visible only for the rare cases where you want to generate data outside of a property validation.

Thinking about it, having sample() and sampleStream() would be enough, and the actual Generator would become an implementation detail.
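A sketch of what that could look like from the user's side: sample() and sampleStream() are the names from the comment above, while the GenSource stand-in and the use of an unseeded java.util.Random are assumptions made only for this example.

```java
import java.util.Random;
import java.util.stream.Stream;

// Hypothetical stand-in: the real GenSource is not defined in this issue.
interface GenSource {
    int nextInt(int bound); // assumed to return a value in [0, bound)
}

interface Generator<T> {
    T generate(GenSource source);
}

interface Arbitrary<T> {

    // Could become an implementation detail if users only call the methods below.
    Generator<T> generator();

    // Infinite stream of generated values; callers are expected to limit() it.
    default Stream<T> sampleStream() {
        Random random = new Random();
        GenSource source = bound -> random.nextInt(bound);
        Generator<T> gen = generator();
        return Stream.generate(() -> gen.generate(source));
    }

    // A single generated value, taken from a fresh stream.
    default T sample() {
        return sampleStream().findFirst().orElseThrow(IllegalStateException::new);
    }
}
```

Outside of a property, usage would then be along the lines of `dice.sample()` or `dice.sampleStream().limit(10)`, without ever touching a Generator or GenSource directly.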

@jlink
Contributor Author

jlink commented Aug 13, 2024

The disadvantage of hiding the generator is that it's no longer possible to implement a fully working arbitrary outside the core module, because the generator hides some necessary details, e.g. how to create edge cases and how to do exhaustive generation. Bringing those details into Arbitrary would IMO make the interface too messy.

So the split between Arbitrary and Generator is a separation of concerns. The open question for me is whether the concerns addressed in Generator should be public in the first place.
