Create a set of seed data #1318

mmurto · 2024-10-29T08:51:41Z

We should have an easily bootstrappable set of seed data that can be used for development and manual testing, and later automated testing. This would provide a good starting point for both backend and UI development to ensure a real (though quite small) database to test the UI, API and migrations against.

The data should include organizations, projects and repositories, and some runs for repositories that include issues, rule violations and vulnerabilities. The data should also include correct permissions in Keycloak for a test user to be able to use the data.

Some possible ways to create and maintain the data:

1. Create a script that executes API calls against the Docker Compose environment to first get a Keycloak token and then create the hierarchy items and runs for some known repositories. Adjust the rules to produce the needed rule violations against the known repositories.
2. Create a script that does the above but with ORT Server library functions or SQL.
3. Add a database dump of a good dataset to the repository that will automatically be included when starting the Docker Compose environment.
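
To make the first approach more concrete, here is a rough sketch of what such a script could look like. It is only an illustration under assumptions: the ports, the realm name, the API paths, the payload fields and the credentials are all hypothetical placeholders, not the actual ORT Server API. The only part taken from Keycloak itself is the general shape of the OpenID Connect token endpoint with the password grant.

```python
import json
import urllib.parse
import urllib.request

# Everything below is an assumption for illustration: adjust hosts, ports,
# realm, paths and payloads to whatever the Docker Compose setup really uses.
KEYCLOAK_TOKEN_URL = (
    "http://localhost:8081/realms/master/protocol/openid-connect/token"
)
API_BASE_URL = "http://localhost:8080/api/v1"


def fetch_token(username, password, client_id, opener=urllib.request.urlopen):
    """Request an access token from Keycloak using the password grant."""
    form = urllib.parse.urlencode({
        "grant_type": "password",
        "client_id": client_id,
        "username": username,
        "password": password,
    }).encode()
    # Passing `data` makes urllib send a POST request.
    request = urllib.request.Request(KEYCLOAK_TOKEN_URL, data=form)
    with opener(request) as response:
        return json.load(response)["access_token"]


def post_json(path, payload, token, opener=urllib.request.urlopen):
    """POST a JSON payload to the (hypothetical) API and return the JSON reply."""
    request = urllib.request.Request(
        API_BASE_URL + path,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with opener(request) as response:
        return json.load(response)


def seed(token, post=post_json):
    """Create one organization, project, repository and run, in that order.

    Each step uses the `id` returned by the previous one. The paths and
    field names are hypothetical placeholders.
    """
    org = post("/organizations", {"name": "Seed Organization"}, token)
    project = post(
        f"/organizations/{org['id']}/projects", {"name": "Seed Project"}, token
    )
    repo = post(
        f"/projects/{project['id']}/repositories",
        {"type": "GIT", "url": "https://example.com/seed-repo.git"},
        token,
    )
    run = post(f"/repositories/{repo['id']}/runs", {"revision": "main"}, token)
    return {
        "organization": org["id"],
        "project": project["id"],
        "repository": repo["id"],
        "run": run["id"],
    }


# Example usage against a running environment (placeholder credentials):
#   token = fetch_token("test-user", "test-password", "test-client")
#   print(seed(token))
```

The `post` and `opener` parameters are injectable so that the ordering logic can be exercised without a running environment. A real script would also need error handling and probably some idempotency, so that running it twice does not create duplicates.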

sschuberth · 2024-10-29T11:38:06Z

Maybe this could later also be made a part of #1319.

mmurto · 2024-10-29T11:39:52Z

Maybe this could later also be made a part of #1319.

What would be the use-case for that?

sschuberth · 2024-10-29T11:46:45Z

What would be the use-case for that?

I don't really get the question. What I was thinking aloud about was adding e.g. a create-test-data sub-command to the planned CLI that pretty much does what your point 2 above describes.

mmurto · 2024-10-29T11:52:17Z

What would be the use-case for that?

I don't really get the question. What I was thinking about aloud was to add e.g. a create-test-data sub-command to the planned CLI that pretty much does what your point 2. describes above.

I meant the use-case for having it in the CLI, which I guess boils down to asking when an end-user would want to seed an instance. IMO the main (maybe only) users of seed data are developers, and depending on the format, seed data can be relatively large in size, so I'm not sure if it makes sense to ship it to the CI runners.

sschuberth · 2024-10-29T12:49:59Z

I meant the use-case for having it in the CLI

It's not really about a "use-case", but about our convenience: The CLI already has the build infrastructure set up to consume ORT Server artifacts for programmatic use. That is the same infrastructure we'd need for a tool (or multiple tools) to create / seed test data.

will an end-user want to seed an instance.

Probably not, but I don't think it matters much to "hide" such capabilities in an end-user CLI. But maybe it does. Like I said, I was just thinking out loud.

seed data can be relatively large in size

But wouldn't our tool just implicitly create the (large parts of) seed data by creating runs, and not really ship with the data?

mmurto · 2024-10-29T12:57:23Z

seed data can be relatively large in size

But wouldn't our tool just implicitly create the (large parts of) seed data by creating runs, and not really ship with the data?

Depends on the approach, but agreed, if it's done through API calls rather than stored data like in approach 3, then the amount of data is not a lot. I'm not very familiar with Kotlin projects, but I'd guess even if it's wrapped in the CLI, it would be easy to call like git clone && docker compose up && ./gradlew cli seed or something like that?

sschuberth · 2024-10-29T13:17:14Z

./gradlew cli seed or something like that?

Something like that. Instead of involving Gradle, the CLI would be called like ort-server seed.

mmurto · 2024-10-29T13:24:38Z

./gradlew cli seed or something like that?

Something like that. Instead of involving Gradle, the CLI would be called like ort-server seed.

IMO it would be great for the seed command to work with whatever revision is currently checked out, without installing anything or adding anything to the path, so I think involving Gradle would be good here? As said, I'm not too familiar with Kotlin projects.

sschuberth · 2024-10-29T13:52:50Z

so I think involving Gradle would be good here?

Yes, implementing this via Gradle tasks is also possible, and probably preferable to putting it into a stand-alone CLI.

sschuberth added the enhancement label on Oct 29, 2024
