Decouple legacy entity tables from results queries #4342

JAORMX · 2024-09-02T13:46:56Z

Summary

This is another step in moving away from the central entity tables and
toward the new per-entity tables.

Fixes #4316

Change Type

Mark the type of change your PR introduces:

Bug fix (resolves an issue without affecting existing features)
Feature (adds new functionality without breaking changes)
Breaking change (may impact existing functionalities or require documentation updates)
Documentation (updates or additions to documentation)
Refactoring or test improvements (no bug fixes or new functionality)

Testing

Review Checklist:

Reviewed my own code for quality and clarity.
Added comments to complex or tricky code sections.
Updated any affected documentation.
Included tests that validate the fix or feature.
Checked that related changes are merged.

internal/controlplane/handlers_evalstatus.go

internal/entities/properties/service/service.go

internal/history/service.go

jhrozek

Great work; this is not easy code to modify. Regarding the code that references repoName and repoOwner separately: it looks like we either reconstruct the repo slug anyway or just return them in a generic map[string]string through the entityInfo variable, so hopefully it should be possible to use the name directly?

coveralls · 2024-09-03T11:49:22Z

coverage: 52.849% (-0.2%) from 53.06%
when pulling 720aac6 on results-entity-instances
into 5428799 on main.

internal/controlplane/handlers_profile.go

jhrozek · 2024-09-03T20:43:35Z

internal/controlplane/handlers_profile.go

+	entityInfo["entity_type"] = efp.Entity.Type.ToString()
+	entityInfo["entity_id"] = rs.EntityID.String()
+
+	// temporary: These will be replaced by entity_id


as well as these could be simplified?

JAORMX · Sep 4, 2024

I think we'll have a set of properties we output and a general way of displaying them, so this will be replaced with a provider-specific call IMO.

jhrozek · 2024-09-03T20:49:07Z

internal/controlplane/handlers_profile.go

+			artRepoOwner := efp.Properties.GetProperty(ghprop.ArtifactPropertyRepoOwner).GetString()
+			artRepoName := efp.Properties.GetProperty(ghprop.ArtifactPropertyRepoName).GetString()
+			if artRepoOwner != "" && artRepoName != "" {
+				repoPath = fmt.Sprintf("%s/%s", artRepoOwner, artRepoName)


Is this name ever going to be used in anything but presentation? I wonder if we should make the RepoOwner and RepoName properties not tied to the GitHub provider (and move them into constants.go, like we have properties.RepoPropertyIsPrivate). Then we could use this code for non-GitHub artifacts and maybe even make the code more generic by constructing new properties and calling provider.GetEntityName(REPOSITORY, theNewProperties).

JAORMX · Sep 4, 2024

Unfortunately, our output is heavily tied to GitHub; this is actually used to get the alert, which won't exist in other providers. I can't wait to deprecate that.

JAORMX · Sep 4, 2024

Answering your question, I still think it's a GitHub-specific attribute.
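
For readers following this thread, here is a minimal sketch of the slug construction shown in the diff above. It assumes a plain `map[string]string` in place of the real properties type; the key names and the `repoSlug` helper are hypothetical and only illustrate the "both parts must be present" check.

```go
package main

import "fmt"

// Hypothetical property keys. The real constants live in the GitHub
// provider's properties package and may be named differently.
const (
	propRepoOwner = "github/repo_owner"
	propRepoName  = "github/repo_name"
)

// repoSlug builds "owner/name" from a property map and reports whether both
// parts were present, mirroring the non-empty check in the diff.
func repoSlug(props map[string]string) (string, bool) {
	owner, name := props[propRepoOwner], props[propRepoName]
	if owner == "" || name == "" {
		return "", false
	}
	return fmt.Sprintf("%s/%s", owner, name), true
}

func main() {
	slug, ok := repoSlug(map[string]string{
		propRepoOwner: "example-org",
		propRepoName:  "example-repo",
	})
	fmt.Println(slug, ok)
}
```

If the owner and name keys were ever moved out of the provider-specific package, only the two constants in a helper like this would need to change.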

jhrozek · 2024-09-03T20:49:45Z

internal/db/eval_history_test.go

@@ -70,8 +70,6 @@ func TestListEvaluationHistoryFilters(t *testing.T) {
 				require.Equal(t, es1, row.EvaluationID)
 				require.Equal(t, EntitiesRepository, row.EntityType)
 				require.Equal(t, repo1.ID, row.EntityID)
-				require.Equal(t, repo1.RepoOwner, row.RepoOwner.String)
-				require.Equal(t, repo1.RepoName, row.RepoName.String)


why did we have to remove these?

JAORMX · Sep 4, 2024

Because otherwise we'd need to return the repo owner via a JOIN in the ListEvaluationHistory call, and the intent was to remove that. We should not rely on any entity-specific info.

jhrozek · 2024-09-03T20:56:34Z

internal/history/service.go

+	ehs := &evaluationHistoryService{
+		providerManager: providerManager,
+		propServiceBuilder: func(qtx db.ExtendQuerier) propertiessvc.PropertiesService {
+			return propertiessvc.NewPropertiesService(qtx)


I think it would be OK to just require the PropertiesService interface as a parameter, but the options don't hurt.

JAORMX · Sep 4, 2024

It was done for testing only.
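
The "builder overridable through options, for tests" idea can be sketched roughly as follows. All types here are simplified stand-ins (`Querier` for `db.ExtendQuerier`, a one-method `PropertiesService`), and `withPropertiesServiceBuilder` is a hypothetical option name, not necessarily the one used in the PR.

```go
package main

import "fmt"

// PropertiesService is a stand-in for the real interface.
type PropertiesService interface {
	Name() string
}

type realProps struct{}

func (realProps) Name() string { return "real" }

type fakeProps struct{}

func (fakeProps) Name() string { return "fake" }

// Querier stands in for db.ExtendQuerier in this sketch.
type Querier interface{}

type propsBuilder func(Querier) PropertiesService

type historyService struct {
	propServiceBuilder propsBuilder
}

// Option mutates the service at construction time.
type Option func(*historyService)

// withPropertiesServiceBuilder (hypothetical name) lets tests inject a fake.
func withPropertiesServiceBuilder(b propsBuilder) Option {
	return func(s *historyService) { s.propServiceBuilder = b }
}

// newHistoryService wires the production builder by default, then applies
// any overrides.
func newHistoryService(opts ...Option) *historyService {
	s := &historyService{
		propServiceBuilder: func(Querier) PropertiesService { return realProps{} },
	}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	prod := newHistoryService()
	test := newHistoryService(withPropertiesServiceBuilder(
		func(Querier) PropertiesService { return fakeProps{} },
	))
	fmt.Println(prod.propServiceBuilder(nil).Name())
	fmt.Println(test.propServiceBuilder(nil).Name())
}
```

The builder takes the querier as an argument so the properties service can be created inside whatever transaction the caller is running, which a plain interface parameter would not allow.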

jhrozek · 2024-09-03T20:58:07Z

internal/history/service.go

+			return nil, fmt.Errorf("error fetching entity for properties: %w", err)
+		}
+
+		err = propsvc.RetrieveAllPropertiesForEntity(ctx, efp)


This is potentially a lot of calls, isn't it? Do we actually need up-to-date data here? Couldn't we just fetch the entities and properties from cache regardless of whether it's expired or not?

JAORMX · Sep 4, 2024

We are at a weird point where we don't have everything in cache just yet. The idea is to fetch from cache if possible, and to fetch and persist the data if it isn't there. Perhaps I should instead add "options" to this call and rely on the cache even if the data is marked as stale. wdyt?
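
One way the proposed "options" could look is sketched below. This is an illustration only: the store, the `stale` flag, and the `AllowStale` option are assumptions made for the example, not the API that was eventually implemented.

```go
package main

import "fmt"

// cachedProps is a cached property set plus a staleness marker.
type cachedProps struct {
	values map[string]string
	stale  bool // set once the entry has outlived its refresh window
}

type retrieveOptions struct{ allowStale bool }

// RetrieveOption tweaks a single retrieve call.
type RetrieveOption func(*retrieveOptions)

// AllowStale (hypothetical) tells retrieve to accept stale cache entries.
func AllowStale() RetrieveOption {
	return func(o *retrieveOptions) { o.allowStale = true }
}

type propsStore struct {
	cache         map[string]cachedProps
	upstreamCalls int // counts simulated provider round trips
}

// fetchUpstream simulates a call to the provider.
func (s *propsStore) fetchUpstream(id string) map[string]string {
	s.upstreamCalls++
	return map[string]string{"name": "fresh-" + id}
}

// retrieve serves from cache when the entry is fresh, or stale but allowed;
// otherwise it fetches upstream and persists the result.
func (s *propsStore) retrieve(id string, opts ...RetrieveOption) map[string]string {
	var o retrieveOptions
	for _, opt := range opts {
		opt(&o)
	}
	if c, ok := s.cache[id]; ok && (!c.stale || o.allowStale) {
		return c.values
	}
	vals := s.fetchUpstream(id)
	s.cache[id] = cachedProps{values: vals}
	return vals
}

func main() {
	s := &propsStore{cache: map[string]cachedProps{
		"repo-1": {values: map[string]string{"name": "cached-repo-1"}, stale: true},
	}}
	viaCache := s.retrieve("repo-1", AllowStale())
	fmt.Println(viaCache["name"], s.upstreamCalls)
	refreshed := s.retrieve("repo-1")
	fmt.Println(refreshed["name"], s.upstreamCalls)
}
```

With this shape, a read-heavy caller such as the history listing could pass `AllowStale()` and only hit the provider for entities that are missing from the cache entirely.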

jhrozek · 2024-09-03T21:01:50Z

internal/entities/properties/service/service.go

+	}
+
+	fetchByProps, err := properties.NewProperties(map[string]any{
+		properties.PropertyName: ent.Name,


Not saying this is right or wrong and it's not for this PR, but I was wondering whether the way I coded up fetching properties is correct - for example for PRs you need the PR number, the repo owner and repo name. So the way the fetcher retrieves the information is either 1) those properties are in the lookByProps properties structure or 2) the fetcher parses the name.

I wonder if doing 2) is too "magic" and if a function like this one should rather just return all the cached properties and let the fetcher decide based on the properties and the type of the entity.
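
The two lookup paths described above can be sketched like this. The property keys and the `owner/repo/number` name format are assumptions made for the illustration; the real fetcher may use different keys and a different naming scheme.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Hypothetical property keys, used only for this illustration.
const (
	keyName     = "name"
	keyOwner    = "repo_owner"
	keyRepo     = "repo_name"
	keyPRNumber = "pr_number"
)

type prRef struct {
	Owner  string
	Repo   string
	Number int
}

// prFromProps is path 1: every identifying field is an explicit property.
func prFromProps(props map[string]string) (prRef, bool) {
	n, err := strconv.Atoi(props[keyPRNumber])
	if err != nil || props[keyOwner] == "" || props[keyRepo] == "" {
		return prRef{}, false
	}
	return prRef{Owner: props[keyOwner], Repo: props[keyRepo], Number: n}, true
}

// prFromName is path 2: recover the fields by parsing "owner/repo/number".
func prFromName(name string) (prRef, error) {
	parts := strings.Split(name, "/")
	if len(parts) != 3 || parts[0] == "" || parts[1] == "" {
		return prRef{}, fmt.Errorf("unexpected pull request name %q", name)
	}
	n, err := strconv.Atoi(parts[2])
	if err != nil {
		return prRef{}, fmt.Errorf("invalid pull request number in %q: %w", name, err)
	}
	return prRef{Owner: parts[0], Repo: parts[1], Number: n}, nil
}

// resolvePR prefers explicit properties and falls back to parsing the name.
func resolvePR(props map[string]string) (prRef, error) {
	if ref, ok := prFromProps(props); ok {
		return ref, nil
	}
	return prFromName(props[keyName])
}

func main() {
	ref, err := resolvePR(map[string]string{keyName: "example-org/example-repo/42"})
	fmt.Println(ref, err)
}
```

The "magic" concern is visible in `prFromName`: it only works as long as every caller agrees on the name format, whereas `prFromProps` fails explicitly when a field is missing.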

jhrozek · 2024-09-03T21:07:10Z

internal/entities/properties/service/service.go

+// RetrieveAllPropertiesForEntity fetches all properties for the given
+// EntityForProperties model. Note that properties will be updated in place.
+func (ps *propertiesService) RetrieveAllPropertiesForEntity(
+	ctx context.Context, efp *models.EntityForProperties,


This is fine, but do you think that in a follow-up we could change RetrieveAllProperties to accept EntityForProperties, or just an Entity that already has projectID, providerID and type, and do something like:

newProps, err := ps.RetrieveAllProperties(ctx, prov, entity, lookupProps)
efp := NewEntityFromPropertiesWithInstance(entity, newProps)

jhrozek · 2024-09-03T21:09:38Z

internal/entities/models/models.go

+	*EntityWithProperties
+
+	// Provider is the provider for the entity
+	Provider provifv1.Provider


Why keep track of the provider and not just use the providerID in the EntityWithProperties? I wonder if the cleanest way would be to just pass the providerID to the propertyService, let the propertyService structure contain a providerManager and always instantiate the provider...

JAORMX · Sep 4, 2024

Good observation. I was just trying to make it easier to instantiate everything. But this could actually be moved into RetrieveAllPropertiesForEntity so we would not need the new wrapper. Let me refactor this.
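
A rough sketch of the suggested shape, where the entity carries only a provider ID and the properties service instantiates the provider itself. The interfaces below are simplified stand-ins and the method names are assumptions; they are not the project's actual signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider and ProviderManager are simplified stand-ins for the real
// interfaces; the method names here are assumptions.
type Provider interface {
	FetchProperties(entityName string) (map[string]string, error)
}

type ProviderManager interface {
	InstantiateFromID(providerID string) (Provider, error)
}

// staticProvider maps entity names to a fake upstream ID.
type staticProvider map[string]string

func (p staticProvider) FetchProperties(entityName string) (map[string]string, error) {
	return map[string]string{"name": entityName, "upstream_id": p[entityName]}, nil
}

// staticManager resolves provider IDs from an in-memory map.
type staticManager map[string]Provider

func (m staticManager) InstantiateFromID(providerID string) (Provider, error) {
	prov, ok := m[providerID]
	if !ok {
		return nil, errors.New("unknown provider " + providerID)
	}
	return prov, nil
}

// entity keeps only the provider ID, not a live provider instance.
type entity struct {
	Name       string
	ProviderID string
	Properties map[string]string
}

type propertiesService struct {
	providers ProviderManager
}

// retrieveAllPropertiesForEntity instantiates the provider on demand and
// updates the entity's properties in place.
func (ps *propertiesService) retrieveAllPropertiesForEntity(e *entity) error {
	prov, err := ps.providers.InstantiateFromID(e.ProviderID)
	if err != nil {
		return fmt.Errorf("error instantiating provider: %w", err)
	}
	props, err := prov.FetchProperties(e.Name)
	if err != nil {
		return fmt.Errorf("error fetching properties: %w", err)
	}
	e.Properties = props
	return nil
}

func main() {
	ps := &propertiesService{providers: staticManager{
		"prov-1": staticProvider{"example-org/example-repo": "12345"},
	}}
	e := &entity{Name: "example-org/example-repo", ProviderID: "prov-1"}
	err := ps.retrieveAllPropertiesForEntity(e)
	fmt.Println(e.Properties, err)
}
```

Keeping the provider out of the model means callers never need a wrapper type just to carry it around, at the cost of one provider instantiation per call unless the manager caches instances.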

jhrozek

Thanks for the work!

The code reads easier now that it doesn't handle the entity type separately. I put some comments inline, they are mostly to make sure that the original design of the PropertyService wasn't bad to start with and that we're not adding hacks to address the shortcomings instead.

The biggest question I have is though - why do we call the Refreshes in this service at all and not just rely on the database properties? Wouldn't we cause a lot of upstream traffic this way?


JAORMX · 2024-09-04T05:49:49Z

> The biggest question I have is though - why do we call the Refreshes in this service at all and not just rely on the database properties? Wouldn't we cause a lot of upstream traffic this way?

I added a comment about that above. I think we can deal with that by adding options to the retrieve calls.


jhrozek · 2024-09-04T08:49:59Z

internal/controlplane/handlers_profile.go

+	if rs.EntityType == db.EntitiesRepository {
+		entityInfo["repository_id"] = efp.Entity.ID.String()
+	} else if rs.EntityType == db.EntitiesArtifact {
+		entityInfo["artifact_id"] = efp.Entity.ID.String()


I think we could just use the upstream ID as a string here to simplify the logic (OK in a follow-up)

JAORMX · Sep 4, 2024

Do we use these for anything?

jhrozek · Sep 4, 2024

Yeah, I don't know. Let's try removing them in a follow-up.

JAORMX force-pushed the results-entity-instances branch from 789b288 to 6465818 September 2, 2024 13:51

JAORMX marked this pull request as draft September 2, 2024 13:56

jhrozek reviewed Sep 2, 2024

internal/controlplane/handlers_evalstatus.go

jhrozek reviewed Sep 2, 2024

internal/entities/properties/service/service.go

jhrozek reviewed Sep 2, 2024

internal/history/service.go (outdated)

jhrozek reviewed Sep 2, 2024

JAORMX force-pushed the results-entity-instances branch 9 times, most recently from 1755b21 to 2df7309 September 3, 2024 11:01

JAORMX changed the title from "Use decouple legacy entity tables from results queries" to "Decouple legacy entity tables from results queries" Sep 3, 2024

JAORMX force-pushed the results-entity-instances branch 2 times, most recently from 5d15418 to a548c03 September 3, 2024 11:39

JAORMX force-pushed the results-entity-instances branch from a548c03 to 8d84ddf September 3, 2024 14:11

JAORMX requested a review from jhrozek September 3, 2024 14:15

JAORMX force-pushed the results-entity-instances branch from 8d84ddf to dd2dabe September 3, 2024 16:29

JAORMX marked this pull request as ready for review September 3, 2024 18:48

jhrozek reviewed Sep 3, 2024

internal/controlplane/handlers_profile.go (outdated)

jhrozek reviewed Sep 3, 2024

Decouple legacy entity tables from results queries

fa3faac

This is another step into moving away from the central entity tables and into the new per-entity tables. Signed-off-by: Juan Antonio Osorio <[email protected]>

JAORMX force-pushed the results-entity-instances branch from dd2dabe to fa3faac September 4, 2024 05:40

JAORMX added 2 commits September 4, 2024 09:09

Remove EntityForProperties model

f8bbee6

This is not needed and redundant. Instead the logic is moved towards the properties service Signed-off-by: Juan Antonio Osorio <[email protected]>

Use pointers for entity with properties, not values

720aac6

Signed-off-by: Juan Antonio Osorio <[email protected]>

JAORMX requested a review from jhrozek September 4, 2024 08:44

jhrozek reviewed Sep 4, 2024

jhrozek approved these changes Sep 4, 2024

JAORMX merged commit 1a5c09e into main Sep 4, 2024
22 checks passed

JAORMX deleted the results-entity-instances branch September 4, 2024 11:18
