Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: record aliases in provider #1475

Closed
wants to merge 1 commit into from
Closed

Conversation

mjeffryes
Copy link
Member

This needs some cleaning still, but outlines a possible approach for recording the aliases in the provider shim so we can fully pack and unpack the provider info.

I've tested this locally using the aws provider (pulumi/pulumi-aws#2902). It appears to be working, and to offer a fairly significant speed up (>150ms in my local testing).

Feedback that would be most helpful:

  1. ideas for how to test
  2. Any suggestions for how to inject the aliasing with a lighter touch

Comment on lines +242 to +243
resources shimUtil.AliasingResourceMap
dataSources shimUtil.AliasingResourceMap
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the most expedient way to make the aliases reachable for aws.

Comment on lines +303 to +307
type AliasingProviderShim struct {
ProviderShim
resources shimUtil.AliasingResourceMap
datasources shimUtil.AliasingResourceMap
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here was to introduce a wrapper for muxer.ProviderShim, just as I did for the base shim.Provider, but I took a short-cut and just modified ProviderShim directly for the POC

Comment on lines 56 to 57
panic("Set not supported - is it possible to treat this as immutable?")
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This experiment has given me pretty good confidence that we're only using Set to add aliases today (Although since it's a public interface, it's possible for random providers to be calling it somewhere. It would be good to remove it in a future major release of the bridge.)

Comment on lines +35 to +43
if p, ok := info.P.(*shimUtil.AliasingProvider); ok {
resources = p.ResourceMap
datasources = p.DataSourceMap
} else if p, ok := info.P.(*pfmuxer.ProviderShim); ok {
resources = p.ResourcesMap().(shimUtil.AliasingResourceMap)
datasources = p.DataSourcesMap().(shimUtil.AliasingResourceMap)
} else {
panic("can't find aliases from this provider")
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ugly, but lets us discover aliases without adding a RangeAliases method to the base shim.Provider

Comment on lines +57 to +70
marshalled := WithProviderAliases{
Info: *tfbridge.MarshalProviderInfo(info),
ResourceAliases: resourceAliases,
DatasourceAliases: datasourceAliases,
}
jsoni := jsoniter.ConfigCompatibleWithStandardLibrary
return jsoni.Marshal(marshalled)
}

type WithProviderAliases struct {
Info tfbridge.MarshallableProviderInfo `json:"info"`
ResourceAliases map[string]string `json:"resource_aliases,omitempty"`
DatasourceAliases map[string]string `json:"datasource_aliases,omitempty"`
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed on pulumi/pulumi-aws#2902, a further improvement would be to just codegen the serialized form so the runtime can just load a bunch of maps directly into memory.

p.P.ResourcesMap().Set(legacyResourceName, p.P.ResourcesMap().Get(resourceName))
p.P.ResourcesMap().AddAlias(legacyResourceName, resourceName)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the change below seem to eliminate the runtime use of Set on the bridge

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we save this observation in a ticket? That'd be a good change to deprecate Set - except for this use case, which can be better named.

@mjeffryes
Copy link
Member Author

Ian pointed out that this is probably a fruitless direction to pursue because there are too many non-serializable hooks in the ResourceInfo that are needed at runtime. We will have to take a more piecemeal approach.

@mjeffryes mjeffryes closed this Oct 26, 2023
@t0yv0
Copy link
Member

t0yv0 commented Oct 26, 2023

You're so close @mjeffryes

I think a lot of this PR would still be needed.

It's an excellent observation that ResourceInfos have non-serializable things like Transformers, I apologize for missing that crucial bit in earlier design discussions. So we cannot turn them around through a serde boundary.

However as you're showing here it's entirely viable to capture side-effects at tfgen time, serialize them, and replay these side-effects at init-time to modify ResourceMap and ResourceInfos.

In case of ResourceMap, you've done most of the work here - it's AddAlias. Now it has to spy on these calls and do a type AddAlias struct { alias string, target string } trick to save aliasing at tfgen and replay at init-time.

Now, we need to also capture side-effects performed against ResourceInfo & friends.

It seems like we could try refactoring aliasing code to trace the effects.

Note the comment:

// Specifically, [ApplyAutoAliases] may perform the following actions:
//
// - Call [ProviderInfo.RenameResourceWithAlias] or [ProviderInfo.RenameDataSource]
// - Edit [ResourceInfo.Aliases]
// - Edit [SchemaInfo.MaxItemsOne]

If we could turn these into a data structure.. Basically ApplyAutoAliases could be doing two things. At tfgen time it does what it currently does, plus spies on the above effects and records them; at init-time it simply replays pre-recorded effects.

@mjeffryes
Copy link
Member Author

I think a lot of this PR would still be needed.
I do think there is some value in this code just by virtue of working toward shrinking the mutable surface of shim.ResourceMap, I would be happy to refine this for that purpose if there's interest.

However, I think that it's not likely to be useful for the init latency problem.

If we could turn these into a data structure.. Basically ApplyAutoAliases could be doing two things. At tfgen time it does what it currently does, plus spies on the above effects and records them; at init-time it simply replays pre-recorded effects.

Yes, Ian and I came to the same conclusion. In fact, the first two steps (the actual aliases are already captured as a list of effects and then run in a second pass in ApplyAutoAliases, so we just need to make these data rather than lambdas and convert the MaxItemsOne transformations to use a similar approach. (In fact Ian pointed out that just switching MaxItemsOne to a two pass approach will save a lot of GC pressure from creating empty maps and discarding them.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants