fix(ec2): remove SSM session state w/ best effort approaches. #5616

Merged
merged 233 commits into from
Oct 25, 2024

Hweinstock
Contributor

@Hweinstock Hweinstock commented Sep 17, 2024

Problem

The remote connection through VS Code does not consistently terminate the SSM session when the remote window is closed.

Notes:

  • We do not get an event when the remote window is closed.

Solution

Leverage two best-effort approaches:

  • Only allow a single connection from the toolkit to any given EC2 instance. If a customer attempts to open another remote window to the same EC2 instance, we can treat that as a signal to terminate the old session.
  • On toolkit shutdown (deactivate), terminate any sessions that are still running.

Implementation Details

  • Implement Ec2RemoteEnvManager to manage the remote environments. It behaves like a map of instance IDs to session IDs and, most importantly, maintains the invariant that any deleted item has its session terminated.
  • Refactor packages/core/src/awsService/ec2/commands.ts and packages/core/src/awsService/ec2/activation.ts to allow for state tracking in EC2ConnectionManager. This change also gives us an opportunity to improve the testing infrastructure for this code.
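
The invariant described above (one tracked session per instance, every removed entry has its session terminated, and everything is cleaned up on shutdown) can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `SessionTrackerSketch`, `InstanceId`, `SessionId`, and the injected `TerminateSession` callback are hypothetical names, and the callback stands in for whatever call the toolkit's SSM client makes to end a session.

```typescript
// Hypothetical aliases; the real code uses EC2.InstanceId and SSM.SessionId.
type InstanceId = string
type SessionId = string
// Hypothetical callback standing in for the SSM client's terminate call.
type TerminateSession = (sessionId: SessionId) => Promise<void>

class SessionTrackerSketch {
    private readonly sessions = new Map<InstanceId, SessionId>()
    private readonly terminate: TerminateSession

    constructor(terminate: TerminateSession) {
        this.terminate = terminate
    }

    /** Tracks a new session. Any session already tracked for the instance is terminated first. */
    public set(instanceId: InstanceId, sessionId: SessionId): void {
        this.delete(instanceId)
        this.sessions.set(instanceId, sessionId)
    }

    /** Removes the entry and terminates its session. Returns false if nothing was tracked. */
    public delete(instanceId: InstanceId): boolean {
        const sessionId = this.sessions.get(instanceId)
        if (sessionId === undefined) {
            return false
        }
        this.sessions.delete(instanceId)
        // Best effort: a failed termination is logged, never thrown to the caller.
        this.terminate(sessionId).catch((err) => console.warn(`failed to terminate ${sessionId}:`, err))
        return true
    }

    public get size(): number {
        return this.sessions.size
    }

    /** Intended to be called on extension shutdown (deactivate). */
    public dispose(): void {
        for (const instanceId of Array.from(this.sessions.keys())) {
            this.delete(instanceId)
        }
    }
}
```

With this shape, opening a second remote window to the same instance amounts to calling `set` again with the new session ID, which terminates the old session as a side effect. Termination is fire-and-forget here, which matches the "best effort" framing but means a failure is only visible in the log.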

License: I confirm that my contribution is made under the terms of the Apache 2.0 license.

Hweinstock and others added 30 commits July 7, 2023 14:03
@Hweinstock Hweinstock marked this pull request as ready for review September 24, 2024 14:09
@Hweinstock Hweinstock requested a review from a team as a code owner September 24, 2024 14:09
@@ -100,3 +101,14 @@ export class Ec2InstanceNode extends AWSTreeNodeBase implements AWSResourceNode
await vscode.commands.executeCommand('aws.refreshAwsExplorerNode', this)
}
}

export async function refreshExplorerNode(node?: Ec2Node) {
Contributor

I think we have a generalized function somewhere like this already.

Contributor Author

@Hweinstock Hweinstock Oct 15, 2024

Yea, this is just a wrapper around that function that makes sure the node exists. Avoids having the same logic in each command where node is optional. I changed the name to tryRefreshNode to make this clearer.
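
A rough sketch of the wrapper idea described in this reply: check that the optional node exists once, in one place, so each command does not repeat the check. The node classes and the injected `refresh` callback below are hypothetical stand-ins, not the toolkit's types; in the PR the refresh goes through the `aws.refreshAwsExplorerNode` command.

```typescript
// Hypothetical node types standing in for the toolkit's Ec2Node / Ec2InstanceNode.
class ParentNodeSketch {
    public readonly label: string
    constructor(label: string) {
        this.label = label
    }
}

class InstanceNodeSketch {
    public readonly parent: ParentNodeSketch
    constructor(parent: ParentNodeSketch) {
        this.parent = parent
    }
}

type NodeSketch = ParentNodeSketch | InstanceNodeSketch
// Hypothetical callback standing in for the explorer refresh command.
type Refresh = (node: ParentNodeSketch) => Promise<void>

/** Refreshes the relevant node if one was given. Returns false when there was nothing to refresh. */
async function tryRefreshNodeSketch(refresh: Refresh, node?: NodeSketch): Promise<boolean> {
    if (node === undefined) {
        return false
    }
    // For an instance node, refresh its parent so sibling instances are updated too.
    const target = node instanceof InstanceNodeSketch ? node.parent : node
    await refresh(target)
    return true
}
```

Commands that receive an optional node can then call the wrapper unconditionally instead of each guarding against `undefined`.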

Comment on lines 72 to 73
const newConnectionManager = new Ec2ConnectionManager(selection.region)
connectionManagers.set(selection.region, newConnectionManager)
Contributor

it's a bit subtle that this is actually adding to a global long-lived collection. This logic should live closer to that collection, possibly in a class or something.

Also need to log a warning if the number of connections exceeds a big number, until we have a better way to have visibility and validation of these long-lived collections (which can cause hidden bugs / perf issues).
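
One way to read this suggestion, as a sketch only: keep the get-or-create logic next to the collection it mutates, and warn once the collection grows past a threshold. `ConnectionManagerRegistry`, `ConnectionManagerStub`, the default threshold of 50, and the injectable `warn` function are all assumptions made for illustration; none of them come from the PR.

```typescript
// Hypothetical stand-in for Ec2ConnectionManager; only the region matters here.
class ConnectionManagerStub {
    public readonly region: string
    constructor(region: string) {
        this.region = region
    }
}

class ConnectionManagerRegistry {
    private readonly managers = new Map<string, ConnectionManagerStub>()
    private readonly warnThreshold: number
    private readonly warn: (message: string) => void

    // The threshold default is arbitrary; the logger is injectable so it can be swapped or tested.
    constructor(warnThreshold: number = 50, warn: (message: string) => void = (m) => console.warn(m)) {
        this.warnThreshold = warnThreshold
        this.warn = warn
    }

    /** Returns the manager for a region, creating and storing it on first use. */
    public getOrCreate(region: string): ConnectionManagerStub {
        const existing = this.managers.get(region)
        if (existing !== undefined) {
            return existing
        }
        const created = new ConnectionManagerStub(region)
        this.managers.set(region, created)
        // Surface unexpected growth of this long-lived collection.
        if (this.managers.size > this.warnThreshold) {
            this.warn(`connection manager registry holds ${this.managers.size} entries (threshold: ${this.warnThreshold})`)
        }
        return created
    }

    public get size(): number {
        return this.managers.size
    }
}
```

Callers would then ask the registry for a manager rather than constructing one and writing to a shared map themselves, which keeps the mutation of the long-lived collection in a single visible place.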

connectionManagers = new Map<string, Ec2ConnectionManager>()
})

it('only creates new connection managers once for each region ', async function () {
Contributor

@justinmk3 justinmk3 Oct 8, 2024

this makes me wonder if there should be 1 manager for everything. having many managers seems like an unnecessary problem, and partly defeats the purpose of a "manager", which is to have one place where visibility/handling can happen.

Contributor Author

@Hweinstock Hweinstock Oct 15, 2024

I agree, I think the word manager is being overused here. Manager also isn't very informative about what each component is responsible for. I am leaning towards renaming them to directly describe what they "do".
Ex. Ec2ConnectionManager --> Ec2Connecter, Ec2RemoteSessionManager --> Ec2SessionTracker. Does that seem like a good direction?

import { Ec2RemoteSessionManager } from '../../../awsService/ec2/remoteSessionManager'
import { SsmClient } from '../../../shared/clients/ssmClient'

describe('Ec2RemoteSessionManager', async function () {
Contributor

is there a session manager for codecatalyst dev envs? if so, it may be worth generalizing it. All of this new test code suggests that it's worth having 1 generalized concept instead of several specific ones.

Contributor Author

@Hweinstock Hweinstock Oct 15, 2024

For CodeCatalyst, we start the connection within the connect script here

exec "$AWS_SSM_CLI" "{\"streamUrl\":\"$STREAM_URL\",\"tokenValue\":\"$TOKEN\",\"sessionId\":\"$SESSION\"}" "$AWS_REGION" "StartSession"
so we don't have any easy way of getting the resulting SessionId in memory (as far as I can tell). Also, I am unsure if the same problem of sessions not terminating still happens in CodeCatalyst due to some of these implementation differences.


export async function tryRefreshNode(node?: Ec2Node) {
if (node) {
const n = node instanceof Ec2InstanceNode ? node.parent : node
Contributor

This could be generalized. I think it possibly was, on the https://github.com/aws/aws-toolkit-vscode-staging/tree/feature/lambda-get-started branch.

No action needed here, for now.

import { SsmClient } from '../../shared/clients/ssmClient'
import { Disposable } from 'vscode'

export class Ec2SessionTracker extends Map<EC2.InstanceId, SSM.SessionId> implements Disposable {
Contributor

I assume we'll generalize this later, e.g. for ECS sessions and possibly others in the future.

@@ -3,7 +3,11 @@
* SPDX-License-Identifier: Apache-2.0
*/

import { SSM } from 'aws-sdk'
Contributor

this is the "v2" sdk. may want to double-check if we already have the v3 SDK for ssm-related stuff (just by searching the codebase). that will avoid needing to migrate this later.

Contributor Author

Unfortunately, @aws-sdk/client-ssm is not a dependency yet, and I think that is the package we want, based on https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-ssm/Interface/StartSessionResponse/

@justinmk3 justinmk3 merged commit 142761e into master Oct 25, 2024
23 of 25 checks passed
@justinmk3 justinmk3 deleted the hkobew/ec2/cleanUpSSM branch October 25, 2024 21:34