Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Network Policy test #395

Merged
merged 2 commits into from
Jan 12, 2024
Merged

Adding Network Policy test #395

merged 2 commits into from
Jan 12, 2024

Conversation

jaydeokar
Copy link
Contributor

@jaydeokar jaydeokar commented Dec 4, 2023

Issue #, if available:
N/A
Description of changes:
go test -v --kubeconfig=$KUBECONFIG
This adds a basic test for aws-vpc-cni Network Policy. The test does the following -

  • Installs the latest vpc-cni using helm chart and enables network policy support on the cluster
  • Creates three servers
    • Verifies all are reachable when no network policy is applied
    • Verifies traffic is blocked when network policy is applied (which proves the bpf program was loaded)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jaydeokar jaydeokar force-pushed the netpol_test branch 2 times, most recently from 44dfb0c to 2aba942 Compare December 4, 2023 22:29
}

return ctx, nil
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add waiter(s) here so that we don't try to run the test cases until your pre-reqs are met?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 111 to 152
for _, resource := range resources {
if _, err := fwext.ResourceExists(resource, releaseNamespace, config.KubeconfigFile()); err == nil {
log.Printf("Patching resource: %s labels and annotations..", resource)
fwext.Label(resource, "app.kubernetes.io/managed-by=Helm", releaseNamespace, config.KubeconfigFile())
fwext.Annotate(resource, fmt.Sprintf("meta.helm.sh/release-name=%s", releaseName), releaseNamespace, config.KubeconfigFile())
fwext.Annotate(resource, fmt.Sprintf("meta.helm.sh/release-namespace=%s", releaseNamespace), releaseNamespace, config.KubeconfigFile())
} else {
log.Printf("Resource %s does not exist, skipping patching labels/annotations..", resource)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the goal of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So vpc-cni is installed by default on a cluster and we need to import the resource into the helm chart (apply annotations/labels). More details here -
https://github.com/aws/eks-charts/tree/master/stable/aws-vpc-cni#adopting-the-existing-aws-node-resources-in-an-eks-cluster

I've updated this to delete the resources instead of patching for labels/annotations

wait.For(conditions.New(config.Client().Resources()).ResourceMatch(cniDS, func(object k8s.Object) bool {
d := object.(*appsv1.DaemonSet)
status := false
if d.Status.DesiredNumberScheduled == d.Status.NumberReady {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we need a e2e-framework release to use this utility. I've added a TODO to update to use this when we have a new version of the framework

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could pull in the latest changes with a go v0.0.0-$COMMIT version, but you could also use the one I've put in the frameworkextensions package: https://github.com/aws/aws-k8s-tester/blob/main/e2e2/internal/framework_extensions/conditions.go#L32

Copy link
Member

@cartermckinnon cartermckinnon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My biggest concern is the use of the VPC CNI helm chart, which could change between executions of this test case. I assume you don't plan to use this test case to qualify the network policy agent itself (because it's using the prod helm chart), so it'd be more stable to at least pin a specific version of the chart.

@@ -17,7 +17,7 @@ COPY kubetest2/ .
RUN go install ./...
WORKDIR $GOPATH/github.com/aws/aws-k8s-tester/e2e2
COPY e2e2/ .
RUN go test -c ./test/cases/nvidia -o $GOPATH/bin/e2e-nvidia
RUN go test -c ./test/cases/nvidia -o $GOPATH/bin/e2e-nvidia && go test -c ./test/cases/netpol -o $GOPATH/bin/e2e-netpol
Copy link
Member

@cartermckinnon cartermckinnon Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this to a separate RUN for readability? We're not too concerned about minimizing the number of layers in this image

(or add a line break)

@@ -51,4 +51,5 @@ ENV PATH=$PATH:/info
RUN cp kubernetes-version.txt /info/
RUN mv kubernetes/*/bin/* /bin/
RUN rm -rf /workdir
RUN curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 && chmod 700 get_helm.sh && VERIFY_CHECKSUM=false ./get_helm.sh && rm -rf ./get_helm.sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line breaks for readability

Suggested change
RUN curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 && chmod 700 get_helm.sh && VERIFY_CHECKSUM=false ./get_helm.sh && rm -rf ./get_helm.sh
RUN curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 && \
chmod 700 get_helm.sh && \
VERIFY_CHECKSUM=false ./get_helm.sh && \
rm -rf ./get_helm.sh

client, err := config.NewClient()

if err != nil {
log.Print("Failed to get client")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of logging before returning errors throughout this function, you could use errors.Wrap if the added context is really valuable

Comment on lines 118 to 129
// Delete Cluster role
client.Resources().Delete(ctx, &rbac.ClusterRole{ObjectMeta: metav1.ObjectMeta{Name: "aws-node", Namespace: releaseNamespace}})

// Delete Cluster-Role Binding
client.Resources().Delete(ctx, &rbac.ClusterRoleBinding{ObjectMeta: metav1.ObjectMeta{Name: "aws-node", Namespace: releaseNamespace}})

// Delete ServiceAccount
client.Resources().Delete(ctx, &v1.ServiceAccount{ObjectMeta: metav1.ObjectMeta{Name: "aws-node", Namespace: releaseNamespace}})

// Delete Daemonset
client.Resources().Delete(ctx, &appsv1.DaemonSet{ObjectMeta: metav1.ObjectMeta{Name: "aws-node", Namespace: releaseNamespace}})

// // Delete Configmap
client.Resources().Delete(ctx, &v1.ConfigMap{ObjectMeta: metav1.ObjectMeta{Name: "amazon-vpc-cni", Namespace: releaseNamespace}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fragile to me, we'll have to keep this in sync with the resources that the managed addon for VPC CNI provides

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. I don't think we have an easier way to import the resources into the helm chart at the moment.
I also considered using eks managed addons to avoid this, but that would mean adding a extra dependency which doesn't seem right here.

I am going to try few more things to see if that helps remove this.
That being said, we have not added new resources to this chart in a while..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a way to not keep this resource mapping by using helm template followed by applying the rendered manifest file. However we would need to use another go library for helm client as the inbuilt helm package (e2e-framework) only validates if the chart can be rendered.

@jaydeokar
Copy link
Contributor Author

My biggest concern is the use of the VPC CNI helm chart, which could change between executions of this test case. I assume you don't plan to use this test case to qualify the network policy agent itself (because it's using the prod helm chart), so it'd be more stable to at least pin a specific version of the chart.

Yes, this tests is not to qualify Network Policy agent. The main goal of this is to verify ebpf programs gets loaded on the node whenever a new AMI is being released (mostly to catch issues when there is a kernel version update).. I explicitly did not pin this to a specific version of helm chart because we need to ensure only the latest versions of network policy agent is compatible

@jaydeokar jaydeokar force-pushed the netpol_test branch 2 times, most recently from b441367 to 869e422 Compare January 4, 2024 19:38
@cartermckinnon cartermckinnon merged commit fc54f0d into aws:main Jan 12, 2024
2 checks passed
@cartermckinnon
Copy link
Member

Thanks for all the work on this @jaydeokar !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants