Integration test for EDOT agents with operator, plus skeleton for collector #16

Merged (1 commit) on Oct 10, 2024
68 changes: 68 additions & 0 deletions .github/workflows/operator-regression.yml
@@ -0,0 +1,68 @@
name: Regression Testing Operator Integration

on:
  workflow_dispatch:

env:
  AGENT_TESTS: python nodejs java
  #go dotnet --- both pending

jobs:
  integration-test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Create Kind cluster and local Docker registry
        run:
          bash test/operator/kind-with-registry.sh

      - name: Create Test Images
        run: |
          for t in ${AGENT_TESTS[@]}
          do
            echo "Creating image for $t"
            docker build -t $t-test-app test/operator/$t
            docker tag $t-test-app localhost:5001/registry/$t-test-app
            docker push localhost:5001/registry/$t-test-app
          done

      - name: Set up Helm
        uses: azure/setup-helm@v4
        with:
          version: v3.11.2

      - name: Install Operator Skeleton
        run: |
          kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.15.3/cert-manager.yaml
Contributor: This is needed only for the APM tests, right?

Contributor: WDYT, should we pin the versions of cert-manager and azure/setup-helm@v4, or use the latest so that these tests also catch issues users might hit?

Contributor Author: This is needed for the operator install, unless the Helm chart does that for you.

Contributor: If we use the onboarding steps, this won't be needed, as cert-manager is disabled.
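For reference, a minimal sketch of the pinned-versus-latest choice raised in this thread. The `CERT_MANAGER_VERSION` variable and the `cert_manager_manifest_url` helper are hypothetical names used for illustration; neither is part of this PR. The sketch assumes GitHub's `releases/latest/download/<asset>` URL form for the unpinned case.

```shell
# Pinned by default; set CERT_MANAGER_VERSION=latest to track upstream instead.
CERT_MANAGER_VERSION="${CERT_MANAGER_VERSION:-v1.15.3}"

# Hypothetical helper: prints the cert-manager manifest URL for a version.
cert_manager_manifest_url() {
  base="https://github.com/cert-manager/cert-manager/releases"
  if [ "$1" = "latest" ]; then
    echo "$base/latest/download/cert-manager.yaml"
  else
    echo "$base/download/$1/cert-manager.yaml"
  fi
}

# Usage in the workflow step (not executed here):
#   kubectl apply -f "$(cert_manager_manifest_url "$CERT_MANAGER_VERSION")"
```

Keeping the version in one variable would let a scheduled run use `latest` while manual runs stay pinned, if that split turns out to be useful.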

          bash test/operator/wait_for_pod_start.sh cert-manager cert-manager- 1/1 3
          helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
          helm repo update
          helm install opentelemetry-operator --namespace opentelemetry-operator-system open-telemetry/opentelemetry-operator --create-namespace --set manager.collectorImage.repository="docker.elastic.co/beats/elastic-agent:8.15.0-SNAPSHOT",manager.extraArgs={"--enable-go-instrumentation=true"}
Contributor: I think you would need the 8.16.0-SNAPSHOT image for this to work. With previous versions you would need to override the container's command in order to launch the OTel collector instead of the elastic-agent binary; see elastic/elastic-agent#5248.

In addition, the ELASTIC_AGENT_OTEL environment variable needs to be defined too. Example: https://github.com/elastic/opentelemetry/pull/11/files#diff-99b053f068c4e75b4029bd9dca1f76667178c4c2174db8313605c0a440e0fc37R22

Contributor Author: It's a skeleton implementation. I expect this to use exactly the instructions we provide to users, rather than this skeleton. Let's do that in a different PR: it requires syncing the installation with the README instructions, so it's not straightforward.

Contributor: nit:

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts --force-update
(and delete line 43)

Contributor Author: If that's the preference, update it in https://github.com/elastic/opentelemetry/blob/main/docs/onboarding/8_16/operator/README.md, because the upcoming version of this will use exactly what is in there.
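A minimal sketch of the nit above, assuming the single `helm repo add ... --force-update` call is meant to replace the separate `helm repo update` line. The `run` wrapper, the `DRY_RUN` variable, and `add_otel_helm_repo` are hypothetical names added only so the command can be inspected without Helm installed.

```shell
# Hypothetical wrapper: print the command when DRY_RUN is set, else run it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Add (or overwrite) the chart repository in one step.
add_otel_helm_repo() {
  run helm repo add open-telemetry \
    https://open-telemetry.github.io/opentelemetry-helm-charts --force-update
}

# Usage:
#   DRY_RUN=1 add_otel_helm_repo   # only prints the helm command
#   add_otel_helm_repo             # actually calls helm
```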

Contributor: The opentelemetry-operator-system namespace should also be created in a prerequisite step, shouldn't it?

Contributor Author: As above, this will use the actual installation procedure we specify in the README, in another PR.

          bash test/operator/wait_for_pod_start.sh opentelemetry-operator-system opentelemetry-operator 2/2 1
          kubectl get pods -A

      - name: Add Namespaces And Instrumentation Skeleton
        run: |
          kubectl create namespace banana
          kubectl create -f test/operator/elastic-instrumentation.yml

      - name: Start And Test Collector Skeleton
        run: |
          echo "Nothing here yet"

      - name: Start Test Images
        run: |
          for t in ${AGENT_TESTS[@]}
          do
            if [ "x$t" = "xgo" ]; then CONTAINER_READY="2/2"; else CONTAINER_READY="1/1"; fi
            AGENT_START_GREP=`grep -A1 AGENT_HAS_STARTED_IF_YOU_SEE test/operator/$t/test-app.yaml | perl -ne '/value:\s*"(.*)"/ && print "$1\n"'`
            echo "Starting pod for $t"
            kubectl create -f test/operator/$t/test-app.yaml
            bash test/operator/wait_for_pod_start.sh banana $t-test-app $CONTAINER_READY 1
            bash test/operator/wait_for_agent_start.sh banana $t-test-app "$AGENT_START_GREP"
            kubectl delete -f test/operator/$t/test-app.yaml
          done
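The "Start Test Images" step reads each agent's startup marker from the `AGENT_HAS_STARTED_IF_YOU_SEE` entry in `test-app.yaml` using `grep -A1` piped into `perl`. As an illustration only, the same lookup can be sketched with `awk` alone. The `agent_start_marker` helper is a hypothetical name, and unlike `grep -A1` it takes the first `value:` line after the marker rather than strictly the next line.

```shell
# Hypothetical helper: print the quoted value that follows the
# AGENT_HAS_STARTED_IF_YOU_SEE entry in the given manifest.
agent_start_marker() {
  awk '
    found && /value:/ {
      sub(/^[^"]*"/, "")   # drop everything up to the opening quote
      sub(/"[^"]*$/, "")   # drop the closing quote and anything after it
      print
      exit
    }
    /AGENT_HAS_STARTED_IF_YOU_SEE/ { found = 1 }
  ' "$1"
}

# Usage inside the loop (not executed here):
#   AGENT_START_GREP=$(agent_start_marker "test/operator/$t/test-app.yaml")
```

This assumes the marker value itself contains no double quotes, which holds for the manifests in this PR.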
17 changes: 17 additions & 0 deletions test/operator/dotnet/Dockerfile
@@ -0,0 +1,17 @@
# https://hub.docker.com/_/microsoft-dotnet
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /source

# copy csproj and restore as distinct layers
COPY *.csproj .
RUN dotnet restore

# copy everything else and build app
COPY . .
RUN dotnet publish -c release -o /app --no-restore

# final stage/image
FROM mcr.microsoft.com/dotnet/aspnet:8.0
WORKDIR /app
COPY --from=build /app ./
ENTRYPOINT ["dotnet", "dotnetapp.dll"]
7 changes: 7 additions & 0 deletions test/operator/dotnet/Program.cs
@@ -0,0 +1,7 @@
var builder = WebApplication.CreateBuilder(args);
_ = builder.Logging.SetMinimumLevel(LogLevel.Trace);
var app = builder.Build();

app.MapGet("/", () => "Hello World!");

app.Run();
9 changes: 9 additions & 0 deletions test/operator/dotnet/dotnetapp.csproj
@@ -0,0 +1,9 @@
<Project Sdk="Microsoft.NET.Sdk.Web">

  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
    <Nullable>enable</Nullable>
    <ImplicitUsings>enable</ImplicitUsings>
  </PropertyGroup>

</Project>
21 changes: 21 additions & 0 deletions test/operator/dotnet/test-app.yaml
@@ -0,0 +1,21 @@
apiVersion: v1
kind: Pod
metadata:
  name: dotnet-test-app
  namespace: banana
  annotations:
    instrumentation.opentelemetry.io/inject-dotnet: "opentelemetry-operator-system/elastic-instrumentation"
  labels:
    app: dotnet-test-app
spec:
  containers:
    - image: localhost:5001/registry/dotnet-test-app
      imagePullPolicy: Always
      name: dotnet-test-app
      env:
        - name: OTEL_LOG_LEVEL
          value: "debug"
        - name: ELASTIC_OTEL_LOG_TARGETS
          value: "stdout"
        - name: AGENT_HAS_STARTED_IF_YOU_SEE
          value: "Elastic Distribution of OpenTelemetry .NET"
25 changes: 25 additions & 0 deletions test/operator/elastic-instrumentation.yml
@@ -0,0 +1,25 @@
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: elastic-instrumentation
  namespace: opentelemetry-operator-system
spec:
  exporter:
    endpoint: http://opentelemetry-kube-stack-daemon-collector:4318
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "1.0"
  java:
    image: docker.elastic.co/observability/elastic-otel-javaagent:1.0.0
  nodejs:
    image: docker.elastic.co/observability/elastic-otel-node:edge
  dotnet:
    image: docker.elastic.co/observability/elastic-otel-dotnet:edge
  python:
    image: docker.elastic.co/observability/elastic-otel-python:edge
  go:
    image: ghcr.io/open-telemetry/opentelemetry-go-instrumentation/autoinstrumentation-go:v0.14.0-alpha
3 changes: 3 additions & 0 deletions test/operator/go/Dockerfile
@@ -0,0 +1,3 @@
#dummy implementation for now
FROM busybox
# busybox ships no bash, and "bash -c sleep 6000" would pass 6000 as $0 instead of as sleep's argument
CMD ["sleep", "6000"]
25 changes: 25 additions & 0 deletions test/operator/go/test-app.yaml
@@ -0,0 +1,25 @@
apiVersion: v1
kind: Pod
metadata:
  name: go-test-app
  namespace: banana
  annotations:
    instrumentation.opentelemetry.io/inject-go: "opentelemetry-operator-system/elastic-instrumentation"
    instrumentation.opentelemetry.io/otel-go-auto-target-exe: "/usr/src/app/productcatalogservice"
    sidecar.opentelemetry.io/inject: "true"
  labels:
    app: go-test-app
spec:
  shareProcessNamespace: true
  containers:
    - image: ghcr.io/open-telemetry/opentelemetry-operator/e2e-test-app-golang:main
      imagePullPolicy: Always
      name: go-test-app
      env:
        - name: OTEL_LOG_LEVEL
          value: "debug"
        - name: AGENT_HAS_STARTED_IF_YOU_SEE
          value: "no idea"
      securityContext:
        runAsUser: 0
        privileged: true
3 changes: 3 additions & 0 deletions test/operator/java/Dockerfile
@@ -0,0 +1,3 @@
FROM eclipse-temurin:17
COPY Hello.java /usr/src/Hello.java
CMD ["java", "/usr/src/Hello.java"]
12 changes: 12 additions & 0 deletions test/operator/java/Hello.java
@@ -0,0 +1,12 @@
public class Hello {
    public static void main(String[] args) throws Exception {
        System.out.println("This is java app in a container");
        for (;;) {
            Thread.sleep(2000L);
            test();
        }
    }
    public static void test() {
        System.out.println("Executing test()");
    }
}
19 changes: 19 additions & 0 deletions test/operator/java/test-app.yaml
@@ -0,0 +1,19 @@
apiVersion: v1
kind: Pod
metadata:
  name: java-test-app
  namespace: banana
  annotations:
    instrumentation.opentelemetry.io/inject-java: "opentelemetry-operator-system/elastic-instrumentation"
  labels:
    app: java-test-app
spec:
  containers:
    - image: localhost:5001/registry/java-test-app
      imagePullPolicy: Always
      name: java-test-app
      env:
        - name: OTEL_JAVAAGENT_DEBUG
          value: "true"
        - name: AGENT_HAS_STARTED_IF_YOU_SEE
          value: "javaagent.tooling.VersionLogger - opentelemetry-javaagent"
42 changes: 42 additions & 0 deletions test/operator/kind-with-registry.sh
@@ -0,0 +1,42 @@
#!/bin/sh
set -o errexit
#https://kind.sigs.k8s.io/docs/user/local-registry/

# create registry container unless it already exists
reg_name='kind-registry'
reg_port='5001'
if [ "$(docker inspect -f '{{.State.Running}}' "${reg_name}" 2>/dev/null || true)" != 'true' ]; then
  docker run \
    -d --restart=always -p "127.0.0.1:${reg_port}:5000" --name "${reg_name}" \
    registry:2
fi

# create a cluster with the local registry enabled in containerd
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:${reg_port}"]
    endpoint = ["http://${reg_name}:5000"]
EOF

# connect the registry to the cluster network if not already connected
if [ "$(docker inspect -f='{{json .NetworkSettings.Networks.kind}}' "${reg_name}")" = 'null' ]; then
  docker network connect "kind" "${reg_name}"
fi

# Document the local registry
# https://github.com/kubernetes/enhancements/tree/master/keps/sig-cluster-lifecycle/generic/1755-communicating-a-local-registry
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: local-registry-hosting
  namespace: kube-public
data:
  localRegistryHosting.v1: |
    host: "localhost:${reg_port}"
    help: "https://kind.sigs.k8s.io/docs/user/local-registry/"
EOF
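
The workflow and the test manifests address images in this registry as `localhost:5001/registry/<lang>-test-app`. A small sketch of how that name is composed from the script's `reg_port`, assuming a hypothetical `local_image_ref` helper that is not part of this PR:

```shell
# Same default as reg_port in kind-with-registry.sh.
reg_port="${reg_port:-5001}"

# Hypothetical helper: full reference for a test image in the local registry.
local_image_ref() {
  echo "localhost:${reg_port}/registry/$1-test-app"
}

# Usage (not executed here):
#   docker tag python-test-app "$(local_image_ref python)"
#   docker push "$(local_image_ref python)"
```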

5 changes: 5 additions & 0 deletions test/operator/nodejs/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
FROM node

ADD ./app.js .

ENTRYPOINT [ "node", "app.js" ]
8 changes: 8 additions & 0 deletions test/operator/nodejs/app.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
var http = require("http");

http.createServer(function (request, response) {
  response.writeHead(200, {'Content-Type': 'text/plain'});
  response.end('Hello from kubernetes\n');
}).listen(8080);

console.log('Server running at http://127.0.0.1:8080/');
19 changes: 19 additions & 0 deletions test/operator/nodejs/test-app.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
apiVersion: v1
kind: Pod
metadata:
  name: nodejs-test-app
  namespace: banana
  annotations:
    instrumentation.opentelemetry.io/inject-nodejs: "opentelemetry-operator-system/elastic-instrumentation"
  labels:
    app: nodejs-test-app
spec:
  containers:
    - image: localhost:5001/registry/nodejs-test-app
      imagePullPolicy: Always
      name: nodejs-test-app
      env:
        - name: OTEL_LOG_LEVEL
          value: "debug"
        - name: AGENT_HAS_STARTED_IF_YOU_SEE
          value: "@opentelemetry/instrumentation-http Applying instrumentation patch for nodejs core module"
9 changes: 9 additions & 0 deletions test/operator/python/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
FROM python:3.9-slim-buster

WORKDIR /app

COPY . /app

RUN pip install --no-cache-dir -r requirements.txt

CMD [ "python3", "-m" , "flask", "run"]
8 changes: 8 additions & 0 deletions test/operator/python/app.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# app.py
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello_world():
    return "Hello, World!"
7 changes: 7 additions & 0 deletions test/operator/python/requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
blinker==1.8.2
click==8.1.7
Flask==3.0.3
itsdangerous==2.2.0
Jinja2==3.1.4
MarkupSafe==2.1.5
Werkzeug==3.0.4
19 changes: 19 additions & 0 deletions test/operator/python/test-app.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
apiVersion: v1
kind: Pod
metadata:
  name: python-test-app
  namespace: banana
  annotations:
    instrumentation.opentelemetry.io/inject-python: "opentelemetry-operator-system/elastic-instrumentation"
  labels:
    app: python-test-app
spec:
  containers:
    - image: localhost:5001/registry/python-test-app
      imagePullPolicy: Always
      name: python-test-app
      env:
        - name: OTEL_LOG_LEVEL
          value: "debug"
        - name: AGENT_HAS_STARTED_IF_YOU_SEE
          value: "Exception while exporting metrics HTTPConnectionPool"
28 changes: 28 additions & 0 deletions test/operator/wait_for_agent_start.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
#!/bin/bash

set -euxo pipefail

MAX_WAIT_SECONDS=60
NAMESPACE=$1
POD_NAME=$2
GREP=$3

echo "Waiting up to $MAX_WAIT_SECONDS seconds for the agent to start in $NAMESPACE/$POD_NAME"
count=0
while [ $count -lt $MAX_WAIT_SECONDS ]
do
  count=$((count + 1))
  # count matching log lines; the marker may be logged more than once
  STARTED=$(kubectl logs $POD_NAME -n $NAMESPACE | (grep "$GREP" || true) | wc -l)
  if [ $STARTED -ge 1 ]
  then
    exit 0
  fi
  sleep 1
done

echo "error: the $NAMESPACE/$POD_NAME pod failed to start an agent within $MAX_WAIT_SECONDS seconds"
echo "-- pod info:"
kubectl logs $POD_NAME -n $NAMESPACE
kubectl describe pod/$POD_NAME -n $NAMESPACE
echo "--"
exit 1
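
The polling logic in this script (run a command, look for a marker in its output, retry once per second up to a limit) can be sketched as a reusable function. This is an illustration under stated assumptions, not the script's actual structure: `wait_for_output` and its argument order are hypothetical. It redirects `grep` output instead of using `grep -q` so the producing command is never cut off mid-write.

```shell
# Hypothetical helper: return 0 as soon as the command's output matches the
# pattern, retrying once per second for at most max_wait attempts.
wait_for_output() {
  pattern="$1"
  max_wait="$2"
  shift 2
  count=0
  while [ "$count" -lt "$max_wait" ]; do
    if "$@" 2>/dev/null | grep -e "$pattern" >/dev/null; then
      return 0
    fi
    count=$((count + 1))
    sleep 1
  done
  return 1
}

# Usage mirroring this script's arguments (not executed here):
#   wait_for_output "$GREP" "$MAX_WAIT_SECONDS" kubectl logs "$POD_NAME" -n "$NAMESPACE"
```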