
Flaky test: [tcp routing] TCP Routing external ports with a second external port [It] maps both ports to the same application #1173

Open
jochenehret opened this issue Jul 11, 2024 · 2 comments

Comments

@jochenehret
Contributor

The TCP Routing test that checks whether one app can be reached on two ports often fails at this assertion:

Expect(err).ToNot(HaveOccurred())

Example failures:
https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/82
https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/57
https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/113
https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/cf-deployment/jobs/fips-cats/builds/120

I've recreated the test setup manually on fips/snape. The setup works as expected: you can send data over two different TCP ports to the test app, and the app responds correctly. However, running the test as part of the CATs suite often fails.

I've added some debug statements with timestamps. Here's the flow from a failed run:

# sending first test message to first port
# https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/cats_suite_helpers/cats_suite_helpers.go#L406
starting SendAndReceive(tcp.cf.snape.env.wg-ard.ci.cloudfoundry.org, 1031) at Jul 11 14:49:45.862

# output from test app: https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/assets/tcp-listener/main.go#L53
# "10.0.32.11" is one of the two tcp-routers
2024-07-11T12:49:45.97+0000 [APP/PROC/WEB/0] OUT Message to 10.0.32.11:41084: server1:Time is 938260798
2024-07-11T12:49:45.99+0000 [APP/PROC/WEB/0] OUT Jul 11 14:49:45.991 (read) Closing connection to 10.0.32.11:41084: EOF

# sending second test message to other port
starting SendAndReceive(tcp.cf.snape.env.wg-ard.ci.cloudfoundry.org, 1026) at Jul 11 14:49:45.955

# now we are failing here when reading the response:
# https://github.com/cloudfoundry/cf-acceptance-tests/blob/6f060209f7a55f0c4f8d0fffabb122c785ce914e/cats_suite_helpers/cats_suite_helpers.go#L437
Jul 11 14:54:46.575 error3: EOF
buff is:

When the second message is sent, the conn.Write(message) statement returns no error.

However, the test app doesn't seem to receive the message: there is no "Message to" log statement. What happens next is an error at the conn.Read(buff) statement: the error is "EOF" and the buffer is empty.

This looks like a race condition. Perhaps the Read function is called before the test app starts to write, and fails immediately with EOF?

@jochenehret
Contributor Author

The PR was merged 5 days ago. No failures so far. Let's observe for a few more days before we close this issue.
