Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduced 'Pending' state for backend's health #91

Closed
wants to merge 1 commit into from

Conversation

andythsu
Copy link
Member

This commit adds a 'Pending' state to keep track of backend's health.

Resolves: #80

Copy link

cla-bot bot commented Nov 13, 2023

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Andy Su (Apps).
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

This commit adds a 'Pending' state to keep track of backend's health.

Resolves: trinodb#80
@@ -109,21 +110,11 @@ public String findBackendForQueryId(String queryId) {
return backendAddress;
}

public void upateBackEndHealth(String backendId, Boolean value) {
public void upateBackEndHealth(String backendId, BackendHealthState value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: because you're modifying the method signature we might as well fix the typo and consistency of BackEnd vs Backend: upateBackEndHealth -> updateBackendHealth

(feel free to resolve with or without the fix, i'm poking my head in here and might not see the response :D)

@@ -86,7 +114,7 @@ public void testCustomPath() throws Exception {
.addHeader("X-Trino-Routing-Group", "custom")
.build();
Response response2 = httpClient.newCall(request2).execute();
assertEquals(response2.code(), 404);
assertEquals(404, response2.code());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: potentially use jakarta.ws.rs.core.Response.Status.NOT_FOUND.getStatusCode() or HttpURLConnection.HTTP_NOT_FOUND (<-- preference) here instead of the originally magic number?

(note: feel free to resolve with or without making the modifications)

}

//Keep only 1st backend as healthy, mark all the others as unhealthy
assert (!backendManager.getAllActiveBackends().isEmpty());

for (int i = 1; i < numBackends; i++) {
backend = groupName + i;
haRoutingManager.upateBackEndHealth(backend, false);
haRoutingManager.upateBackEndHealth(backend, BackendHealthState.UNHEALTHY);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: perhaps here we want to hit all enum states in the test?

for ex: 50% pending and 50% unhealthy which both represent "not yet healthy"
-> i % 2 == 0 ? BackendHealthState.UNHEALTHY : BackendHealthState.PENDING

(note: adding this note to all blocks as i might not be able to get back to them; feel free to resolve with or without change)


@TestInstance(Lifecycle.PER_CLASS)
@ExtendWith(DropwizardExtensionsSupport.class)
@Slf4j
public class TestGatewayHaSingleBackend {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: is the goal of this test case to be a sub-system test?

my thinking is: if we just wanted to test that the gateway chooses backends properly based on health as a single unit you could replace:

    SECONDS.sleep(WAIT_FOR_BACKEND_IN_SECONDS);

with:

    HaGatewayLauncher haGatewayLauncher = new HaGatewayLauncher("io.trino");
    haGatewayLauncher.run(args);

    // Now populate the backend
    setUpBackend(
            "trino1", "http://localhost:" + backendPort, "externalUrl", true, "adhoc", routerPort);

    // When a backend becomes active for the first time OR is transitioned from inactive to active it will not yet be
    // treated as healthy. Without a real health check happening we will extract the RoutingManager and set them to healthy manually
    Field privateInjectorField = BaseApp.class.
            getDeclaredField("injector");
    privateInjectorField.setAccessible(true);

    Injector injector = (Injector) privateInjectorField.get(haGatewayLauncher);
    injector.getInstance(RoutingManager.class).updateBackendHealth("trino1", BackendHealth.HEALTHY);

This hack to grab the class with reflection is just to keep the test as close to the module as possible (and do without a sleep pause). Curious on your thoughts for it.

If the goal is to cover as much repository as possible in the test as sub-system instead of integration then fair enough 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great idea! included this in my new PR

@andythsu andythsu closed this Jan 17, 2024
@andythsu andythsu deleted the pending_state branch September 18, 2024 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Feature request: Introduce different states to prevent clusters from being turned inactive programmatically
2 participants