
Create addon boot failed issue for repair #5397

Merged

merged 2 commits into main from add-repair-addon-boot-fail on Nov 7, 2024
Conversation

@mdegat01 (Contributor) commented Nov 6, 2024

Proposed change

Create an issue and suggestions when an add-on fails to start during Supervisor boot, so that the repair added in home-assistant/core#129847 pops up for the user.
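
The shape of the change is easiest to see from the resolution call the manager makes when an add-on fails during Supervisor boot. A minimal sketch, mirroring the add_issue call shown in the review below; the helper name report_boot_failure is illustrative and not part of the PR:

from supervisor.resolution.const import SuggestionType


def report_boot_failure(manager, addon) -> None:
    """Sketch: raise the BOOT_FAIL issue that the Home Assistant repair reacts to.

    `manager` is the AddonManager and `addon` the Addon that failed to start
    during Supervisor boot; the add_issue call mirrors the one added in this PR.
    """
    manager.sys_resolution.add_issue(
        addon.boot_failed_issue,
        suggestions=[
            SuggestionType.EXECUTE_START,  # offer "try starting the add-on now"
            SuggestionType.DISABLE_BOOT,   # offer "stop starting it automatically"
        ],
    )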

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature (which adds functionality to the supervisor)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:
  • Link to cli pull request:
  • Link to client library pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Ruff (ruff format supervisor tests)
  • Tests have been added to verify that the new code works.

If API endpoints or add-on configuration are added/changed:

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced enhanced error handling for add-on boot processes, including new issue types for boot failures.
    • Added functionality to disable automatic boot for add-ons and manage their startup processes more effectively.
  • Bug Fixes

    • Improved resolution system to correctly log and manage issues related to add-on states during boot attempts.
  • Tests

    • Expanded test coverage for boot failure scenarios and add-on management, ensuring robust functionality and error handling.
    • Added new tests for the FixupAddonDisableBoot and FixupAddonExecuteStart functionalities.

@mdegat01 added the new-feature ("A new feature") label on Nov 6, 2024
@mdegat01 requested a review from agners on November 6, 2024 at 21:38

coderabbitai bot commented Nov 6, 2024

📝 Walkthrough

The changes in this pull request involve enhancements to the error handling and boot management of add-ons within the supervisor system. Key modifications include the addition of properties and methods to manage boot failure issues, updates to the Addon and AddonManager classes, and the introduction of new fixup classes for disabling and executing add-on starts. Additionally, new constants related to issue types and suggestions have been added, along with corresponding test cases to validate the new functionalities and ensure robust error reporting.

Changes

  • supervisor/addons/addon.py: Added a boot_failed_issue property to the Addon class. Updated the state and boot setters to dismiss boot_failed_issue under specific conditions.
  • supervisor/addons/manager.py: Removed imports for specific exceptions. Updated the boot method to handle a broader range of errors and log warnings for add-ons that fail to start.
  • supervisor/resolution/const.py: Added new members: BOOT_FAIL to IssueType, and DISABLE_BOOT and EXECUTE_START to SuggestionType (see the sketch just after this list).
  • supervisor/resolution/fixups/addon_disable_boot.py: Introduced the FixupAddonDisableBoot class with methods to manage boot configuration, plus properties for the suggestions and issues it handles.
  • supervisor/resolution/fixups/addon_execute_start.py: Introduced the FixupAddonExecuteStart class with methods for starting add-ons and handling errors during the start process.
  • tests/addons/test_addon.py: Added tests for boot-failure scenarios, including test_addon_start_dismisses_boot_fail and test_addon_disable_boot_dismisses_boot_fail.
  • tests/addons/test_manager.py: Defined the BOOT_FAIL_ISSUE and BOOT_FAIL_SUGGESTIONS constants. Updated tests to assert these new structures during boot-error scenarios.
  • tests/resolution/fixup/test_addon_disable_boot.py: Added tests for FixupAddonDisableBoot to verify behavior when disabling add-on boot.
  • tests/resolution/fixup/test_addon_execute_start.py: Introduced tests for FixupAddonExecuteStart to validate behavior during add-on start and error handling.
  • tests/resolution/fixup/test_addon_execute_repair.py: Updated the module description to "Test fixup addon execute repair" and ensured the tests validate repair logic under various conditions.
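
For orientation, the constants added in supervisor/resolution/const.py amount to three new enum members. A sketch, assuming the string-enum style and lower-snake-case values used by the existing members (surrounding members elided):

from enum import StrEnum


class IssueType(StrEnum):
    """Issue type (only the member added in this PR is shown)."""

    BOOT_FAIL = "boot_fail"


class SuggestionType(StrEnum):
    """Suggestion type (only the members added in this PR are shown)."""

    DISABLE_BOOT = "disable_boot"
    EXECUTE_START = "execute_start"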

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant AddonManager
    participant Addon
    participant ResolutionSystem

    User->>AddonManager: Start Addon
    AddonManager->>Addon: Attempt to Start
    alt Success
        Addon->>AddonManager: Started
        AddonManager->>ResolutionSystem: Dismiss boot failure issue
    else Failure
        Addon->>AddonManager: Failed to Start
        AddonManager->>ResolutionSystem: Log warning, create issue
    end

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 335a78c and dc98dbf.

📒 Files selected for processing (2)
  • supervisor/addons/addon.py (4 hunks)
  • supervisor/addons/manager.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • supervisor/addons/manager.py
🔇 Additional comments (6)
supervisor/addons/addon.py (6)

84-85: LGTM!

The new imports are necessary for the boot failure issue functionality and follow the existing import style.


148-150: LGTM! Good implementation of issue caching.

The implementation correctly caches the boot failure issue object as suggested in the previous review, avoiding unnecessary object creation on each property access.
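
A minimal sketch of the caching pattern being praised here, assuming the Issue class and import paths of the Supervisor resolution package; the trimmed constructor is illustrative:

from supervisor.resolution.const import ContextType, IssueType
from supervisor.resolution.data import Issue


class Addon:
    """Trimmed to the parts relevant to the cached boot-failure issue."""

    def __init__(self, slug: str) -> None:
        self.slug = slug
        # Build the Issue once so repeated property access (and membership
        # checks against sys_resolution.issues) reuse the same object.
        self._boot_failed_issue = Issue(
            IssueType.BOOT_FAIL, ContextType.ADDON, reference=self.slug
        )

    @property
    def boot_failed_issue(self) -> Issue:
        """Get issue used if the add-on failed to start at the last boot."""
        return self._boot_failed_issue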


156-160: LGTM!

The property is well-documented and provides clean access to the boot failure issue.


178-184: LGTM! Good error state cleanup.

The code correctly dismisses the boot failure issue when the add-on successfully starts, ensuring proper cleanup of error states.


341-347: LGTM! Good handling of manual boot configuration.

The code correctly dismisses the boot failure issue when the add-on is configured for manual boot, preventing false positives for add-ons that are intentionally not started on boot.


148-150: Verify issue creation in AddonManager.

The code properly initializes and manages the boot failure issue, but we should verify that it's correctly added to sys_resolution when a boot failure occurs.

✅ Verification successful

Let me gather more context about how the boot failure issue is handled.


Let me check one more thing to see the complete context of where these issues are added.


Boot failure issue is properly integrated and handled

The implementation is complete. The boot_failed_issue is correctly added to sys_resolution in two scenarios:

  • When an addon fails to start with a HassioError
  • When an addon ends up in ERROR or UNKNOWN state after boot

The issue is properly created in the addon class and utilized in the manager for error handling, with appropriate suggestions for resolution (EXECUTE_START, DISABLE_BOOT).
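
Roughly, the two paths in AddonManager.boot() look like the sketch below. HassioError, AddonState, and SuggestionType are Supervisor names; the free-standing function, its parameters, and the import locations marked as assumed are illustrative, and the real method appears in full in the refactor suggestion further down this review:

from supervisor.const import AddonState  # location assumed
from supervisor.exceptions import HassioError
from supervisor.resolution.const import SuggestionType

BOOT_FAIL_SUGGESTIONS = [SuggestionType.EXECUTE_START, SuggestionType.DISABLE_BOOT]


async def boot_addons(manager, addons) -> None:
    """Sketch of where AddonManager.boot() raises the boot-failure issue."""
    wait_boot = []
    for addon in addons:
        try:
            if start_task := await addon.start():
                wait_boot.append(start_task)
        except HassioError:
            # Scenario 1: starting the add-on raised during Supervisor boot.
            manager.sys_resolution.add_issue(
                addon.boot_failed_issue, suggestions=BOOT_FAIL_SUGGESTIONS
            )

    # ... wait for the started add-ons to come up (elided) ...

    for addon in addons:
        if addon.state in {AddonState.ERROR, AddonState.UNKNOWN}:
            # Scenario 2: the add-on came up but ended in an error state.
            manager.sys_resolution.add_issue(
                addon.boot_failed_issue, suggestions=BOOT_FAIL_SUGGESTIONS
            )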

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify boot failure issue is added to sys_resolution in AddonManager

# Search for usage of boot_failed_issue in AddonManager
rg -A 5 "boot_failed_issue" "supervisor/addons/manager.py"

Length of output: 529


Script:

#!/bin/bash
# Search for the method that handles boot failures and issue creation
ast-grep --pattern 'async def start($$$) {
  $$$
  boot_failed_issue
  $$$
}'

# Also search for any direct references to sys_resolution
rg -A 5 "sys_resolution" "supervisor/addons/manager.py"

Length of output: 927


Script:

#!/bin/bash
# Get more context around where boot_failed_issue is used
rg -B 10 -A 10 "boot_failed_issue" "supervisor/addons/manager.py"

# Also check for any error handling in start method
ast-grep --pattern 'async def start($$$) {
  try {
    $$$
  } catch ($$$) {
    $$$
  }
}'

Length of output: 1984



coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (21)
tests/resolution/fixup/test_addon_disable_boot.py (3)

9-9: Consider moving BOOT_FAIL_ISSUE to a dedicated constants file.

The test constant BOOT_FAIL_ISSUE is currently imported from test_manager. Consider moving it to a dedicated test constants file for better organization and reusability across test files.


12-27: Enhance test coverage and documentation.

The test case could be improved in the following ways:

  1. Add assertion to verify initial boot state
  2. Add test case for when addon is already in MANUAL boot state
  3. Enhance docstring to describe the test setup and expected outcomes

Consider updating the test like this:

 async def test_fixup(coresys: CoreSys, install_addon_ssh: Addon):
-    """Test fixup disables boot."""
+    """Test fixup disables addon boot.
+    
+    Test that:
+    1. Addon boot state changes from AUTO to MANUAL
+    2. Resolution issues and suggestions are cleared
+    """
     install_addon_ssh.boot = AddonBoot.AUTO
+    assert install_addon_ssh.boot == AddonBoot.AUTO  # Verify initial state
     addon_disable_boot = FixupAddonDisableBoot(coresys)
     assert addon_disable_boot.auto is False

Also, consider adding a new test case:

async def test_fixup_already_manual(coresys: CoreSys, install_addon_ssh: Addon):
    """Test fixup when addon is already in manual boot state."""
    install_addon_ssh.boot = AddonBoot.MANUAL
    addon_disable_boot = FixupAddonDisableBoot(coresys)

    coresys.resolution.add_issue(
        BOOT_FAIL_ISSUE,
        suggestions=[SuggestionType.DISABLE_BOOT],
    )
    await addon_disable_boot()

    assert install_addon_ssh.boot == AddonBoot.MANUAL
    assert not coresys.resolution.issues
    assert not coresys.resolution.suggestions

29-40: Enhance test documentation and assertions.

The test case for missing addon could be improved by:

  1. Enhancing the docstring to better describe the test scenario
  2. Adding assertions to verify that no unintended state changes occurred

Consider updating the test like this:

 async def test_fixup_no_addon(coresys: CoreSys):
-    """Test fixup dismisses if addon is missing."""
+    """Test fixup behavior when addon is missing.
+    
+    Test that:
+    1. Fixup runs without errors when addon is missing
+    2. Resolution issues and suggestions are cleared
+    3. No unintended state changes occur
+    """
     addon_disable_boot = FixupAddonDisableBoot(coresys)
+    initial_state = coresys.resolution.state.copy()  # Store initial state

     coresys.resolution.add_issue(
         BOOT_FAIL_ISSUE,
         suggestions=[SuggestionType.DISABLE_BOOT],
     )
     await addon_disable_boot()

     assert not coresys.resolution.issues
     assert not coresys.resolution.suggestions
+    # Verify no other state changes occurred
+    current_state = coresys.resolution.state
+    del initial_state["issues"]
+    del initial_state["suggestions"]
+    del current_state["issues"]
+    del current_state["suggestions"]
+    assert current_state == initial_state
supervisor/resolution/fixups/addon_disable_boot.py (4)

1-2: Enhance module docstring for better documentation.

The current docstring is brief and could be more descriptive. Consider expanding it to include:

  • Purpose of the fixup
  • When it's triggered
  • Expected outcomes
-"""Helpers to fix addon by disabling boot."""
+"""Fixup handler for addon boot failures.
+
+This module provides functionality to handle addon boot failures by disabling
+automatic boot. It is triggered when an addon fails to start and provides
+a manual intervention option for users to prevent continuous boot attempts.
+"""

13-16: Improve setup function docstring.

The current docstring doesn't provide enough information about the function's purpose and return value.

-    """Check setup function."""
+    """Set up the addon boot disable fixup.
+
+    Args:
+        coresys: The core system instance.
+
+    Returns:
+        FixupBase: An instance of FixupAddonDisableBoot.
+    """

18-20: Add type hints for addon reference in class docstring.

The class docstring should document the expected type for addon references.

 class FixupAddonDisableBoot(FixupBase):
-    """Storage class for fixup."""
+    """Fixup handler for disabling addon automatic boot.
+
+    This fixup is triggered when an addon fails to boot and provides
+    functionality to switch it to manual boot mode.
+
+    Attributes:
+        reference (str): The addon slug (e.g., 'core_ssh')
+    """

46-48: Fix typo in auto property docstring.

There's a grammatical error in the docstring.

     @property
     def auto(self) -> bool:
-        """Return if a fixup can be apply as auto fix."""
+        """Return if this fixup can be applied as an auto fix."""
supervisor/resolution/fixups/addon_execute_start.py (1)

14-17: Enhance the setup function's docstring.

The current docstring "Check setup function" doesn't clearly describe the function's purpose of creating and returning a FixupAddonExecuteStart instance.

Consider updating to:

-    """Check setup function."""
+    """Create and return a FixupAddonExecuteStart instance.
+
+    Args:
+        coresys: The core system instance.
+
+    Returns:
+        FixupBase: An instance of FixupAddonExecuteStart.
+    """
tests/resolution/fixup/test_addon_execute_repair.py (4)

Line range hint 19-39: Consider enhancing test assertions for more thorough coverage.

While the test covers the basic flow well, consider adding assertions to verify:

  1. The state of the addon before and after repair
  2. The exact number of times the installation was attempted
  3. The specific parameters passed to the installation call

Example enhancement:

 with patch.object(DockerInterface, "install") as install:
     await addon_execute_repair()
     install.assert_called_once()
+    install.assert_called_with(version=install_addon_ssh.version)
+    assert install_addon_ssh.state == "started"

Line range hint 41-64: Enhance failure tracking in max attempts test.

The test effectively verifies the auto-repair limit but could benefit from tracking the progression of failures more explicitly.

Consider adding:

 with patch.object(DockerInterface, "install", side_effect=DockerError):
+    attempt = 0
     for _ in range(5):
+        attempt += 1
         assert addon_execute_repair.auto is True
         with pytest.raises(DockerError):
             await addon_execute_repair()
+        # Verify the attempt count is tracked
+        assert addon_execute_repair.failed_attempts == attempt
+        # Verify error is logged
+        assert f"Addon repair failed - attempt {attempt}/5" in caplog.text

Line range hint 66-82: Add resolution state verification for missing addon case.

The test verifies that no installation is attempted, but should also check the final state of the resolution system.

Consider adding:

 with patch.object(DockerAddon, "install") as install:
     await addon_execute_repair()
     install.assert_not_called()
+    # Verify the issue is still present as it can't be fixed
+    assert any(
+        issue.type == IssueType.MISSING_IMAGE
+        for issue in coresys.resolution.issues
+    )
+    # Verify the suggestion remains
+    assert any(
+        suggestion.type == SuggestionType.EXECUTE_REPAIR
+        for suggestion in coresys.resolution.suggestions
+    )

Line range hint 84-103: Enhance mock verification for existing image case.

The test should verify the interaction with the Docker mock more thoroughly.

Consider adding:

 docker.images.get.return_value = MagicMock()
+# Set up mock image details
+mock_image = docker.images.get.return_value
+mock_image.attrs = {"Id": "sha256:123456", "Created": "2024-01-01T00:00:00Z"}
+
 addon_execute_repair = FixupAddonExecuteRepair(coresys)
 assert addon_execute_repair.auto is True

 coresys.resolution.create_issue(
     IssueType.MISSING_IMAGE,
     ContextType.ADDON,
     reference="local_ssh",
     suggestions=[SuggestionType.EXECUTE_REPAIR],
 )

 with patch.object(DockerAddon, "install") as install:
     await addon_execute_repair()
     install.assert_not_called()
+    # Verify Docker interaction
+    docker.images.get.assert_called_once_with(install_addon_ssh.image)
+    # Verify issue is resolved
+    assert not coresys.resolution.issues
+    assert not coresys.resolution.suggestions
tests/resolution/fixup/test_addon_execute_start.py (5)

16-16: Consider relocating test constants.

The BOOT_FAIL_ISSUE constant is imported from test_manager.py, but it appears to be a shared test constant. Consider moving shared test constants to a dedicated test constants file (e.g., tests/const.py) for better maintainability and reusability.


27-50: Enhance state transition assertions.

While the test verifies the final state and cleanup, it could be more explicit about the state transitions. Consider adding assertions to verify:

  1. The initial state change to UNKNOWN
  2. The intermediate state transitions during startup
  3. The final expected state

Example improvement:

 async def test_fixup(coresys: CoreSys, install_addon_ssh: Addon, state: AddonState):
     """Test fixup starts addon."""
     install_addon_ssh.state = AddonState.UNKNOWN
+    assert install_addon_ssh.state == AddonState.UNKNOWN  # Verify initial state
     addon_execute_start = FixupAddonExecuteStart(coresys)
     assert addon_execute_start.auto is False

     async def mock_start(*args, **kwargs):
         install_addon_ssh.state = state
+        assert install_addon_ssh.state == state  # Verify state transition

53-71: Enhance Docker error testing specificity.

The test currently uses a generic DockerError. Consider testing specific Docker error scenarios that could occur during addon startup, such as image pull failures, resource constraints, or network issues.

Example improvement:

-    with patch.object(DockerAddon, "run", side_effect=DockerError) as run:
+    with patch.object(
+        DockerAddon,
+        "run",
+        side_effect=DockerError("Failed to pull image: network timeout")
+    ) as run:

101-117: Add edge cases for missing addon scenarios.

Consider adding test cases for related edge cases:

  1. Addon exists but is not installed
  2. Addon exists but is partially installed/corrupted
  3. Multiple addons with similar names

1-2: Enhance test module documentation.

Consider expanding the module docstring to include:

  1. Overview of the test scenarios covered
  2. Prerequisites for running the tests
  3. Any special setup or teardown requirements
  4. Examples of how these tests relate to real-world scenarios
supervisor/addons/manager.py (1)

133-145: Consider extracting duplicate issue creation logic.

The issue creation code is duplicated between the exception handler and the post-boot state check. Consider extracting this into a helper method for better maintainability.

Here's a suggested refactor:

+    def _create_boot_failed_issue(self, addon: Addon) -> None:
+        """Create a boot failed issue for an addon."""
+        self.sys_resolution.add_issue(
+            addon.boot_failed_issue,
+            suggestions=[
+                SuggestionType.EXECUTE_START,
+                SuggestionType.DISABLE_BOOT,
+            ],
+        )

     async def boot(self, stage: AddonStartup) -> None:
         # ... existing code ...
         try:
             if start_task := await addon.start():
                 wait_boot.append(start_task)
         except HassioError:
-            self.sys_resolution.add_issue(
-                addon.boot_failed_issue,
-                suggestions=[
-                    SuggestionType.EXECUTE_START,
-                    SuggestionType.DISABLE_BOOT,
-                ],
-            )
+            self._create_boot_failed_issue(addon)
         else:
             continue

         # ... existing code ...

         for addon in tasks:
             if addon.state in {AddonState.ERROR, AddonState.UNKNOWN}:
-                self.sys_resolution.add_issue(
-                    addon.boot_failed_issue,
-                    suggestions=[
-                        SuggestionType.EXECUTE_START,
-                        SuggestionType.DISABLE_BOOT,
-                    ],
-                )
+                self._create_boot_failed_issue(addon)
tests/addons/test_manager.py (1)

53-53: LGTM! Improved fixture type annotations.

The fixture signatures have been properly updated to use AsyncGenerator/Generator types, which better reflects their actual behavior and improves type safety.

Consider adding type annotations for the yield value in the docstrings to make it even clearer what these fixtures provide, e.g.:

"""Mock supported arch and disk space.

Yields:
    None: This fixture only provides side effects.
"""

Also applies to: 63-63, 69-71

supervisor/addons/addon.py (2)

338-344: LGTM: Appropriate cleanup when changing boot mode.

The code correctly dismisses the boot failure issue when the boot mode is changed to manual, as the addon is no longer expected to start automatically.

Consider adding a code comment explaining why the boot failure issue is dismissed when changing to manual boot mode, as this relationship might not be immediately obvious to future maintainers.

 @boot.setter
 def boot(self, value: AddonBoot) -> None:
     """Store user boot options."""
     self.persist[ATTR_BOOT] = value

+    # Dismiss boot failure issue when manual boot is selected as the addon
+    # is no longer expected to start automatically
     if (
         value == AddonBoot.MANUAL
         and self.boot_failed_issue in self.sys_resolution.issues
     ):
         self.sys_resolution.dismiss_issue(self.boot_failed_issue)

175-181: Consider adding error handling for issue dismissal.

The calls to sys_resolution.dismiss_issue could potentially raise exceptions. Consider wrapping these calls in try-except blocks to ensure robustness.

     # Dismiss boot failed issue if present and we started
     if (
         new_state == AddonState.STARTED
         and self.boot_failed_issue in self.sys_resolution.issues
     ):
-        self.sys_resolution.dismiss_issue(self.boot_failed_issue)
+        try:
+            self.sys_resolution.dismiss_issue(self.boot_failed_issue)
+        except Exception as err:
+            _LOGGER.warning("Failed to dismiss boot failure issue: %s", err)

     # Dismiss boot failed issue if present and boot at start disabled
     if (
         value == AddonBoot.MANUAL
         and self.boot_failed_issue in self.sys_resolution.issues
     ):
-        self.sys_resolution.dismiss_issue(self.boot_failed_issue)
+        try:
+            self.sys_resolution.dismiss_issue(self.boot_failed_issue)
+        except Exception as err:
+            _LOGGER.warning("Failed to dismiss boot failure issue: %s", err)

Also applies to: 338-344

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between b29bc23 and 335a78c.

📒 Files selected for processing (10)
  • supervisor/addons/addon.py (4 hunks)
  • supervisor/addons/manager.py (2 hunks)
  • supervisor/resolution/const.py (2 hunks)
  • supervisor/resolution/fixups/addon_disable_boot.py (1 hunks)
  • supervisor/resolution/fixups/addon_execute_start.py (1 hunks)
  • tests/addons/test_addon.py (3 hunks)
  • tests/addons/test_manager.py (7 hunks)
  • tests/resolution/fixup/test_addon_disable_boot.py (1 hunks)
  • tests/resolution/fixup/test_addon_execute_repair.py (1 hunks)
  • tests/resolution/fixup/test_addon_execute_start.py (1 hunks)
🔇 Additional comments (17)
tests/resolution/fixup/test_addon_disable_boot.py (1)

1-40: Consider adding test cases for edge scenarios.

The current test coverage could be enhanced by adding test cases for:

  1. Multiple concurrent issues and suggestions
  2. Invalid suggestion types
  3. Concurrent fixup operations

Let's verify if these scenarios are tested elsewhere:

supervisor/resolution/fixups/addon_disable_boot.py (1)

41-44: Verify integration with issue reporting system.

Let's ensure the BOOT_FAIL issue type is properly integrated with the issue reporting system.

✅ Verification successful

Integration with issue reporting system verified successfully

The BOOT_FAIL issue type is properly integrated:

  • Defined as a constant in supervisor/resolution/const.py
  • Used correctly in addon management (supervisor/addons/addon.py)
  • Handled by multiple fixup strategies (addon_disable_boot.py and addon_execute_start.py)
  • Well covered by tests including issue creation, suggestions, and resolution flows
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the BOOT_FAIL issue type integration

# Test 1: Check if BOOT_FAIL is properly defined in const.py
echo "Checking BOOT_FAIL definition..."
rg "BOOT_FAIL.*=.*" supervisor/resolution/const.py

# Test 2: Check for other handlers of BOOT_FAIL issues
echo "Checking BOOT_FAIL usage..."
rg "BOOT_FAIL" --type py

# Test 3: Check for related tests
echo "Checking test coverage..."
rg "BOOT_FAIL" tests/ --type py

Length of output: 4575

supervisor/resolution/fixups/addon_execute_start.py (3)

1-11: LGTM! Well-structured imports and setup.

The file has a clear purpose, proper imports, and follows logging best practices.


41-59: LGTM! Well-defined properties.

The properties are well-typed, concise, and correctly implement the fixup behavior. The auto=False setting is particularly important as addon starts should not be automated without user intervention.


1-59: Verify integration with the resolution system.

Let's verify that this fixup is properly integrated with the resolution system and that the BOOT_FAIL issue type is correctly defined.

✅ Verification successful

Integration with resolution system verified successfully

The code is properly integrated with the supervisor's resolution system:

  • The BOOT_FAIL issue type is correctly defined in the resolution constants
  • Comprehensive test coverage exists in test_addon_execute_start.py covering success and failure scenarios
  • The addon state monitoring is properly implemented in supervisor/addons/addon.py with appropriate state transitions
  • The fixup correctly handles addon boot failures by:
    • Attempting to start the addon
    • Monitoring state transitions
    • Raising appropriate errors for failed starts
    • Auto-fix is disabled by default as required
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if BOOT_FAIL is defined in resolution constants
echo "Checking for BOOT_FAIL definition:"
rg "BOOT_FAIL\s*=" supervisor/resolution/const.py

# Verify fixup registration
echo -e "\nChecking fixup registration:"
rg "addon_execute_start" supervisor/resolution/module.py

# Check for related tests
echo -e "\nChecking for related test coverage:"
fd "test_addon_execute_start.py" tests/

Length of output: 466


Script:

#!/bin/bash
# Check test coverage content
echo "Checking test content:"
cat tests/resolution/fixup/test_addon_execute_start.py

# Check resolution module registration
echo -e "\nChecking resolution module setup:"
cat supervisor/resolution/module.py

# Check if BOOT_FAIL is used in resolution checks
echo -e "\nChecking BOOT_FAIL usage in checks:"
rg "IssueType.BOOT_FAIL" supervisor/resolution/checks/

Length of output: 13857


Script:

#!/bin/bash
# Check for boot failure check implementation
echo "Checking for boot failure check implementation:"
rg "boot_fail" supervisor/resolution/checks/ -A 10

# Check fixup registration in ResolutionFixup
echo -e "\nChecking fixup registration:"
cat supervisor/resolution/fixup/__init__.py

# Look for addon boot state monitoring
echo -e "\nChecking addon boot state monitoring:"
rg "state.*=.*AddonState" supervisor/addons/ -A 5

Length of output: 4239

tests/resolution/fixup/test_addon_execute_repair.py (1)

Line range hint 1-16: LGTM! Well-organized imports and clear docstring.

The imports are properly organized and include all necessary dependencies for testing Docker interactions, mocking, and core functionality.

supervisor/resolution/const.py (2)

74-74: LGTM: New issue type for boot failures

The addition of BOOT_FAIL to IssueType aligns with the PR's objective of handling add-on boot failures and follows the established naming conventions.


107-107: LGTM: New suggestion types for boot failure resolution

The additions of DISABLE_BOOT and EXECUTE_START to SuggestionType provide appropriate resolution actions for boot failures:

  • DISABLE_BOOT: Allows disabling problematic add-on boot behavior
  • EXECUTE_START: Provides manual start capability as a recovery option

Both follow the established naming patterns and are alphabetically ordered correctly.

Also applies to: 115-115

tests/resolution/fixup/test_addon_execute_start.py (1)

75-99: Consider testing startup timing scenarios.

The test verifies state transitions but doesn't test timing-related scenarios. Consider adding test cases for:

  1. Slow startup scenarios
  2. Timeout conditions
  3. State oscillation during startup

Let's check if there are timeout-related constants or configurations that should be tested:

✅ Verification successful

Let me check the actual timeout implementation to provide a more accurate verification.


Let me check the actual timeout constant to provide complete verification.


Review comment is valid but needs refinement

The test should indeed cover timing scenarios as there is a 120-second startup timeout implemented in supervisor/addons/addon.py. Specific suggestions for test improvements:

  1. Test timeout scenario using asyncio.TimeoutError in the mock
  2. Test state transitions during the startup period
  3. Verify behavior when startup takes longer but succeeds within timeout

However, the current test structure using _wait_for_startup mock is appropriate for unit testing, as testing actual timeouts would make the test suite slow and potentially flaky. The suggestion should be refined to focus on mocking timing-related edge cases rather than actual delays.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for timeout-related configurations in addon startup
rg -l "wait.*startup|timeout.*start" --type py

Length of output: 210


Script:

#!/bin/bash
# Check timeout implementation in addon startup
rg "wait.*startup|timeout.*start" -B 2 -A 2 --type py

Length of output: 4324


Script:

#!/bin/bash
# Find the STARTUP_TIMEOUT constant
rg "STARTUP_TIMEOUT.*=" --type py -B 2 -A 2

Length of output: 288

supervisor/addons/manager.py (1)

118-124: LGTM! Verify boot_failed_issue property implementation.

The error handling and suggestions for failed add-on boots are well-implemented. The suggestions provide users with appropriate options to either retry the start or disable automatic boot.

Let's verify the implementation of the boot_failed_issue property:

✅ Verification successful

Property implementation verified and correctly used

The boot_failed_issue property is properly implemented in supervisor/addons/addon.py and is being used appropriately. The implementation:

  • Returns an Issue instance with IssueType.BOOT_FAIL and the addon's slug as reference
  • Is automatically dismissed when the addon starts successfully or when boot at start is disabled
  • Has corresponding fixup handlers in place for both EXECUTE_START and DISABLE_BOOT suggestions
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the boot_failed_issue property implementation
# Expected: Find the property definition and its usage

# Search for the boot_failed_issue property definition
ast-grep --pattern 'class Addon {
  $$$
  @property
  def boot_failed_issue($_) {
    $$$
  }
  $$$
}'

# Search for any tests covering this property
rg -l "boot_failed_issue.*test" tests/

Length of output: 145


Script:

#!/bin/bash
# Let's try a broader search for boot_failed_issue
rg "boot_failed_issue" -A 3 -B 3

# Also search for any Issue related to boot failure in the codebase
rg "IssueType.*BOOT" -A 2 -B 2

# Search for the Addon class definition to understand its structure
ast-grep --pattern 'class Addon:
  $$$'

Length of output: 6354

tests/addons/test_manager.py (2)

4-4: LGTM! Well-structured constants for boot failure scenarios.

The new imports and constants are properly organized, providing clear test data for boot failure scenarios. The constants follow the pattern of defining both issues and their corresponding suggestions, making the test cases more maintainable.

Also applies to: 29-30, 39-49


134-135: LGTM! Comprehensive test coverage for boot failure scenarios.

The test cases properly verify that appropriate issues and suggestions are added to the resolution system for different types of boot failures.

Let's verify that all boot failure scenarios in the codebase are covered by these tests:

Also applies to: 143-144, 156-157, 173-174

✅ Verification successful

Boot failure scenarios are well covered by tests

Based on the search results:

  • Boot failure handling code is present in multiple files including supervisor/resolution/fixups/addon_execute_start.py and supervisor/resolution/fixups/addon_disable_boot.py
  • Test coverage exists in corresponding test files like tests/resolution/fixup/test_addon_execute_start.py and tests/addons/test_addon.py
  • The assertions in test_manager.py properly verify both issues and suggestions using BOOT_FAIL_ISSUE and BOOT_FAIL_SUGGESTIONS constants
  • All boot failure scenarios have matching test coverage with appropriate resolution system checks
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any boot failure handling code that might not be covered by tests
# Expected: All boot failure handling code should have corresponding test coverage

# Search for boot failure related code
echo "Searching for boot failure handling code..."
rg -l "AddonBoot|boot_failed|BOOT_FAIL" --type python

# Search for related test coverage
echo -e "\nSearching for corresponding test coverage..."
rg -l "test.*boot.*fail|test.*fail.*boot" --type python tests/

Length of output: 384


Script:

#!/bin/bash
# Let's try without the --type flag since it's not supported
# First, let's find all Python files that handle boot failures
echo "Searching for boot failure handling code..."
rg -l "AddonBoot|boot_failed|BOOT_FAIL" "*.py"

# Then search for test coverage
echo -e "\nSearching for corresponding test coverage..."
rg -l "test.*boot.*fail|test.*fail.*boot" "tests/*.py"

# Let's also check for resolution-related assertions in test files
echo -e "\nSearching for resolution assertions in tests..."
rg "assert.*resolution\.(issues|suggestions)" "tests/*.py"

Length of output: 626


Script:

#!/bin/bash
# Let's first find all Python files recursively
echo "Finding Python files with boot failure handling..."
fd -e py -x rg -l "AddonBoot|boot_failed|BOOT_FAIL" {}

echo -e "\nFinding test files with boot failure tests..."
fd -e py -p "test.*\.py" -x rg -l "test.*boot.*fail|test.*fail.*boot" {}

echo -e "\nChecking resolution assertions in test files..."
fd -e py -p "test.*\.py" -x rg "assert.*resolution\.(issues|suggestions)" {}

Length of output: 14674

tests/addons/test_addon.py (2)

902-914: LGTM! Well-structured test for boot failure dismissal.

The test effectively verifies that a successful addon start clears both the boot failure issue and its associated suggestions. Good test isolation and clear assertions.


916-928: LGTM! Clear test coverage for boot mode transition.

The test properly verifies that changing the boot mode to manual dismisses the boot failure issue and its suggestions. Good setup of test conditions and state transitions.
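
For reference, such a test plausibly follows the fixture and resolution patterns visible elsewhere in this PR. A sketch; import locations are assumed, and the body is illustrative rather than a copy of the merged test:

from supervisor.addons.addon import Addon
from supervisor.const import AddonBoot  # location assumed
from supervisor.coresys import CoreSys
from supervisor.resolution.const import SuggestionType

from tests.addons.test_manager import BOOT_FAIL_ISSUE


async def test_addon_disable_boot_dismisses_boot_fail(
    coresys: CoreSys, install_addon_ssh: Addon
):
    """Illustrative: switching boot to MANUAL clears the BOOT_FAIL issue."""
    coresys.resolution.add_issue(
        BOOT_FAIL_ISSUE,
        suggestions=[SuggestionType.EXECUTE_START, SuggestionType.DISABLE_BOOT],
    )

    install_addon_ssh.boot = AddonBoot.MANUAL

    assert not coresys.resolution.issues
    assert not coresys.resolution.suggestions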

supervisor/addons/addon.py (3)

84-85: LGTM: Required imports for issue handling.

The new imports are correctly placed and necessary for the boot failure issue functionality.


153-156: LGTM: Well-structured boot failure issue property.

The property correctly creates an issue with appropriate type and context, using the addon's slug as reference.


175-181: LGTM: Proper cleanup of boot failure issue.

The code correctly dismisses the boot failure issue when the addon successfully starts.

✅ Verification successful

Based on the test file analysis, I can now generate the final response since I have sufficient information to verify the review comment. The test file includes specific test cases that verify the boot failure issue dismissal behavior, confirming the implementation is correct.

LGTM: Implementation verified through test coverage

The code correctly dismisses the boot failure issue when the addon starts successfully, as verified by the test case test_addon_start_dismisses_boot_fail. The test confirms that:

  • The issue is properly dismissed when state changes to STARTED
  • Both issues and suggestions are cleared from resolution system
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the resolution system's issue dismissal is called correctly
# Test: Search for other places where boot failure issues are dismissed
rg "dismiss.*boot.*fail.*issue" --type py

Length of output: 419


Script:

#!/bin/bash
# Let's examine the test file to understand the test cases and verify the implementation
cat tests/addons/test_addon.py

Length of output: 31331

        # Dismiss boot failed issue if present and we started
        if (
            new_state == AddonState.STARTED
            and self.boot_failed_issue in self.sys_resolution.issues
        ):
            self.sys_resolution.dismiss_issue(self.boot_failed_issue)
Member commented:

Calling boot_failed_issue leads to a new object being created; is there maybe a better way to check for this issue type?

@mdegat01 (Contributor, Author) replied on Nov 7, 2024:

I followed the same pattern we used in Mounts. But yeah, I guess I could cache the object for reuse? The only weird part is that the UUID would always be identical each time the property was accessed, but that's not really a big deal; we don't use it for anything other than referencing it in the API anyway.

Member replied:

Yeah it does feel weird to me to create an object just to check if it's in there. But since we use that pattern already, let's go with that for now.

Maybe worth a follow-up that does this a bit more efficiently.

@mdegat01 (Contributor, Author) replied:

OK, fixed it in both places. The issue is created on initialization and reused. When it is actually added to the resolution center we clone it first. I don't think the UUID sharing is an issue, but just in case; that's the only time we kind of want a new object.
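
A rough illustration of what "clone it first" can look like; the exact copy mechanism in the merged commit may differ, and the field names here are taken from how issues are inspected elsewhere in this review:

from supervisor.resolution.const import SuggestionType
from supervisor.resolution.data import Issue


def add_boot_failed_issue(manager, addon) -> None:
    """Illustrative clone-before-add for the cached boot_failed_issue."""
    cached = addon.boot_failed_issue
    # Build a fresh Issue with the same type/context/reference so the
    # resolution center stores its own object rather than the cached one.
    manager.sys_resolution.add_issue(
        Issue(cached.type, cached.context, reference=cached.reference),
        suggestions=[SuggestionType.EXECUTE_START, SuggestionType.DISABLE_BOOT],
    )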

@mdegat01 merged commit e1e5d3a into main on Nov 7, 2024
20 checks passed
@mdegat01 deleted the add-repair-addon-boot-fail branch on November 7, 2024 at 18:39
The github-actions bot locked and limited the conversation to collaborators on Nov 9, 2024