refactor: downloaders rework #903
Conversation
Walkthrough

This pull request introduces significant changes across multiple files, primarily focusing on enhancing the functionality and error handling of various components, including the downloaders.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant EventManager
    participant JobMonitor
    participant Downloader
    User->>EventManager: submit_job(service, program)
    EventManager->>JobMonitor: start monitoring job execution
    JobMonitor->>EventManager: check execution time
    alt exceeds 3 minutes
        EventManager->>User: log warning and cancel job
    else within time
        EventManager->>Downloader: process download
    end
```
```python
COMPRESSING = "compressing"

class TBTorrent(BaseModel):
    """Real-Debrid torrent model"""
```
😁
Uuuups 😆
Still working on it but it's coming along pretty well.. ran into a snag though: looks like the get torrent info endpoint isn't right for Torbox. @davidemarcoli I'll work on it more when I wake up.. been at it for hours already lol
LGTM!
Actionable comments posted: 22
🧹 Outside diff range and nitpick comments (33)
src/program/types.py (1)
Line range hint `1-43`: Document type changes and add integration tests.

Since this change affects the core types:
- Add docstring documentation for the `Downloader` type hint explaining the available implementations and their use cases
- Add integration tests to verify the behavior with different downloader implementations
Would you like me to help create:
- A documentation template for the type hints?
- Integration test examples for the downloader implementations?
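For illustration, one way the documented type hint could look (a sketch; the union members mirror the imports shown later in `types.py`, but the docstring wording is suggested, not from the PR):

```python
from typing import Union

from program.services.downloaders import RealDebridDownloader, TorBoxDownloader

Downloader = Union[RealDebridDownloader, TorBoxDownloader]
"""Union of the currently enabled debrid downloader services.

- RealDebridDownloader: downloads via the Real-Debrid API
- TorBoxDownloader: downloads via the TorBox API (AllDebridDownloader is commented out for now)
"""
```

Attribute docstrings like this are picked up by documentation tools such as Sphinx, which keeps the explanation next to the type it describes.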
src/program/settings/manager.py (2)
`72-73`: Consider logging at debug level before raising

Since the error is being raised, it will likely be caught and logged by an upper layer. Consider logging at debug level here to avoid duplicate error messages in logs while maintaining the detailed information for debugging.

```diff
- formatted_error = format_validation_error(e)
- logger.error(f"Settings validation failed:\n{formatted_error}")
+ formatted_error = format_validation_error(e)
+ logger.debug(f"Settings validation details:\n{formatted_error}")
  raise
```
`89-96`: Enhance type safety and documentation

The function could benefit from improved type hints and documentation.

```diff
-def format_validation_error(e: ValidationError) -> str:
-    """Format validation errors in a user-friendly way"""
+def format_validation_error(e: ValidationError) -> str:
+    """Format validation errors in a user-friendly way.
+
+    Args:
+        e: The ValidationError instance containing the errors
+
+    Returns:
+        A formatted string where each error is prefixed with a bullet point (•)
+        and shows the field location using dot notation, followed by the error message.
+        Example:
+            • field.nested: value is not a valid integer
+            • other_field: field required
+    """
     messages = []
     for error in e.errors():
-        field = ".".join(str(x) for x in error["loc"])
+        field = ".".join(str(x) for x in error["loc"]) if error.get("loc") else "unknown"
         message = error.get("msg")
         messages.append(f"• {field}: {message}")
     return "\n".join(messages)
```

src/program/services/scrapers/mediafusion.py (1)
`148-152`: Consider adding input validation for stream descriptions.

The code assumes that the stream description will always contain valid content that can be split. Consider adding validation to handle cases where the description might be empty or malformed.

```diff
-            description_split = stream.description.replace("📂 ", "")
-            raw_title = description_split.split("\n")[0]
+            description_split = stream.description.replace("📂 ", "") if stream.description else ""
+            raw_title = description_split.split("\n")[0] if description_split else ""
+            if not raw_title:
+                logger.debug(f"Empty or invalid description for stream: {stream.url}")
+                continue
```

src/program/services/scrapers/shared.py (2)
`21-25`: Consider moving settings initialization into a function.

Module-level initialization of settings can cause issues with testing and reloading. Consider encapsulating these settings in a function or class method.

```diff
-bucket_limit = settings_manager.settings.scraping.bucket_limit or 5
-enable_aliases = settings_manager.settings.scraping.enable_aliases
-ranking_settings = settings_manager.settings.ranking
-ranking_model = models.get(ranking_settings.profile)
-rtn = RTN(ranking_settings, ranking_model)
+def initialize_settings():
+    bucket_limit = getattr(settings_manager.settings.scraping, 'bucket_limit', 5)
+    enable_aliases = settings_manager.settings.scraping.enable_aliases
+    ranking_settings = settings_manager.settings.ranking
+    ranking_model = models.get(ranking_settings.profile)
+    return bucket_limit, enable_aliases, RTN(ranking_settings, ranking_model)
+
+bucket_limit, enable_aliases, rtn = initialize_settings()
```
Tests needed for scraper functionality and bucket limit feature
Based on the repository analysis:
- No existing tests for the scraper module or bucket_limit functionality were found
- Project has an established testing framework using pytest
- Testing guidelines are documented in CONTRIBUTING.md
Required test coverage for `src/program/services/scrapers/shared.py`:
- Test `_parse_results` function with bucket_limit scenarios
- Test `ScraperRequestHandler` class
- Test `_get_stremio_identifier` function

Please follow the existing test patterns in `src/tests/` and use pytest fixtures. Add the tests in a new file `src/tests/test_scraper.py` (a starting skeleton follows below).
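A minimal skeleton for that file might look like this (sketch only; the import path is assumed, and the test body is left to be filled in following the patterns in `src/tests/`):

```python
import pytest

from program.services.scrapers.shared import _parse_results  # assumed import path


@pytest.mark.skip(reason="skeleton: flesh out using the existing fixtures in src/tests/")
def test_parse_results_respects_bucket_limit():
    # Arrange: a MediaItem plus more candidate torrents than bucket_limit allows.
    # Act: run _parse_results over them.
    # Assert: at most `bucket_limit` streams survive per resolution bucket.
    ...
```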
🔗 Analysis chain
Line range hint `1-141`: Add tests and documentation for the changes.

The PR checklist indicates that tests and documentation are incomplete. Please ensure:
- Add unit tests for the new bucket_limit functionality
- Document the bucket_limit feature in the project documentation
- Consider adding integration tests with the TorBox API given the reported issues
Let's check for existing tests:
Would you like help with:
- Generating unit tests for the bucket_limit functionality?
- Creating documentation for the new feature?
- Setting up integration tests with the TorBox API?
Let's also check for any documentation about scrapers and bucket limits:
Let's also verify if there are any existing tests that might be using the bucket_limit functionality:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```bash
#!/bin/bash
# Look for test files related to scrapers
fd -e py test_.*scraper
```

Length of output: 27
Script:

```bash
#!/bin/bash
# Let's try a broader search for test files and scraper-related tests
fd test -t f | grep -i scraper

# Also search for any Python test files
fd -e py "test_"

# Let's also check if there are any tests in a tests directory
find . -type f -path "*/tests/*" -name "*.py"
```

Length of output: 837
Script:

```bash
#!/bin/bash
# Search for documentation files
fd -e md -e rst | grep -i "doc\|docs"

# Search for mentions of bucket_limit in documentation
rg "bucket_limit" -t md -t rst
```

Length of output: 70
Script:

```bash
#!/bin/bash
# Search for bucket_limit usage in test files
rg "bucket_limit" -g "test_*.py"
```

Length of output: 32
Script:

```bash
#!/bin/bash
# Let's check test_ranking.py since bucket_limit is used in sort_torrents
cat src/tests/test_ranking.py

# Let's also check if there are any existing tests for the shared.py module
rg -l "shared\.py" src/tests/

# Check if there's any documentation in the root directory or common doc locations
fd README -t f
fd CONTRIBUTING -t f
```

Length of output: 905
Script:

```bash
#!/bin/bash
# Let's check the content of README and CONTRIBUTING files for any testing guidelines
cat README.md
cat CONTRIBUTING.md

# Let's also check if there are any test utilities or fixtures that might be relevant
rg -l "pytest" src/tests/
```

Length of output: 13623
src/program/settings/models.py (1)
`60-61`: Consider standardizing validation patterns across configuration models.

The codebase shows various validation approaches:
- Some fields use `Field` with constraints (like the new `bucket_limit`)
- Some use `field_validator` (like `update_interval`)
- Some have no validation (like the new proxy settings)

Consider standardizing the validation approach across similar types of configuration fields for better maintainability (see the sketch below).

Also applies to: 286-286
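For illustration, one consistent convention could be: range and format constraints live on `Field`, while cross-field rules go in `field_validator` (a sketch with a stand-in model, not the PR's code):

```python
from pydantic import BaseModel, Field, ValidationInfo, field_validator


class ExampleSettings(BaseModel):  # stand-in model for illustration
    bucket_limit: int = Field(default=5, ge=0, le=20)  # simple bounds live on the field
    proxy_enabled: bool = False
    proxy_url: str = ""

    @field_validator("proxy_url")
    @classmethod
    def require_url_when_enabled(cls, value: str, info: ValidationInfo) -> str:
        # Cross-field rules go in validators; info.data holds already-validated fields.
        if info.data.get("proxy_enabled") and not value:
            raise ValueError("proxy_url is required when proxy_enabled is true")
        return value
```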
src/routers/secure/scrape.py (3)
`249-251`: Improve error message specificity

The error message could be more helpful by indicating what makes the magnet URI invalid.

```diff
 info_hash = get_info_hash(magnet)
 if not info_hash:
-    raise HTTPException(status_code=400, detail="Invalid magnet URI")
+    raise HTTPException(
+        status_code=400,
+        detail="Invalid magnet URI: Must start with 'magnet:?' and contain a valid 40-character info hash"
+    )
```
`243-251`: Consider adding security measures

The current implementation could benefit from additional security measures:
- Rate limiting on the API endpoints to prevent DoS attacks
- Validation of torrent size limits to prevent resource exhaustion
- Content validation or scanning capabilities

Consider implementing:
- FastAPI's built-in rate limiting using something like:

```python
from fastapi import Depends
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)

@router.post("/scrape/start_session", dependencies=[Depends(limiter.limit("5/minute"))])
```

- Torrent size validation before processing:

```python
MAX_TORRENT_SIZE = 100 * 1024 * 1024 * 1024  # 100GB example limit
if torrent_info["size"] > MAX_TORRENT_SIZE:
    raise HTTPException(status_code=400, detail="Torrent size exceeds maximum allowed limit")
```
`243-251`: Address pending PR requirements

According to the PR description, the following items are still pending:
- Tests for the changed code
- Documentation updates

Please ensure:
- Add unit tests for the new `get_info_hash` function (a sketch follows below)
- Update API documentation with the new error scenarios

Would you like me to help with generating:
- Unit tests for the `get_info_hash` function?
- OpenAPI documentation updates for the error scenarios?
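As a starting point, unit tests along these lines would cover both the happy path and the failure path (the import path is assumed):

```python
from routers.secure.scrape import get_info_hash  # assumed import path


def test_get_info_hash_extracts_hash():
    infohash = "a" * 40
    magnet = f"magnet:?xt=urn:btih:{infohash}&dn=example"
    assert get_info_hash(magnet) == infohash


def test_get_info_hash_returns_none_without_hash():
    assert get_info_hash("not a magnet uri") is None
```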
src/program/utils/request.py (1)
Line range hint `39-77`: Enhance error handling structure and reusability

The error handling implementation could be improved in several ways:
- Move status code lists to class-level constants for reusability
- Enhance error messages with more context
- Consider splitting error handling into a separate method
Consider refactoring like this:
```diff
 class ResponseObject:
+    # Status code constants
+    TIMEOUT_STATUSES = [408, 460, 504, 520, 524, 522, 598, 599]
+    RATE_LIMIT_STATUSES = [429]
+    CLIENT_ERROR_STATUSES = list(range(400, 451))  # 400-450
+    SERVER_ERROR_STATUSES = list(range(500, 512))  # 500-511
+
+    def _handle_error_status(self, response: Response) -> None:
+        """Handle error status codes with appropriate exceptions."""
+        if self.status_code in self.TIMEOUT_STATUSES:
+            raise ConnectTimeout(
+                f"Connection timed out with status {self.status_code}. URL: {response.url}",
+                response=response
+            )
+        if self.status_code in self.RATE_LIMIT_STATUSES:
+            raise RateLimitExceeded(
+                f"Rate Limit Exceeded {self.status_code}. URL: {response.url}",
+                response=response
+            )
+        if self.status_code in self.CLIENT_ERROR_STATUSES:
+            raise RequestException(
+                f"Client error with status {self.status_code}. URL: {response.url}",
+                response=response
+            )
+        if self.status_code in self.SERVER_ERROR_STATUSES:
+            raise RequestException(
+                f"Server error with status {self.status_code}. URL: {response.url}",
+                response=response
+            )
+        if not self.is_ok:
+            raise RequestException(
+                f"Request failed with status {self.status_code}. URL: {response.url}",
+                response=response
+            )

     def handle_response(self, response: Response, response_type: ResponseType) -> dict | SimpleNamespace:
-        timeout_statuses = [408, 460, 504, 520, 524, 522, 598, 599]
-        rate_limit_statuses = [429]
-        client_error_statuses = list(range(400, 451))  # 400-450
-        server_error_statuses = list(range(500, 512))  # 500-511
-
-        if self.status_code in timeout_statuses:
-            raise ConnectTimeout(f"Connection timed out with status {self.status_code}", response=response)
-        if self.status_code in rate_limit_statuses:
-            raise RateLimitExceeded(f"Rate Limit Exceeded {self.status_code}", response=response)
-        if self.status_code in client_error_statuses:
-            raise RequestException(f"Client error with status {self.status_code}", response=response)
-        if self.status_code in server_error_statuses:
-            raise RequestException(f"Server error with status {self.status_code}", response=response)
-        if not self.is_ok:
-            raise RequestException(f"Request failed with status {self.status_code}", response=response)
+        self._handle_error_status(response)
```

src/program/media/item.py (4)
`413-413`: Consider using constants and type validation for media types.

While using string literals simplifies the code, it makes it more prone to typos and harder to refactor. Consider:
- Defining constants for media types
- Adding type validation in the constructor
```diff
+# At the top of the file
+MEDIA_TYPE_MOVIE = "movie"
+MEDIA_TYPE_SHOW = "show"
+MEDIA_TYPE_SEASON = "season"
+MEDIA_TYPE_EPISODE = "episode"
+
+VALID_MEDIA_TYPES = {MEDIA_TYPE_MOVIE, MEDIA_TYPE_SHOW, MEDIA_TYPE_SEASON, MEDIA_TYPE_EPISODE}
+
+def validate_media_type(type_value: str) -> None:
+    if type_value not in VALID_MEDIA_TYPES:
+        raise ValueError(f"Invalid media type: {type_value}")

 def __init__(self, item):
-    self.type = "movie"
+    self.type = MEDIA_TYPE_MOVIE
+    validate_media_type(self.type)
```
Line range hint `550-554`: Enhance type safety in Season constructor.

While the parent type check is good, we can improve type safety by:
- Using type constants
- Adding type hints
- Making the parent type check more explicit
```diff
-def __init__(self, item):
+def __init__(self, item: dict):
     super().__init__(item)
-    self.type = "season"
+    self.type = MEDIA_TYPE_SEASON
+    validate_media_type(self.type)
     self.number = item.get("number", None)
     self.episodes: list[Episode] = item.get("episodes", [])
-    if self.parent and isinstance(self.parent, Show):
+    if not self.parent:
+        logger.warning(f"Season {self.number} has no parent show")
+    elif not isinstance(self.parent, Show):
+        raise TypeError(f"Season {self.number} parent must be a Show, not {type(self.parent)}")
+    else:
         self.is_anime = self.parent.is_anime
```
Line range hint `649-654`: Maintain consistent type handling across classes.

For consistency with other media types and improved type safety:
- Use type constants
- Add type hints
- Make parent type checking consistent
```diff
-def __init__(self, item):
+def __init__(self, item: dict):
     super().__init__(item)
-    self.type = "episode"
+    self.type = MEDIA_TYPE_EPISODE
+    validate_media_type(self.type)
     self.number = item.get("number", None)
     self.file = item.get("file", None)
-    if self.parent and isinstance(self.parent, Season):
+    if not self.parent:
+        logger.warning(f"Episode {self.number} has no parent season")
+    elif not isinstance(self.parent, Season):
+        raise TypeError(f"Episode {self.number} parent must be a Season, not {type(self.parent)}")
+    else:
         self.is_anime = self.parent.parent.is_anime
```
Line range hint `413-654`: Consider a more robust type system for media types.

While removing enums simplifies the code, the current approach using string literals across multiple classes could lead to maintenance issues. Consider implementing a more robust type system:
- Use a TypeVar or Protocol for media types
- Implement a MediaType class with validation
- Add runtime type checking decorators
This would provide better type safety while maintaining code simplicity.
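As one lightweight sketch of that direction (not the PR's code), a `Literal` alias lets type checkers catch typos while a small runtime check guards values that arrive as plain strings:

```python
from typing import Literal, get_args

MediaType = Literal["movie", "show", "season", "episode"]


def validate_media_type(value: str) -> MediaType:
    """Runtime guard for values that arrive as plain strings."""
    if value not in get_args(MediaType):
        raise ValueError(f"Invalid media type: {value}")
    return value  # type: ignore[return-value]
```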
src/program/services/downloaders/shared.py (1)
`3-3`: Remove unused import `List`

The import `List` from the `typing` module is not used in this file. Removing it will clean up the code. Apply this diff to remove the unused import:

```diff
-from typing import List, Optional
+from typing import Optional
```

🧰 Tools

🪛 Ruff (0.8.0)

3-3: `typing.List` imported but unused. Remove unused import: `typing.List` (F401)
src/program/services/downloaders/models.py (6)
`3-3`: Remove unused import `logger`

The `logger` imported from the `loguru` module is not used in the code. Removing unused imports helps keep the code clean and enhances readability. Apply this diff to remove the unused import:

```diff
-from loguru import logger
```

🧰 Tools

🪛 Ruff (0.8.0)

3-3: `loguru.logger` imported but unused. Remove unused import: `loguru.logger` (F401)
`51-51`: Use 'not in' for membership test

Per Python style guidelines, when testing if an item is not in a sequence, use `not in` instead of `not ... in`. The expression `not "sample" in filename.lower()` should be rewritten as `"sample" not in filename.lower()` for better readability. Apply this diff to correct the syntax:

```diff
-if not any(filename.endswith(ext) for ext in VIDEO_EXTENSIONS) or not "sample" in filename.lower():
+if not any(filename.endswith(ext) for ext in VIDEO_EXTENSIONS) or "sample" not in filename.lower():
     return None
```

This adjustment ensures adherence to best practices in Python code.

🧰 Tools

🪛 Ruff (0.8.0)

51-51: Test for membership should be `not in`. Convert to `not in` (E713)
`55-57`: Simplify nested `if` statements

The nested `if` statements can be combined into a single `if` statement for better readability. Apply this diff to simplify the code:

```diff
-if filetype == "movie":
-    if not (FILESIZE_MOVIE_CONSTRAINT[0] <= filesize_mb <= FILESIZE_MOVIE_CONSTRAINT[1]):
-        return None
+if filetype == "movie" and not (FILESIZE_MOVIE_CONSTRAINT[0] <= filesize_mb <= FILESIZE_MOVIE_CONSTRAINT[1]):
+    return None
```

This change combines the conditions into a single `if` statement, improving readability and reducing code indentation.
`58-60`: Simplify nested `if` statements

Similarly, you can simplify the nested `if` statements for the `episode` file type. Apply this diff to simplify the code:

```diff
-elif filetype == "episode":
-    if not (FILESIZE_EPISODE_CONSTRAINT[0] <= filesize_mb <= FILESIZE_EPISODE_CONSTRAINT[1]):
-        return None
+elif filetype == "episode" and not (FILESIZE_EPISODE_CONSTRAINT[0] <= filesize_mb <= FILESIZE_EPISODE_CONSTRAINT[1]):
+    return None
```

This makes the logic clearer and the code more concise.

🧰 Tools

🪛 Ruff (0.8.0)

58-59: Use a single `if` statement instead of nested `if` statements (SIM102)
`90-90`: Avoid using built-in names as attribute names

The attribute `id` shadows the built-in Python function `id()`. While this doesn't cause an error, it can lead to confusion or unexpected behavior. Consider renaming the attribute to something more descriptive, like `torrent_id` or `identifier`, for clarity. Apply this diff to rename the attribute:

```diff
 class TorrentInfo(BaseModel):
     """Torrent information from a debrid service"""
-    id: Union[int, str]
+    torrent_id: Union[int, str]
     name: str
     status: str = Field(default=None)
     # ... rest of the class ...
```

Remember to update all references to `id` within this class and wherever it's used.
`110-110`: Avoid using built-in names as attribute names

Similarly, in the `DownloadedTorrent` class, the attribute `id` can be renamed to prevent shadowing the built-in `id()` function. Apply this diff to rename the attribute:

```diff
 class DownloadedTorrent(BaseModel):
     """Represents the result of a download operation"""
-    id: Union[int, str]
+    download_id: Union[int, str]
     infohash: str
     container: TorrentContainer
     info: TorrentInfo
```

Update all references to `id` accordingly to maintain consistency.

src/program/services/downloaders/__init__.py (3)
`125-126`: Refactor `show` retrieval logic for clarity and safety

The current logic for obtaining the `show` object from `item` is complex and may lead to errors if the hierarchy is not as expected. Refactor the code by creating a helper method to retrieve the `show` object, enhancing readability and reducing the risk of `AttributeError`. Example:

```python
def get_show_from_item(item: MediaItem) -> Show:
    if item.type == 'show':
        return item
    elif item.type == 'season':
        return item.parent
    elif item.type == 'episode':
        return item.parent.parent
    else:
        raise ValueError(f"Unsupported item type: {item.type}")

# Then use:
show = get_show_from_item(item)
```
`116-132`: Simplify `match_file_to_item` by removing unnecessary variable

The `found` variable in the `match_file_to_item` method is unnecessary since the method returns immediately upon finding a match. Simplify the method by eliminating the `found` variable and directly returning `True` or `False` as appropriate. Apply this diff to improve readability:

```diff
 def match_file_to_item(self, item: MediaItem, file_data: ParsedFileData, file: DebridFile, download_result: DownloadedTorrent) -> bool:
     """Check if the file matches the item and update attributes."""
-    found = False
     if item.type == "movie" and file_data.item_type == "movie":
         self._update_attributes(item, file, download_result)
         return True

     if item.type in ("show", "season", "episode"):
         if not (file_data.season and file_data.episodes):
             return False
         show: Show = get_show_from_item(item)
         season: Season = next((season for season in show.seasons if season.number == file_data.season), None)
         for file_episode in file_data.episodes:
             episode: Episode = next((episode for episode in season.episodes if episode.number == file_episode), None)
             if episode and episode.state not in [States.Completed, States.Symlinked, States.Downloaded]:
                 self._update_attributes(episode, file, download_result)
                 return True

-    return found
+    return False
```
`162-165`: Rename parameter `container` to `file_ids` in `select_files` method

The parameter `container` in the `select_files` method represents a list of file IDs. Using `file_ids` as the parameter name enhances clarity. Apply this diff to improve code readability:

```diff
-def select_files(self, torrent_id: int, container: list[str]) -> None:
+def select_files(self, torrent_id: int, file_ids: list[str]) -> None:
     """Select files from a torrent"""
-    self.service.select_files(torrent_id, container)
+    self.service.select_files(torrent_id, file_ids)
```

Ensure all calls to `select_files` are updated accordingly, such as in the `download_cached_stream` method:

```diff
 if container.file_ids:
-    self.select_files(torrent_id, container.file_ids)
+    self.select_files(torrent_id, container.file_ids)
```

src/program/services/downloaders/torbox.py (6)
`3-3`: Remove unused import `List` from the typing module

The imported `List` from the `typing` module is not used in the code. Removing it will clean up unnecessary imports. Apply this diff to remove the unused import:

```diff
-from typing import List, Optional, Union
+from typing import Optional, Union
```

🧰 Tools

🪛 Ruff (0.8.0)

3-3: `typing.List` imported but unused. Remove unused import: `typing.List` (F401)
`188-188`: Remove unnecessary f-string prefix in the URL

The string `f"torrents/controltorrent"` does not contain any placeholder expressions, so the `f` prefix is unnecessary. Apply this diff to remove the extraneous `f` prefix:

```diff
-self.api.request_handler.execute(HttpMethod.POST, f"torrents/controltorrent", json={"torrent_id": str(torrent_id), "operation": "delete"}, timeout=15)
+self.api.request_handler.execute(HttpMethod.POST, "torrents/controltorrent", json={"torrent_id": str(torrent_id), "operation": "delete"}, timeout=15)
```

🧰 Tools

🪛 Ruff (0.8.0)

188-188: f-string without any placeholders. Remove extraneous `f` prefix (F541)
`185-188`: Ensure consistent data types for `torrent_id`

The `torrent_id` parameter is typed as `int`, but it is converted to a string before use in the API request. For consistency and to avoid unnecessary conversions, consider changing the parameter type to `str`. Update the method signature and remove the unnecessary `str()` conversion:

```diff
-def delete_torrent(self, torrent_id: int) -> None:
+def delete_torrent(self, torrent_id: str) -> None:
     """Delete a torrent"""
     try:
-        self.api.request_handler.execute(HttpMethod.POST, "torrents/controltorrent", json={"torrent_id": str(torrent_id), "operation": "delete"}, timeout=15)
+        self.api.request_handler.execute(HttpMethod.POST, "torrents/controltorrent", json={"torrent_id": torrent_id, "operation": "delete"}, timeout=15)
```

🧰 Tools

🪛 Ruff (0.8.0)

188-188: f-string without any placeholders. Remove extraneous `f` prefix (F541)
`166-169`: Implement the `select_files` method or raise `NotImplementedError`

The `select_files` method is currently empty. If this method is required by the `DownloaderBase` but not applicable for `TorBoxDownloader`, it's better to raise a `NotImplementedError` to indicate that it's intentionally unimplemented. Update the method to explicitly indicate it's not implemented:

```diff
 def select_files(self, *args) -> None:
     """Select files from a torrent"""
-    pass
+    raise NotImplementedError("Method not implemented for TorBoxDownloader")
```
`153-164`: Handle potential exceptions more specifically in `add_torrent`

Catching all exceptions may obscure the actual issue. Consider catching specific exceptions that can occur during the execution of `add_torrent`. Modify the exception handling to catch specific exceptions:

```diff
-except Exception as e:
-    logger.error(f"Failed to add torrent {infohash}: {e}")
-    raise
+except requests.exceptions.RequestException as e:
+    logger.error(f"Network error while adding torrent {infohash}: {e}")
+    raise
+except KeyError as e:
+    logger.error(f"Unexpected response format when adding torrent {infohash}: missing key {e}")
+    raise
```
`170-184`: Improve exception handling in `get_torrent_info`

Similar to other methods, catching all exceptions can make debugging difficult. Catch specific exceptions to improve error clarity. Adjust the exception handling:

```diff
 try:
     data = self.api.request_handler.execute(HttpMethod.GET, f"torrents/mylist?id={torrent_id}", timeout=15)['data']
     return TorrentInfo(
         id=data["id"],
         name=data["name"].split("/")[-1],  # points to dir
         infohash=data["hash"],
         status=data["download_state"],
         bytes=data["size"]
     )
-except Exception as e:
-    logger.error(f"Failed to get torrent info for {torrent_id}: {e}")
-    raise
+except requests.exceptions.RequestException as e:
+    logger.error(f"Network error while retrieving info for torrent {torrent_id}: {e}")
+    raise
+except KeyError as e:
+    logger.error(f"Unexpected response format for torrent {torrent_id}: missing key {e}")
+    raise
```

src/program/services/downloaders/alldebrid.py (1)
`18-18`: Remove unused import `VIDEO_EXTENSIONS`

The `VIDEO_EXTENSIONS` import is not used in the code. Removing unused imports improves code cleanliness and maintainability. Apply this diff to remove the unused import:

```diff
-from program.services.downloaders.models import VIDEO_EXTENSIONS, DebridFile, TorrentContainer, TorrentInfo
+from program.services.downloaders.models import DebridFile, TorrentContainer, TorrentInfo
```

🧰 Tools

🪛 Ruff (0.8.0)

18-18: `program.services.downloaders.models.VIDEO_EXTENSIONS` imported but unused. Remove unused import (F401)
src/program/services/downloaders/realdebrid.py (1)
`205-213`: Replace fixed delay with a dynamic polling mechanism

Using `time.sleep(1)` introduces a fixed delay, which might be insufficient or excessive depending on the context. Consider implementing a polling mechanism to check the torrent's status until it reaches the desired state. This approach is more robust and adapts to varying processing times. Here's an example of how you might refactor the method:

```diff
 def select_files(self, torrent_id: str, ids: List[int] = None) -> None:
     """Select files from a torrent"""
     try:
         selection = ",".join(str(file_id) for file_id in ids) if ids else "all"
         self.api.request_handler.execute(
             HttpMethod.POST,
             f"torrents/selectFiles/{torrent_id}",
             data={"files": selection}
         )
-        time.sleep(1)
+        # Poll the torrent status until it's no longer 'waiting_files_selection'
+        while True:
+            torrent_info = self.get_torrent_info(torrent_id)
+            if torrent_info.status != "waiting_files_selection":
+                break
+            time.sleep(0.5)
     except Exception as e:
         logger.error(f"Failed to select files for torrent {torrent_id}: {e}")
         raise
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
- `poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (17)
- `src/program/managers/event_manager.py` (2 hunks)
- `src/program/media/__init__.py` (1 hunks)
- `src/program/media/item.py` (4 hunks)
- `src/program/program.py` (1 hunks)
- `src/program/services/downloaders/__init__.py` (2 hunks)
- `src/program/services/downloaders/alldebrid.py` (8 hunks)
- `src/program/services/downloaders/models.py` (1 hunks)
- `src/program/services/downloaders/realdebrid.py` (4 hunks)
- `src/program/services/downloaders/shared.py` (1 hunks)
- `src/program/services/downloaders/torbox.py` (1 hunks)
- `src/program/services/scrapers/mediafusion.py` (1 hunks)
- `src/program/services/scrapers/shared.py` (2 hunks)
- `src/program/settings/manager.py` (2 hunks)
- `src/program/settings/models.py` (2 hunks)
- `src/program/types.py` (2 hunks)
- `src/program/utils/request.py` (2 hunks)
- `src/routers/secure/scrape.py` (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- `src/program/media/__init__.py`
🧰 Additional context used
🪛 Ruff (0.8.0)
src/program/managers/event_manager.py

- 2-2: `sys` imported but unused. Remove unused import: `sys` (F401)
- 4-4: `time` imported but unused. Remove unused import: `time` (F401)

src/program/services/downloaders/alldebrid.py

- 18-18: `program.services.downloaders.models.VIDEO_EXTENSIONS` imported but unused. Remove unused import: `program.services.downloaders.models.VIDEO_EXTENSIONS` (F401)
- 141-141: `return` inside `finally` blocks cause exceptions to be silenced (B012)

src/program/services/downloaders/models.py

- 3-3: `loguru.logger` imported but unused. Remove unused import: `loguru.logger` (F401)
- 51-51: Test for membership should be `not in`. Convert to `not in` (E713)
- 58-59: Use a single `if` statement instead of nested `if` statements (SIM102)

src/program/services/downloaders/shared.py

- 3-3: `typing.List` imported but unused. Remove unused import: `typing.List` (F401)

src/program/services/downloaders/torbox.py

- 3-3: `typing.List` imported but unused. Remove unused import: `typing.List` (F401)
- 188-188: f-string without any placeholders. Remove extraneous `f` prefix (F541)
🔇 Additional comments (11)
src/program/types.py (2)
`13-17`: Clean up import statements and verify TorBox integration readiness.

- Remove the duplicate commented line `# TorBoxDownloader,` as it appears to be unintentionally left in the code.
- Given the reported TorBox API issues in the PR comments (404 errors), should we enable `TorBoxDownloader` at this stage?

```diff
 from program.services.downloaders import (
     # AllDebridDownloader,
     RealDebridDownloader,
     TorBoxDownloader
 )
-# TorBoxDownloader,
```
`38-42`: Verify type hint changes across the codebase.

The type hint modification looks clean, but let's ensure all code using the `Downloader` type is updated to handle `TorBoxDownloader` appropriately. Run this script to find potential usage that needs updating:
✅ Verification successful
The previous results show usage of the Downloader type, but we need to verify if TorBoxDownloader is properly implemented and compatible with the existing downloader interface. This will help us determine if the type hint change is safe and properly supported.
Type hint changes are properly implemented and supported.

The verification shows that TorBoxDownloader is fully implemented and follows the correct architecture:
- It inherits from DownloaderBase, ensuring compatibility with the Downloader interface
- It's properly registered in the services dictionary alongside other downloaders
- The implementation in `src/program/services/downloaders/torbox.py` is complete with error handling and API integration

The type hint modification in `types.py` is safe and consistent with the codebase implementation.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```bash
#!/bin/bash
# Description: Find all usages of Downloader type and potential incompatibilities

# Find direct references to Downloader type
echo "=== Direct Downloader type usage ==="
rg "Downloader[\[|\]|\s|,|)]" --type py

# Find potential service instantiations
echo -e "\n=== Downloader service instantiations ==="
ast-grep --pattern 'class $CLASS($BASE) {
  $$$
}' | rg -A 2 "Downloader"
```

Length of output: 2477
Script:

```bash
#!/bin/bash
# Let's verify the TorBoxDownloader implementation and its compatibility

# Check TorBoxDownloader implementation
echo "=== TorBoxDownloader Implementation ==="
rg -l "class TorBoxDownloader" --type py

# Check if TorBoxDownloader inherits from the base Downloader class
echo -e "\n=== TorBoxDownloader Class Definition ==="
ast-grep --pattern 'class TorBoxDownloader($_) {
  $$$
}'

# Check for any TorBox specific handling
echo -e "\n=== TorBox Related Code ==="
rg "TorBox" --type py -C 2
```

Length of output: 10565
src/program/settings/manager.py (1)
`89-97`: Verify settings validation for downloader configurations
Given the PR's focus on downloader refactoring and API issues, ensure that the settings validation properly handles downloader-specific configurations.
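A quick, self-contained way to exercise that path (sketch only; `FakeDownloaderSettings` is an illustrative stand-in, and `format_validation_error` is assumed importable from `program.settings.manager`):

```python
from pydantic import BaseModel, Field, ValidationError

from program.settings.manager import format_validation_error  # assumed import


class FakeDownloaderSettings(BaseModel):  # illustrative stand-in for a downloader config
    api_key: str
    bucket_limit: int = Field(default=5, ge=0, le=20)


try:
    FakeDownloaderSettings(api_key="secret", bucket_limit=99)  # violates le=20
except ValidationError as e:
    print(format_validation_error(e))  # expect a "• bucket_limit: ..." line
```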
src/program/services/scrapers/mediafusion.py (1)
`149-149`: Verify the impact of simplified title extraction logic.
The change from splitting by both "/" and "\n" to only "\n" might affect how torrent titles are parsed. This could impact downstream components that rely on specific title formats.
Let's analyze the potential impact:
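To make the impact concrete, here is an illustrative description string (made up) run through both variants; the "old" variant below is one plausible reading of splitting by both "/" and "\n":

```python
description = "📂 Some.Show.S01E01.1080p/Some.Show.S01E01.mkv\nSize: 1.2 GB"

# Old behavior (assumed): split on "/" as well, keeping only the directory part
old_title = description.replace("📂 ", "").replace("/", "\n").split("\n")[0]
# -> "Some.Show.S01E01.1080p"

# New behavior: split only on "\n", keeping the full "dir/file" segment
new_title = description.replace("📂 ", "").split("\n")[0]
# -> "Some.Show.S01E01.1080p/Some.Show.S01E01.mkv"
```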
src/program/settings/models.py (2)
`286-286`: Add documentation for `bucket_limit` configuration.
While the Field constraints are well-defined, please add documentation explaining:
- The purpose of `bucket_limit`
- The impact of different values
- The rationale behind the default value of 5 and the limits (0-20)
This will help users understand how to configure this setting effectively.
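One low-effort option is to attach the explanation to the field itself so it surfaces in generated schemas (a sketch; the description wording is suggested, not from the PR):

```python
from pydantic import Field

bucket_limit: int = Field(
    default=5,
    ge=0,
    le=20,
    description="Maximum torrents kept per quality bucket (e.g., 4K, 1080p); default 5",
)
```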
`60-61`: Consider adding URL validation for proxy settings.

While the proxy settings follow the established pattern from other models, consider adding URL validation for `proxy_url` to ensure valid proxy configurations. This is particularly important given the Torbox API access issues mentioned in the PR.
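For example, pydantic's URL types would reject malformed values at load time (a sketch with a stand-in model; the PR's actual model may differ):

```python
from typing import Optional

from pydantic import AnyUrl, BaseModel


class ProxiedServiceSettings(BaseModel):  # illustrative stand-in
    proxy_enabled: bool = False
    proxy_url: Optional[AnyUrl] = None  # malformed URLs now fail validation


ProxiedServiceSettings(proxy_url="http://127.0.0.1:8080")   # ok
# ProxiedServiceSettings(proxy_url="not-a-url")             # raises ValidationError
```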
src/program/utils/request.py (1)
`151-152`: LGTM: Timeout implementation is flexible
The implementation correctly allows per-request timeout override while maintaining a default value.
src/program/program.py (1)
`257-261`: Critical: Symlink repair functionality has been disabled without alternative

The code comments out the scheduling of the `fix_broken_symlinks` function, which is a critical maintenance feature. While the comment indicates this is temporary, disabling this functionality without an alternative could lead to accumulation of broken symlinks.
Please address the following concerns:
- Document the reason for disabling this functionality in the code comments
- Provide an estimated timeline for re-enabling this feature
- Consider adding a temporary alternative or manual repair procedure (a sketch follows below)
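As a stopgap while the scheduled repair is disabled, a manual scan can at least surface broken links (a sketch; the library root is whatever path the deployment uses):

```python
from pathlib import Path


def find_broken_symlinks(root: str) -> list[Path]:
    """Return symlinks under `root` whose targets no longer exist."""
    return [
        path for path in Path(root).rglob("*")
        if path.is_symlink() and not path.exists()
    ]


# Example: print(find_broken_symlinks("/mnt/library"))
```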
Let's verify the impact of this change:
src/program/services/downloaders/shared.py (1)
`50-56`: Clarify the `request` parameter in the `select_files` method

The method signature defines `request` as `list[int]`, implying it is a list of file IDs. However, the docstring mentions "File selection details including torrent ID and file IDs," suggesting that `request` includes more than just file IDs.
Please verify whether `request` should be:
- A simple `list[int]` of file IDs, in which case the `torrent_id` should be passed separately.
- A more complex data structure that includes both the `torrent_id` and file IDs.

Adjust the method signature and docstring for consistency.
Option 1: If `torrent_id` is a separate parameter:

```diff
-def select_files(self, request: list[int]) -> None:
+def select_files(self, torrent_id: str, file_ids: list[int]) -> None:
```
Option 2: If `request` is a data class or dictionary: update the type hint and provide details in the docstring.
src/program/services/downloaders/torbox.py (1)
`1-191`: Overall code improvement and adherence to best practices

The refactored code enhances modularity and readability by separating API interactions and request handling into dedicated classes. The use of data classes like `TorrentContainer`, `DebridFile`, and `TorrentInfo` promotes better data management.
🧰 Tools

🪛 Ruff (0.8.0)

3-3: `typing.List` imported but unused. Remove unused import: `typing.List` (F401)

188-188: f-string without any placeholders. Remove extraneous `f` prefix (F541)
src/program/services/downloaders/alldebrid.py (1)
`155-155`: Verify parameter name in `add_torrent` method

In the `add_torrent` method, the parameter is set as `params={"magnets[]": infohash}`. Ensure that the API expects the parameter as `magnets[]` for a single infohash. Run the following script to confirm the correct parameter name:
✅ Verification successful
Let me search the codebase for more context about the API usage.
Let me search for the complete implementation of the add_torrent method and any related API calls.
Let me check the test file to understand the API usage better.
Let me check the test data files to see the actual API responses.
Parameter name `magnets[]` is correct for AllDebrid API

Based on the test files and API responses, the parameter name `magnets[]` is correctly used. This is evidenced by:
- Successful test cases in `test_alldebrid_downloader.py` using this parameter format
- Valid API responses in the test data files showing proper handling of magnet uploads
- The API consistently returns a `magnets` array in the response, matching the parameter naming convention
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```bash
#!/bin/bash
# Description: Verify the parameter name for adding a magnet in AllDebrid API.
# Expected: The API accepts 'magnets[]' as the parameter name.

curl -s 'https://api.alldebrid.com/v4/magnet/upload' \
  -G \
  --data-urlencode "magnets[]=$infohash" \
  -H "Authorization: Bearer $api_key" | jq '.'

# Replace $infohash and $api_key with appropriate values.
```

Length of output: 271
Script:

```bash
#!/bin/bash
# Search for AllDebrid API documentation or usage in the codebase
rg -A 5 "api.alldebrid.com" --type py

# Also search for the add_torrent method implementation
ast-grep --pattern 'def add_torrent($$$)'
```

Length of output: 607
Script:

```bash
#!/bin/bash
# Search for the complete implementation of add_torrent and surrounding context
rg -B 5 -A 15 'params={"magnets\[\]": infohash}' --type py

# Also search for any other usage of magnets parameter
rg "magnets\[\]" --type py
```

Length of output: 519
Script:

```bash
#!/bin/bash
# Get the content of the test file that shows API usage
cat src/tests/test_alldebrid_downloader.py
```

Length of output: 5497
Script:

```bash
#!/bin/bash
# Check the test data files for API response format
cat src/tests/test_data/alldebrid_magnet_upload_not_ready.json
cat src/tests/test_data/alldebrid_magnet_upload_ready.json
```

Length of output: 1003
```python
torrents = sort_torrents(torrents, bucket_limit=bucket_limit)
torrents_dict = {}
for torrent in torrents.values():
    torrents_dict[torrent.infohash] = Stream(torrent)
logger.log("SCRAPER", f"Kept {len(torrents_dict)} streams for {item.log_string} after processing bucket limit")
```
🛠️ Refactor suggestion
Document the purpose and impact of bucket_limit.
The bucket_limit parameter is used to filter torrents, but its purpose and impact aren't documented. Additionally, consider validating the bucket_limit value before use.
```diff
-    torrents = sort_torrents(torrents, bucket_limit=bucket_limit)
+    # Validate bucket limit
+    if not isinstance(bucket_limit, int) or bucket_limit < 0:
+        logger.warning(f"Invalid bucket_limit value: {bucket_limit}, using default of 5")
+        effective_bucket_limit = 5
+    else:
+        effective_bucket_limit = bucket_limit
+
+    # Group torrents into quality buckets (e.g., 4K, 1080p, etc.)
+    # and keep only the top N torrents per quality bucket
+    torrents = sort_torrents(torrents, bucket_limit=effective_bucket_limit)
```
Also, consider adding a docstring to explain the bucket limit concept:
```python
def sort_torrents(torrents: Set[Torrent], bucket_limit: int) -> Dict[str, Torrent]:
    """Sort and filter torrents based on quality buckets.

    Args:
        torrents: Set of torrents to sort
        bucket_limit: Maximum number of torrents to keep per quality bucket (e.g., 4K, 1080p)
            This helps prevent overwhelming downstream systems while maintaining
            a diverse selection of quality options.

    Returns:
        Dict mapping infohash to filtered/sorted Torrent objects
    """
```
```python
def get_info_hash(magnet: str) -> str:
    pattern = r"[A-Fa-f0-9]{40}"
    match = re.search(pattern, magnet)
    return match.group(0) if match else None
```
🛠️ Refactor suggestion
Enhance robustness of info hash extraction
The current implementation has potential issues with error handling and validation. Consider these improvements:
```diff
 def get_info_hash(magnet: str) -> str:
-    pattern = r"[A-Fa-f0-9]{40}"
-    match = re.search(pattern, magnet)
-    return match.group(0) if match else None
+    if not magnet.startswith('magnet:?'):
+        return None
+
+    pattern = r"xt=urn:btih:([A-Fa-f0-9]{40})"
+    match = re.search(pattern, magnet)
+    return match.group(1) if match else None
```
The improved version:
- Validates magnet URI format
- Uses more specific regex pattern to match only BitTorrent info hashes
- Properly extracts the hash group
```diff
@@ -126,6 +126,7 @@ def __init__(self, session: Session | LimiterSession, response_type: ResponseTyp
         self.BASE_REQUEST_PARAMS = base_params or BaseRequestParameters()
         self.custom_exception = custom_exception or Exception
         self.request_logging = request_logging
+        self.timeout = 15
```
🛠️ Refactor suggestion
Make timeout configurable via constructor
The hardcoded timeout value of 15 seconds might not be suitable for all use cases. Different endpoints or services may require different timeout values.
Consider making it configurable via the constructor:
```diff
 def __init__(self, session: Session | LimiterSession, response_type: ResponseType = ResponseType.SIMPLE_NAMESPACE, base_url: Optional[str] = None, base_params: Optional[BaseRequestParameters] = None,
-             custom_exception: Optional[Type[Exception]] = None, request_logging: bool = False):
+             custom_exception: Optional[Type[Exception]] = None, request_logging: bool = False, timeout: int = 15):
     self.session = session
     self.response_type = response_type
     self.BASE_URL = base_url
     self.BASE_REQUEST_PARAMS = base_params or BaseRequestParameters()
     self.custom_exception = custom_exception or Exception
     self.request_logging = request_logging
-    self.timeout = 15
+    self.timeout = timeout
```
self.type = "show" | ||
self.locations = item.get("locations", []) | ||
self.seasons: list[Season] = item.get("seasons", []) | ||
self.propagate_attributes_to_childs() |
Fix initialization order in Show constructor.
The propagation of attributes to child elements occurs before the parent class initialization, which could lead to undefined behavior if the parent class initialization modifies any attributes used in the propagation.
```diff
 def __init__(self, item):
-    self.type = "show"
-    self.locations = item.get("locations", [])
-    self.seasons: list[Season] = item.get("seasons", [])
-    self.propagate_attributes_to_childs()
-    super().__init__(item)
+    super().__init__(item)
+    self.type = MEDIA_TYPE_SHOW
+    validate_media_type(self.type)
+    self.locations = item.get("locations", [])
+    self.seasons: list[Season] = item.get("seasons", [])
+    self.propagate_attributes_to_childs()
```
Committable suggestion skipped: line range outside the PR's diff.
```python
def parse_filename(filename: str) -> ParsedFileData:
    """Parse a filename into a ParsedFileData object"""
    parsed_data: ParsedData = parse(filename)
    season: int | None = parsed_data.seasons[0] if parsed_data.seasons else None
    return ParsedFileData(item_type=parsed_data.type, season=season, episodes=parsed_data.episodes)
```
🛠️ Refactor suggestion
Add error handling in `parse_filename` function

The `parse_filename` function uses the `parse` function from `RTN`, which may raise exceptions or return incomplete data if the filename cannot be parsed.
Example:
def parse_filename(filename: str) -> ParsedFileData:
"""Parse a filename into a ParsedFileData object"""
try:
parsed_data: ParsedData = parse(filename)
except ParseError as e:
# Handle the exception or re-raise with additional context
raise ValueError(f"Failed to parse filename '{filename}': {e}")
season: Optional[int] = parsed_data.seasons[0] if parsed_data.seasons else None
return ParsedFileData(item_type=parsed_data.type, season=season, episodes=parsed_data.episodes)
Ensure you import any necessary exceptions from the `RTN` module.
```diff
 def get_instant_availability(self, infohash: str, item_type: str) -> Optional[TorrentContainer]:
     """
-    Get instant availability for multiple infohashes with retry logic
-    Required by DownloaderBase
+    Get instant availability for multiple infohashes.
+    Creates a makeshift availability check since Real-Debrid no longer supports instant availability.
```
🛠️ Refactor suggestion
Update the docstring to match the new method signature

The `get_instant_availability` method now accepts a single `infohash` and `item_type`, but the docstring still refers to 'multiple infohashes'. Please update the docstring to accurately describe the method's functionality.
Apply the following diff to correct the docstring:
```diff
 def get_instant_availability(self, infohash: str, item_type: str) -> Optional[TorrentContainer]:
     """
-    Get instant availability for multiple infohashes.
+    Get instant availability for a single infohash.
     Creates a makeshift availability check since Real-Debrid no longer supports instant availability.
     """
```
```python
data = self.api.request_handler.execute(HttpMethod.GET, f"torrents/info/{torrent_id}")
files = {file["id"]: {"filename": file["path"].split("/")[-1], "bytes": file["bytes"]} for file in data["files"]}
return TorrentInfo(
    id=data["id"],
    name=data["filename"],
    status=data["status"],
    infohash=data["hash"],
    bytes=data["bytes"],
    created_at=data["added"],
    alternative_filename=data.get("original_filename", None),
    progress=data.get("progress", None),
    files=files,
)
```
Handle missing keys in the API response to prevent KeyError

When processing the response data in `get_torrent_info`, you're accessing keys like `"files"`, `"id"`, `"path"`, and `"bytes"` directly. If any of these keys are missing in the API response, it could result in a `KeyError`. Consider adding error handling or default values to manage missing or unexpected data gracefully.
Modify the code to safely access dictionary keys:
```diff
 data = self.api.request_handler.execute(HttpMethod.GET, f"torrents/info/{torrent_id}")
-files = {file["id"]: {"filename": file["path"].split("/")[-1], "bytes": file["bytes"]} for file in data["files"]}
+files = {
+    file.get("id"): {
+        "filename": file.get("path", "").split("/")[-1],
+        "bytes": file.get("bytes", 0)
+    }
+    for file in data.get("files", [])
+    if file.get("id") is not None
+}
 return TorrentInfo(
-    id=data["id"],
+    id=data.get("id", ""),
     name=data.get("filename", ""),
     status=data.get("status", ""),
     infohash=data.get("hash", ""),
     bytes=data.get("bytes", 0),
     created_at=data.get("added", ""),
     alternative_filename=data.get("original_filename", None),
     progress=data.get("progress", None),
     files=files,
 )
```
```diff
@@ -1,5 +1,7 @@
 import os
+import sys
```
🛠️ Refactor suggestion
Remove unused import `sys`

The `sys` module imported on line 2 is not used in the current codebase. Since it's only referenced in the commented-out code, please remove this unused import to clean up the code.
Apply this diff to remove the unused import:
```diff
-import sys
```
🧰 Tools

🪛 Ruff (0.8.0)

2-2: `sys` imported but unused. Remove unused import: `sys` (F401)
```diff
 import threading
+import time
```
🛠️ Refactor suggestion
Remove unused import `time`

The `time` module imported on line 4 is not used in the current codebase. It was only used in the commented-out code. Please remove this unused import.
Apply this diff to remove the unused import:
```diff
-import time
```
🧰 Tools

🪛 Ruff (0.8.0)

4-4: `time` imported but unused. Remove unused import: `time` (F401)
```python
# For debugging purposes we can monitor the execution time of the service. (comment out above and uncomment below)
# def submit_job(self, service, program, event=None):
#     """
#     Submits a job to be executed by the service.
#
#     Args:
#         service (type): The service class to execute.
#         program (Program): The program containing the service.
#         item (Event, optional): The event item to process. Defaults to None.
#     """
#     log_message = f"Submitting service {service.__name__} to be executed"
#     if event:
#         log_message += f" with {event.log_message}"
#     logger.debug(log_message)
#
#     cancellation_event = threading.Event()
#     executor = self._find_or_create_executor(service)
#
#     # Add start time to track execution duration
#     start_time = datetime.now()
#
#     def _monitor_execution(future):
#         """Monitor execution time and log if taking too long"""
#         while not future.done():
#             execution_time = (datetime.now() - start_time).total_seconds()
#             if execution_time > 180:  # 3 minutes
#                 current_thread = None
#                 for thread in threading.enumerate():
#                     if thread.name.startswith(service.__name__) and not thread.name.endswith('_monitor'):
#                         current_thread = thread
#                         break
#
#                 if current_thread:
#                     # Get stack frames for the worker thread
#                     frames = sys._current_frames()
#                     thread_frame = None
#                     for thread_id, frame in frames.items():
#                         if thread_id == current_thread.ident:
#                             thread_frame = frame
#                             break
#
#                     if thread_frame:
#                         stack_trace = ''.join(traceback.format_stack(thread_frame))
#                     else:
#                         stack_trace = "Could not get stack trace for worker thread"
#                 else:
#                     stack_trace = "Could not find worker thread"
#
#                 logger.warning(
#                     f"Service {service.__name__} execution taking longer than 3 minutes!\n"
#                     f"Event: {event.log_message if event else 'No event'}\n"
#                     f"Execution time: {execution_time:.1f} seconds\n"
#                     f"Thread name: {current_thread.name if current_thread else 'Unknown'}\n"
#                     f"Thread alive: {current_thread.is_alive() if current_thread else 'Unknown'}\n"
#                     f"Stack trace:\n{stack_trace}"
#                 )
#
#                 # Cancel the future and kill the thread
#                 future.cancellation_event.set()
#                 future.cancel()
#                 if current_thread:
#                     logger.warning(f"Killing thread {current_thread.name} due to timeout")
#                 self._futures.remove(future)
#                 if event:
#                     self.remove_event_from_running(event)
#                 return  # Exit the monitoring thread
#
#             time.sleep(60)  # Check every minute
#
#     future = executor.submit(db_functions.run_thread_with_db_item,
#                              program.all_services[service].run,
#                              service, program, event, cancellation_event)
#
#     # Start monitoring thread
#     monitor_thread = threading.Thread(
#         target=_monitor_execution,
#         args=(future,),
#         name=f"{service.__name__}_monitor",
#         daemon=True
#     )
#     monitor_thread.start()
#
#     future.cancellation_event = cancellation_event
#     if event:
#         future.event = event
#     self._futures.append(future)
#     sse_manager.publish_event("event_update", self.get_event_updates())
#     future.add_done_callback(lambda f: self._process_future(f, service))
```
🛠️ Refactor suggestion
Remove large block of commented-out code
Lines 190-278 contain a large block of code that is entirely commented out. Keeping such large blocks of commented-out code can clutter the codebase and reduce readability. If this code is no longer needed, please remove it. If it's intended for future use, consider using version control branches or feature toggles.
Apply this diff to remove the commented-out code:
- # For debugging purposes we can monitor the execution time of the service. (comment out above and uncomment below)
- # def submit_job(self, service, program, event=None):
- # """
- # Submits a job to be executed by the service.
- # ...
- # future.add_done_callback(lambda f: self._process_future(f, service))
Actionable comments posted: 8
🧹 Outside diff range and nitpick comments (25)
src/program/services/downloaders/shared.py (2)
3-3: Remove unused import `List`
The `List` import from typing is not used in this file.
-from typing import List, Optional
+from typing import Optional
🧰 Tools
🪛 Ruff (0.8.0)
3-3: `typing.List` imported but unused. Remove unused import: `typing.List` (F401)
88-92: Add input validation to `parse_filename`
The function should validate that the input filename is not empty or None before processing.
 def parse_filename(filename: str) -> ParsedFileData:
     """Parse a filename into a ParsedFileData object"""
+    if not filename:
+        raise ValueError("Filename cannot be empty")
     parsed_data: ParsedData = parse(filename)
     season: int | None = parsed_data.seasons[0] if parsed_data.seasons else None
     return ParsedFileData(item_type=parsed_data.type, season=season, episodes=parsed_data.episodes)
src/program/state_transition.py (6)
Line range hint 1-24: Add type hints and implement TODO comment.
The function signature could be improved with explicit return type hints. Additionally, the TODO comment about reindexing non-released badly indexed items needs implementation.
-def process_event(emitted_by: Service, existing_item: MediaItem | None = None, content_item: MediaItem | None = None) -> ProcessedEvent:
+def process_event(
+    emitted_by: Service,
+    existing_item: MediaItem | None = None,
+    content_item: MediaItem | None = None
+) -> tuple[Service | None, list[MediaItem]]:
Would you like me to help implement the TODO for reindexing non-released badly indexed items?
Line range hint 26-36: Add safeguards for recursive processing and consistent state handling.
The recursive processing of seasons and episodes could be improved:
- Consider adding a depth limit to prevent stack overflow.
- Add consistent handling of Unreleased state for both seasons and episodes.
- Add validation before accessing seasons/episodes.
 elif existing_item is not None and existing_item.last_state in [States.PartiallyCompleted, States.Ongoing]:
     if existing_item.type == "show":
+        if not hasattr(existing_item, 'seasons'):
+            logger.warning(f"Show {existing_item.log_string} has no seasons")
+            return no_further_processing
         for season in existing_item.seasons:
             if season.last_state not in [States.Completed, States.Unreleased]:
                 _, sub_items = process_event(emitted_by, season, None)
                 items_to_submit += sub_items
     elif existing_item.type == "season":
+        if not hasattr(existing_item, 'episodes'):
+            logger.warning(f"Season {existing_item.log_string} has no episodes")
+            return no_further_processing
         for episode in existing_item.episodes:
-            if episode.last_state != States.Completed:
+            if episode.last_state not in [States.Completed, States.Unreleased]:
                 _, sub_items = process_event(emitted_by, episode, None)
                 items_to_submit += sub_items
Line range hint 38-45: Improve type handling and readability for Indexed state.
The current implementation could be enhanced with:
- Explicit handling of episode type.
- Error handling for should_submit() calls.
- More readable list comprehensions.
 elif existing_item is not None and existing_item.last_state == States.Indexed:
     next_service = Scraping
-    if emitted_by != Scraping and Scraping.should_submit(existing_item):
-        items_to_submit = [existing_item]
-    elif existing_item.type == "show":
-        items_to_submit = [s for s in existing_item.seasons if s.last_state != States.Completed and Scraping.should_submit(s)]
-    elif existing_item.type == "season":
-        items_to_submit = [e for e in existing_item.episodes if e.last_state != States.Completed and Scraping.should_submit(e)]
+    try:
+        if emitted_by != Scraping and Scraping.should_submit(existing_item):
+            items_to_submit = [existing_item]
+        elif existing_item.type == "show":
+            items_to_submit = [
+                season for season in existing_item.seasons
+                if season.last_state != States.Completed and Scraping.should_submit(season)
+            ]
+        elif existing_item.type == "season":
+            items_to_submit = [
+                episode for episode in existing_item.episodes
+                if episode.last_state != States.Completed and Scraping.should_submit(episode)
+            ]
+        elif existing_item.type == "episode":
+            items_to_submit = [existing_item] if Scraping.should_submit(existing_item) else []
+    except Exception as e:
+        logger.error(f"Error processing {existing_item.log_string}: {str(e)}")
+        return no_further_processing
Line range hint 47-54: Add error handling for downloader transitions.
Given the reported issues with torrent information retrieval (as mentioned in PR comments), the transition to Downloader service needs more robust error handling.
 elif existing_item is not None and existing_item.last_state == States.Scraped:
     next_service = Downloader
+    if not hasattr(existing_item, 'torrent_info') or not existing_item.torrent_info:
+        logger.error(f"Missing torrent information for {existing_item.log_string}")
+        return no_further_processing
     items_to_submit = [existing_item]
Line range hint 56-75: Refactor Completed state handling for better readability.
The Completed state handling has several issues:
- Complex nested conditions reduce readability.
- Duplicate logic for type checking.
- Inconsistent indentation in the else block.
 elif existing_item is not None and existing_item.last_state == States.Completed:
-    # If a user manually retries an item, lets not notify them again
-    if emitted_by not in ["RetryItem", PostProcessing]:
-        notify(existing_item)
-    # Avoid multiple post-processing runs
-    if emitted_by != PostProcessing:
-        if settings_manager.settings.post_processing.subliminal.enabled:
-            next_service = PostProcessing
-            if existing_item.type in ["movie", "episode"] and Subliminal.should_submit(existing_item):
-                items_to_submit = [existing_item]
-            elif existing_item.type == "show":
-                items_to_submit = [e for s in existing_item.seasons for e in s.episodes if e.last_state == States.Completed and Subliminal.should_submit(e)]
-            elif existing_item.type == "season":
-                items_to_submit = [e for e in existing_item.episodes if e.last_state == States.Completed and Subliminal.should_submit(e)]
-            if not items_to_submit:
-                return no_further_processing
-    else:
-        return no_further_processing
+    def get_submittable_episodes(item: MediaItem) -> list[MediaItem]:
+        if item.type == "show":
+            return [
+                episode
+                for season in item.seasons
+                for episode in season.episodes
+                if episode.last_state == States.Completed and Subliminal.should_submit(episode)
+            ]
+        elif item.type == "season":
+            return [
+                episode
+                for episode in item.episodes
+                if episode.last_state == States.Completed and Subliminal.should_submit(episode)
+            ]
+        return []
+
+    # Skip notification for manual retries
+    if emitted_by not in ["RetryItem", PostProcessing]:
+        notify(existing_item)
+
+    # Skip if already in post-processing
+    if emitted_by == PostProcessing:
+        return no_further_processing
+
+    # Handle post-processing
+    if not settings_manager.settings.post_processing.subliminal.enabled:
+        return no_further_processing
+
+    next_service = PostProcessing
+    if existing_item.type in ["movie", "episode"]:
+        items_to_submit = [existing_item] if Subliminal.should_submit(existing_item) else []
+    else:
+        items_to_submit = get_submittable_episodes(existing_item)
+
+    return no_further_processing if not items_to_submit else (next_service, items_to_submit)
Line range hint 77-79: Enable and enhance debug logging.
Given the reported issues with torrent information retrieval, having proper logging enabled would be beneficial for debugging. Consider enhancing it with additional context.
-# if items_to_submit and next_service:
-#     for item in items_to_submit:
-#         logger.debug(f"Submitting {item.log_string} ({item.id}) to {next_service if isinstance(next_service, str) else next_service.__name__}")
+if items_to_submit and next_service:
+    service_name = next_service if isinstance(next_service, str) else next_service.__name__
+    logger.debug(
+        f"State transition: {existing_item.last_state if existing_item else 'None'} -> {service_name}"
+        f" | Items: {[f'{item.log_string} ({item.id})' for item in items_to_submit]}"
+    )
src/program/services/downloaders/models.py (4)
4-4: Remove unused import
The `loguru.logger` import is not used in this file.
-from loguru import logger
🧰 Tools
🪛 Ruff (0.8.0)
4-4: `loguru.logger` imported but unused. Remove unused import: `loguru.logger` (F401)
43-64: Consider adding a filesize_mb property for consistency
For consistency with `TorrentInfo`, consider adding a `filesize_mb` property to convert bytes to megabytes.
+    @property
+    def filesize_mb(self) -> float:
+        """Convert bytes to megabytes"""
+        return self.filesize / 1_000_000 if self.filesize else 0
🧰 Tools
🪛 Ruff (0.8.0)
52-52: Test for membership should be `not in`. Convert to `not in` (E713)
59-60: Use a single `if` statement instead of nested `if` statements (SIM102)
73-87: Consider adding infohash validation
The model could benefit from infohash validation to ensure it matches the expected format.
 class TorrentContainer(BaseModel):
     """Represents a collection of files from an infohash from a debrid service"""
-    infohash: str
+    infohash: str = Field(..., min_length=40, max_length=40, pattern="^[a-fA-F0-9]+$")
103-106: Handle None bytes in size_mb property
The size_mb property should handle None bytes gracefully.
 @property
 def size_mb(self) -> float:
     """Convert bytes to megabytes"""
-    return self.bytes / 1_000_000
+    return self.bytes / 1_000_000 if self.bytes else 0
src/program/services/downloaders/__init__.py (4)
68-71: Enhance error handling with specific exception types
The generic `Exception` catch block could mask specific errors. Consider handling specific exceptions separately for better error reporting and recovery.
Consider this structure:
-            except Exception as e:
-                logger.debug(f"Stream {stream.infohash} failed: {e}")
-                if 'download_result' in locals() and download_result.id:
-                    self.service.delete_torrent(download_result.id)
+            except NotCachedException as e:
+                logger.debug(f"Stream {stream.infohash} not cached: {e}")
+            except NoMatchingFilesException as e:
+                logger.debug(f"No matching files for stream {stream.infohash}: {e}")
+            except Exception as e:
+                logger.error(f"Unexpected error for stream {stream.infohash}: {e}")
+            finally:
+                if 'download_result' in locals() and download_result.id:
+                    self.service.delete_torrent(download_result.id)
79-104: Enhance stream validation logging
The validation logic is solid, but adding more detailed logging would help with debugging and monitoring.
Add more detailed logging:
Add more detailed logging:
def validate_stream(self, stream: Stream, item: MediaItem) -> Optional[TorrentContainer]: """ Validate a single stream by ensuring its files match the item's requirements. """ + logger.debug(f"Validating stream {stream.infohash} for {item.log_string}") container = self.get_instant_availability(stream.infohash, item.type) if not container: + logger.debug(f"Stream {stream.infohash} is not cached") item.blacklist_stream(stream) return None valid_files = [] for file in container.files or []: + logger.debug(f"Validating file {file.filename}") debrid_file = DebridFile.create( filename=file.filename, filesize_bytes=file.filesize, filetype=item.type, file_id=file.file_id ) if debrid_file: + logger.debug(f"File {file.filename} is valid") valid_files.append(debrid_file) if valid_files: + logger.debug(f"Found {len(valid_files)} valid files for stream {stream.infohash}") container.files = valid_files return container + logger.debug(f"No valid files found for stream {stream.infohash}") item.blacklist_stream(stream) return None
106-119
: Add type hints for container attributeThe container attribute's type is not explicitly defined in the method signature.
Add type hints for better code clarity:
- def update_item_attributes(self, item: MediaItem, download_result: DownloadedTorrent) -> bool: + def update_item_attributes(self, item: MediaItem, download_result: DownloadedTorrent) -> bool: """Update the item attributes with the downloaded files and active stream.""" - if not download_result.container: + if not download_result.container: # type: Optional[TorrentContainer] raise NotCachedException(f"No container found for {item.log_string} ({item.id})")
120-138
: Simplify file matching logic with early returnsThe nested conditionals can be simplified for better readability.
Consider this simplified structure:
def match_file_to_item(self, item: MediaItem, file_data: ParsedFileData, file: DebridFile, download_result: DownloadedTorrent) -> bool: """Check if the file matches the item and update attributes.""" - found = False if item.type == "movie" and file_data.item_type == "movie": self._update_attributes(item, file, download_result) return True if item.type not in ("show", "season", "episode"): + return False - if item.type in ("show", "season", "episode"): - if not (file_data.season and file_data.episodes): - return False + if not (file_data.season and file_data.episodes): + return False show: Show = item if item.type == "show" else (item.parent if item.type == "season" else item.parent.parent) season: Season = next((season for season in show.seasons if season.number == file_data.season), None) + if not season: + return False + found = False for file_episode in file_data.episodes: episode: Episode = next((episode for episode in season.episodes if episode.number == file_episode), None) if episode and episode.state not in [States.Completed, States.Symlinked, States.Downloaded]: self._update_attributes(episode, file, download_result) found = True return foundsrc/program/services/downloaders/torbox.py (5)
3-3: Remove unused import
The `List` type hint from typing is imported but never used in the code.
-from typing import List, Optional, Union
+from typing import Optional, Union
🧰 Tools
🪛 Ruff (0.8.0)
3-3: `typing.List` imported but unused. Remove unused import: `typing.List` (F401)
24-36: Consider removing or implementing the commented enum
The `TBTorrentStatus` enum is commented out but might be useful for type safety when handling torrent statuses. Either implement it or remove it to reduce code noise. A rough sketch of the enum approach follows below.
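If the enum is kept, a minimal sketch could look like the following. The member names here are illustrative guesses, not confirmed TorBox API values:

from enum import Enum

class TBTorrentStatus(str, Enum):
    """Hypothetical TorBox torrent states, for illustration only."""
    DOWNLOADING = "downloading"
    COMPLETED = "completed"
    CACHED = "cached"
    FAILED = "failed"

# Comparing torrent_info.status against TBTorrentStatus.COMPLETED instead of a
# raw string catches typos at definition time rather than silently at runtime.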
55-55: Consider making the BASE_URL configurable
The API endpoint is hardcoded. Consider making it configurable through settings to handle potential endpoint changes or different environments.
-    BASE_URL = "https://api.torbox.app/v1/api"
+    BASE_URL = settings_manager.settings.downloaders.torbox.api_url or "https://api.torbox.app/v1/api"
170-172: Implement or document the stub method
The `select_files` method is currently a stub. Consider either implementing it or adding a docstring explaining why it's not implemented.
 def select_files(self, *args) -> None:
-    """Select files from a torrent"""
-    pass
+    """
+    Select files from a torrent.
+    Not implemented for TorBox as file selection is handled automatically.
+    """
+    logger.debug("File selection not required for TorBox")
192-192: Remove unnecessary f-string
The string doesn't contain any placeholders, so the f-string prefix is unnecessary.
-    self.api.request_handler.execute(HttpMethod.POST, f"torrents/controltorrent", json={"torrent_id": str(torrent_id), "operation": "delete"}, timeout=15)
+    self.api.request_handler.execute(HttpMethod.POST, "torrents/controltorrent", json={"torrent_id": str(torrent_id), "operation": "delete"}, timeout=15)
🪛 Ruff (0.8.0)
192-192: f-string without any placeholders
Remove extraneous
f
prefix(F541)
src/program/services/downloaders/realdebrid.py (3)
143-149: Enhance error handling specificity
The broad Exception catch block could mask specific issues. Consider catching and handling specific exceptions that might occur during torrent operations (e.g., `RealDebridError`, `ValueError`, etc.).
 try:
     torrent_id = self.add_torrent(infohash)
     container = self._process_torrent(torrent_id, infohash, item_type)
     if container:
         valid_container = container
-except Exception as e:
-    logger.error(f"Failed to get instant availability for {infohash}: {e}")
+except RealDebridError as e:
+    logger.error(f"Real-Debrid API error for {infohash}: {e}")
+except ValueError as e:
+    logger.error(f"Invalid data received for {infohash}: {e}")
+except Exception as e:
+    logger.error(f"Unexpected error getting availability for {infohash}: {e}")
Line range hint 194-206: Add input validation for infohash
The method should validate the infohash format before constructing the magnet URI to prevent potential API errors.
 def add_torrent(self, infohash: str) -> str:
     """Add a torrent by infohash"""
     try:
+        if not infohash or not isinstance(infohash, str) or not len(infohash) == 40:
+            raise ValueError(f"Invalid infohash format: {infohash}")
         magnet = f"magnet:?xt=urn:btih:{infohash}"
         response = self.api.request_handler.execute(
             HttpMethod.POST,
             "torrents/addMagnet",
             data={"magnet": magnet.lower()}
         )
         return response["id"]
+    except ValueError as e:
+        logger.error(str(e))
+        raise
     except Exception as e:
         logger.error(f"Failed to add torrent {infohash}: {e}")
         raise
242-248: Add response validation for delete operation
Verify the success of the delete operation by checking the response status.
 def delete_torrent(self, torrent_id: str) -> None:
     """Delete a torrent"""
     try:
-        self.api.request_handler.execute(HttpMethod.DELETE, f"torrents/delete/{torrent_id}")
+        if not torrent_id:
+            raise ValueError("Torrent ID is required")
+        response = self.api.request_handler.execute(
+            HttpMethod.DELETE,
+            f"torrents/delete/{torrent_id}"
+        )
+        # Real-Debrid returns 204 on successful deletion
+        if response != {}:
+            logger.warning(f"Unexpected response when deleting torrent {torrent_id}")
     except Exception as e:
         logger.error(f"Failed to delete torrent {torrent_id}: {e}")
         raise
249-251
: Enhance error message for invalid magnet URIsWhile the error handling is good, the error message could be more descriptive to help users understand why their magnet URI was rejected.
- raise HTTPException(status_code=400, detail="Invalid magnet URI") + raise HTTPException( + status_code=400, + detail="Invalid magnet URI: Could not extract valid info hash. Please ensure the magnet URI contains a valid BitTorrent info hash." + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (10)
- src/program/services/downloaders/__init__.py (2 hunks)
- src/program/services/downloaders/alldebrid.py (8 hunks)
- src/program/services/downloaders/models.py (1 hunks)
- src/program/services/downloaders/realdebrid.py (4 hunks)
- src/program/services/downloaders/shared.py (1 hunks)
- src/program/services/downloaders/torbox.py (1 hunks)
- src/program/services/libraries/symlink.py (1 hunks)
- src/program/state_transition.py (1 hunks)
- src/program/types.py (2 hunks)
- src/routers/secure/scrape.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/program/services/libraries/symlink.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/program/types.py
🧰 Additional context used
🪛 Ruff (0.8.0)
src/program/services/downloaders/alldebrid.py
3-3: `typing.Dict` imported but unused. Remove unused import (F401)
3-3: `typing.Iterator` imported but unused. Remove unused import (F401)
3-3: `typing.Tuple` imported but unused. Remove unused import (F401)
10-10: `program.services.downloaders.models.VIDEO_EXTENSIONS` imported but unused. Remove unused import: `program.services.downloaders.models.VIDEO_EXTENSIONS` (F401)
146-146: `return` inside `finally` blocks cause exceptions to be silenced (B012)
src/program/services/downloaders/models.py
4-4: `loguru.logger` imported but unused. Remove unused import: `loguru.logger` (F401)
52-52: Test for membership should be `not in`. Convert to `not in` (E713)
59-60: Use a single `if` statement instead of nested `if` statements (SIM102)
src/program/services/downloaders/shared.py
3-3: `typing.List` imported but unused. Remove unused import: `typing.List` (F401)
src/program/services/downloaders/torbox.py
3-3: `typing.List` imported but unused. Remove unused import: `typing.List` (F401)
192-192: f-string without any placeholders. Remove extraneous `f` prefix (F541)
🔇 Additional comments (12)
src/program/services/downloaders/shared.py (1)
28-38: LGTM! Well-structured method signature and documentation.
The updated signature with proper type hints and comprehensive docstring improves code clarity and maintainability.
src/program/services/downloaders/models.py (5)
9-18: LGTM! Well-structured video extensions handling
The implementation provides a good balance of flexibility and safety (see the sketch after this list):
- Defines allowed extensions
- Allows customization through settings
- Falls back to defaults if settings are invalid
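A minimal sketch of that validate-then-fall-back pattern (the constant name and settings plumbing are assumptions, not the file's exact code):

ALLOWED_VIDEO_EXTENSIONS = ["mp4", "mkv", "avi"]  # assumed default set

def resolve_video_extensions(configured: list[str] | None) -> list[str]:
    """Keep only configured extensions that are allowed; fall back to
    the defaults when the configured list is missing or entirely invalid."""
    if not configured:
        return ALLOWED_VIDEO_EXTENSIONS
    valid = [ext for ext in configured if ext in ALLOWED_VIDEO_EXTENSIONS]
    return valid or ALLOWED_VIDEO_EXTENSIONS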
20-33: LGTM! Robust file size constraints implementation
The implementation handles edge cases well (a rough sketch follows the list):
- Negative values are handled gracefully
- Uses infinity for unbounded maximums
- Clear separation between movie and episode constraints
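The edge-case handling described above can be sketched roughly like this (the helper and example values are illustrative, assuming sizes are configured in megabytes):

import math

def build_size_range(min_mb: int, max_mb: int) -> tuple[int, float]:
    """Clamp a negative minimum to zero and treat a non-positive maximum
    as unbounded, returning a (min_bytes, max_bytes) pair."""
    min_bytes = max(min_mb, 0) * 1_000_000
    max_bytes = math.inf if max_mb <= 0 else max_mb * 1_000_000
    return min_bytes, max_bytes

# Separate ranges for movies and episodes (example values, not the defaults).
MOVIE_SIZE_RANGE = build_size_range(700, -1)     # -1 means no upper bound
EPISODE_SIZE_RANGE = build_size_range(100, 2500)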
36-40: LGTM! Well-documented custom exceptions
Clear and purposeful exception classes with descriptive docstrings.
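For reference, the two exceptions referenced throughout this review can be sketched as plain Exception subclasses with docstrings (wording approximated from how they are raised elsewhere in the PR):

class NotCachedException(Exception):
    """Raised when a stream is not cached on the debrid service."""

class NoMatchingFilesException(Exception):
    """Raised when no files in a torrent match the requested item."""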
66-71: LGTM! Clean data model
Well-structured model with appropriate field types and defaults.
109-114: LGTM! Clear composite model
Well-structured model that cleanly combines TorrentContainer and TorrentInfo.
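A rough shape of that composite, inferred from how `DownloadedTorrent(id=..., info=..., infohash=..., container=...)` is constructed elsewhere in this PR (the stand-in models are simplified and their fields are assumptions):

from typing import Optional
from pydantic import BaseModel

class TorrentInfo(BaseModel):
    """Simplified stand-in for the per-torrent metadata model."""
    id: int
    name: str = ""
    status: str = ""

class TorrentContainer(BaseModel):
    """Simplified stand-in for the validated-files container."""
    infohash: str

class DownloadedTorrent(BaseModel):
    """Composite of a torrent's service metadata and its validated files."""
    id: int
    infohash: str
    info: TorrentInfo
    container: Optional[TorrentContainer] = None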
src/program/services/downloaders/torbox.py (1)
38-52: LGTM! Well-structured request handling
The implementation includes proper error handling with custom exceptions and appropriate status code checks.
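The pattern being praised, translating non-OK status codes into a service-specific exception, can be sketched as follows (`TorBoxError` is named elsewhere in this review; the response envelope fields are assumptions):

import requests

class TorBoxError(Exception):
    """Service-specific error for failed TorBox requests."""

def execute_request(session: requests.Session, url: str) -> dict:
    """Perform a GET and raise TorBoxError on HTTP or API-level failure."""
    response = session.get(url, timeout=15)
    if response.status_code != 200:
        raise TorBoxError(f"Request failed with status {response.status_code}")
    payload = response.json()
    if not payload.get("success", False):  # assumed response envelope
        raise TorBoxError(payload.get("detail", "Unknown API error"))
    return payload.get("data", {})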
src/routers/secure/scrape.py (2)
244-247: Previous review comment about enhancing info hash extraction is still valid
The current implementation remains basic and potentially fragile. The previous review comment suggesting improvements to validate magnet URIs and use more specific regex patterns is still applicable.
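For context, a stricter extraction along those lines might look like this sketch (not the router's current code):

import re
from typing import Optional

# Accept a 40-char hex (SHA-1) or 32-char base32 info hash in the magnet URI.
INFOHASH_PATTERN = re.compile(r"xt=urn:btih:([a-fA-F0-9]{40}|[a-zA-Z2-7]{32})")

def extract_infohash(magnet_uri: str) -> Optional[str]:
    """Return a normalized info hash, or None when the URI is invalid."""
    if not magnet_uri.startswith("magnet:?"):
        return None
    match = INFOHASH_PATTERN.search(magnet_uri)
    return match.group(1).lower() if match else None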
Line range hint 285-297
: Enhance error handling for downloader service interactions
Given the reported issues with Torbox API, the error handling for downloader service interactions should be more robust. Consider:
- Adding specific error handling for API failures
- Implementing retry mechanism for transient failures
- Providing more detailed error messages to help diagnose API issues
try:
torrent_id = downloader.add_torrent(info_hash)
torrent_info = downloader.get_torrent_info(torrent_id)
containers = downloader.get_instant_availability([session.magnet]).get(session.magnet, None)
session_manager.update_session(session.id, torrent_id=torrent_id, torrent_info=torrent_info, containers=containers)
except Exception as e:
+ logger.error(f"Failed to interact with downloader service: {str(e)}")
+ error_detail = "Failed to process torrent"
+ if "404" in str(e):
+ error_detail = "Torrent information not found. The API endpoint might be incorrect or the resource might not exist."
+ elif "timeout" in str(e).lower():
+ error_detail = "Request timed out. The service might be temporarily unavailable."
background_tasks.add_task(session_manager.abort_session, session.id)
- raise HTTPException(status_code=500, detail=str(e))
+ raise HTTPException(status_code=500, detail=error_detail)
Let's verify the API endpoints being used:
✅ Verification successful
Existing error handling is sufficient, but could benefit from error details propagation
The review comment's suggestion for enhanced error handling is partially valid. The downloader service already implements:
- Proper error logging for all API interactions
- Retry mechanism for availability checks (MAX_RETRIES and RETRY_DELAY)
- Specific error handling for timeouts
However, these detailed error messages aren't propagated to the API response. The suggested changes to propagate more specific error details to the client are valuable:
except Exception as e:
+ logger.error(f"Failed to interact with downloader service: {str(e)}")
+ error_detail = "Failed to process torrent"
+ if "404" in str(e):
+ error_detail = "Torrent information not found. The API endpoint might be incorrect or the resource might not exist."
+ elif "timeout" in str(e).lower():
+ error_detail = "Request timed out. The service might be temporarily unavailable."
background_tasks.add_task(session_manager.abort_session, session.id)
- raise HTTPException(status_code=500, detail=str(e))
+ raise HTTPException(status_code=500, detail=error_detail)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for API endpoint configurations and usage
rg -A 2 "api\.torbox\.app"
Length of output: 778
Script:
#!/bin/bash
# Let's examine the downloader service implementation to understand the error handling
ast-grep --pattern 'class TorboxDownloader {
$$$
add_torrent($_) {
$$$
}
$$$
}'
# Also check for any existing error handling patterns
rg -B2 -A4 "except.*Exception" src/program/services/downloaders/
Length of output: 11232
src/program/services/downloaders/alldebrid.py (3)
146-146: Avoid returning inside a `finally` block to prevent exception suppression
🧰 Tools
🪛 Ruff (0.8.0)
146-146: `return` inside `finally` blocks cause exceptions to be silenced (B012)
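To see why B012 matters, a minimal repro (not the project's code) shows how a `return` inside `finally` silently discards an in-flight exception:

def swallowed() -> str:
    try:
        raise RuntimeError("this error is lost")
    finally:
        return "ok"  # B012: the return overrides the exception

print(swallowed())  # prints "ok"; the RuntimeError never propagates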
206-206: Fix incorrect progress calculation in get_torrent_info
234-234: Ensure correct return type in get_files_and_links
@classmethod
def create(cls, filename: str, filesize_bytes: int, filetype: Literal["movie", "episode"], file_id: Optional[int] = None) -> Optional["DebridFile"]:
    """Factory method to validate and create a DebridFile"""
    if not any(filename.endswith(ext) for ext in VIDEO_EXTENSIONS) and not "sample" in filename.lower():
Fix the logical condition and membership test
The condition has incorrect logic and syntax. It should reject files that either don't have allowed extensions or contain 'sample'.
-if not any(filename.endswith(ext) for ext in VIDEO_EXTENSIONS) and not "sample" in filename.lower():
+if not any(filename.endswith(ext) for ext in VIDEO_EXTENSIONS) or "sample" in filename.lower():
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.0)
52-52: Test for membership should be `not in`. Convert to `not in` (E713)
     RealDebridDownloader: RealDebridDownloader(),
-    AllDebridDownloader: AllDebridDownloader(),
-    # TorBoxDownloader: TorBoxDownloader()
+    TorBoxDownloader: TorBoxDownloader(),
+    AllDebridDownloader: AllDebridDownloader()
 }
-self.service = next(
-    (service for service in self.services.values() if service.initialized), None
-)
+self.service = next((service for service in self.services.values() if service.initialized), None)
🛠️ Refactor suggestion
Consider using a more robust service selection strategy
The current service selection using `next()` on an unordered dictionary could lead to unpredictable behavior. Additionally, the initialization doesn't provide visibility into which service was selected.
Consider this improved implementation:
- self.services = {
- RealDebridDownloader: RealDebridDownloader(),
- TorBoxDownloader: TorBoxDownloader(),
- AllDebridDownloader: AllDebridDownloader()
- }
- self.service = next((service for service in self.services.values() if service.initialized), None)
+ self.services = [
+ RealDebridDownloader(),
+ TorBoxDownloader(),
+ AllDebridDownloader()
+ ]
+ self.service = None
+ for service in self.services:
+ if service.initialized:
+ self.service = service
+ logger.info(f"Using {service.__class__.__name__} as the downloader service")
+ break
def download_cached_stream(self, stream: Stream, container: TorrentContainer) -> DownloadedTorrent:
    """Download a cached stream"""
    torrent_id: int = self.add_torrent(stream.infohash)
    info: TorrentInfo = self.get_torrent_info(torrent_id)
    if container.file_ids:
        self.select_files(torrent_id, container.file_ids)
    return DownloadedTorrent(id=torrent_id, info=info, infohash=stream.infohash, container=container)
🛠️ Refactor suggestion
Add error handling for service method calls
The service method calls could fail if the service is not properly initialized or encounters network issues.
Consider adding error handling:
def download_cached_stream(self, stream: Stream, container: TorrentContainer) -> DownloadedTorrent:
"""Download a cached stream"""
+ if not self.service:
+ raise RuntimeError("No downloader service initialized")
+
+ try:
torrent_id: int = self.add_torrent(stream.infohash)
info: TorrentInfo = self.get_torrent_info(torrent_id)
if container.file_ids:
self.select_files(torrent_id, container.file_ids)
return DownloadedTorrent(id=torrent_id, info=info, infohash=stream.infohash, container=container)
+ except Exception as e:
+ logger.error(f"Failed to download stream {stream.infohash}: {e}")
+ if 'torrent_id' in locals():
+ self.delete_torrent(torrent_id)
+ raise
except Exception as e:
    logger.debug(f"Failed to get instant availability (attempt {attempt + 1}/{self.MAX_RETRIES}): {e}")
    if attempt < self.MAX_RETRIES - 1:
        time.sleep(self.RETRY_DELAY)
        continue
🛠️ Refactor suggestion
Use specific exception handling
The broad exception catch could mask important errors. Consider catching specific exceptions for better error handling.
- except Exception as e:
+ except TorBoxError as e:
+ logger.debug(f"TorBox API error (attempt {attempt + 1}/{self.MAX_RETRIES}): {e}")
+ if attempt < self.MAX_RETRIES - 1:
+ time.sleep(self.RETRY_DELAY)
+ continue
+ except (ValueError, KeyError) as e:
+ logger.debug(f"Response parsing error (attempt {attempt + 1}/{self.MAX_RETRIES}): {e}")
+ if attempt < self.MAX_RETRIES - 1:
+ time.sleep(self.RETRY_DELAY)
+ continue
Committable suggestion skipped: line range outside the PR's diff.
 def select_files(self, torrent_id: str, ids: List[int] = None) -> None:
     """Select files from a torrent"""
     try:
         selection = ",".join(str(file_id) for file_id in ids) if ids else "all"
         self.api.request_handler.execute(
             HttpMethod.POST,
             f"torrents/selectFiles/{torrent_id}",
-            data={"files": ",".join(files)}
+            data={"files": selection}
         )
         time.sleep(1)
🛠️ Refactor suggestion
Optimize sleep delay and add input validation
The fixed sleep delay could be replaced with a more dynamic approach, and the method should validate input parameters.
- def select_files(self, torrent_id: str, ids: List[int] = None) -> None:
+ def select_files(self, torrent_id: str, ids: Optional[List[int]] = None) -> None:
"""Select files from a torrent"""
try:
+ if not torrent_id:
+ raise ValueError("Torrent ID is required")
+ if ids is not None and not all(isinstance(id, int) for id in ids):
+ raise ValueError("File IDs must be integers")
selection = ",".join(str(file_id) for file_id in ids) if ids else "all"
self.api.request_handler.execute(
HttpMethod.POST,
f"torrents/selectFiles/{torrent_id}",
data={"files": selection}
)
- time.sleep(1)
+ # Use rate limit parameters from the API configuration
+ time.sleep(self.api.session.rate_limiter.delay)
except Exception as e:
logger.error(f"Failed to select files for torrent {torrent_id}: {e}")
raise
Committable suggestion skipped: line range outside the PR's diff.
+    def _process_torrent(self, torrent_id: str, infohash: str, item_type: str) -> Optional[TorrentContainer]:
+        """Process a single torrent and return a TorrentContainer if valid."""
+        torrent_info = self.get_torrent_info(torrent_id)
+
+        if torrent_info.status == "waiting_files_selection":
+            video_file_ids = [
+                file_id for file_id, file_info in torrent_info.files.items()
+                if file_info["filename"].endswith(tuple(ext.lower() for ext in VIDEO_EXTENSIONS))
+            ]
+            if not video_file_ids:
+                logger.debug(f"No video files found in torrent {torrent_id} with infohash {infohash}")
+                return None
+            self.select_files(torrent_id, video_file_ids)
+            torrent_info = self.get_torrent_info(torrent_id)
+
+        if torrent_info.status != "downloaded":
+            logger.debug(f"Torrent {torrent_id} with infohash {infohash} is not cached")
+            return None
+
+        if not torrent_info.files:
+            return None
+
+        torrent_files = [
+            file for file in (
+                DebridFile.create(
+                    file_info["filename"],
+                    file_info["bytes"],
+                    item_type,
+                    file_id
+                )
+                for file_id, file_info in torrent_info.files.items()
+            ) if file is not None
+        ]
+
+        return TorrentContainer(infohash=infohash, files=torrent_files) if torrent_files else None
-        # Return early if response is not a dict
-        if not isinstance(response, dict):
-            return {}
-
-        # Check for empty response
-        if all(isinstance(data, list) for data in response.values()):
-            logger.debug(f"Empty response received (attempt {attempt + 1}/{self.MAX_RETRIES})")
-            time.sleep(self.RETRY_DELAY)
-            continue
-
-        return {
-            infohash: self._filter_valid_containers(data.get("rd", []))
-            for infohash, data in response.items()
-            if isinstance(data, dict) and "rd" in data
-        }
-
-        except Exception as e:
-            logger.debug(f"Failed to get instant availability (attempt {attempt + 1}/{self.MAX_RETRIES}): {e}")
-            if attempt < self.MAX_RETRIES - 1:
-                time.sleep(self.RETRY_DELAY)
-                continue
-
-        logger.debug("All retry attempts failed for instant availability")
-        return {}
-
-    def _filter_valid_containers(self, containers: List[dict]) -> List[dict]:
-        """Filter and sort valid video containers"""
-        valid_containers = [
-            container for container in containers
-            if self._contains_valid_video_files(container)
-        ]
-        return sorted(valid_containers, key=len, reverse=True)
-
-    def _contains_valid_video_files(self, container: dict) -> bool:
-        """Check if container has valid video files"""
-        return all(
-            any(
-                file["filename"].endswith(ext) and "sample" not in file["filename"].lower()
-                for ext in VIDEO_EXTENSIONS
-            )
-            for file in container.values()
-        )
-
-    def add_torrent(self, infohash: str) -> str:
-        """
-        Add a torrent by infohash
-        Required by DownloaderBase
-        """
-        if not self.initialized:
-            raise RealDebridError("Downloader not properly initialized")
🛠️ Refactor suggestion
Improve method structure and status validation
The method could benefit from:
- Breaking down the video file filtering logic into a separate method
- Using an enum for status comparison instead of string literals
- Adding a timeout for the status check
+ def _find_video_files(self, torrent_info: TorrentInfo) -> List[str]:
+ """Extract video file IDs from torrent info."""
+ return [
+ file_id for file_id, file_info in torrent_info.files.items()
+ if file_info["filename"].endswith(tuple(ext.lower() for ext in VIDEO_EXTENSIONS))
+ ]
def _process_torrent(self, torrent_id: str, infohash: str, item_type: str) -> Optional[TorrentContainer]:
"""Process a single torrent and return a TorrentContainer if valid."""
torrent_info = self.get_torrent_info(torrent_id)
- if torrent_info.status == "waiting_files_selection":
+ if torrent_info.status == RDTorrentStatus.WAITING_FILES:
- video_file_ids = [
- file_id for file_id, file_info in torrent_info.files.items()
- if file_info["filename"].endswith(tuple(ext.lower() for ext in VIDEO_EXTENSIONS))
- ]
+ video_file_ids = self._find_video_files(torrent_info)
if not video_file_ids:
logger.debug(f"No video files found in torrent {torrent_id} with infohash {infohash}")
return None
self.select_files(torrent_id, video_file_ids)
+
+ # Add timeout for status check
+ max_attempts = 3
+ attempt = 0
+ while attempt < max_attempts:
+ torrent_info = self.get_torrent_info(torrent_id)
+ if torrent_info.status == RDTorrentStatus.DOWNLOADED:
+ break
+ time.sleep(1)
+ attempt += 1
- torrent_info = self.get_torrent_info(torrent_id)
- if torrent_info.status != "downloaded":
+ if torrent_info.status != RDTorrentStatus.DOWNLOADED:
logger.debug(f"Torrent {torrent_id} with infohash {infohash} is not cached")
return None
Committable suggestion skipped: line range outside the PR's diff.
from datetime import datetime
from typing import Dict, Iterator, List, Optional, Tuple

from loguru import logger
from requests import Session
from requests.exceptions import ConnectTimeout

from program.services.downloaders.models import (
    VIDEO_EXTENSIONS,
🛠️ Refactor suggestion
Remove unused import VIDEO_EXTENSIONS
The `VIDEO_EXTENSIONS` import from `program.services.downloaders.models` is not used in this file. Removing it maintains a clean and maintainable codebase.
Apply this diff to remove the unused import:
from program.services.downloaders.models import (
- VIDEO_EXTENSIONS,
DebridFile,
TorrentContainer,
TorrentInfo,
)
🧰 Tools
🪛 Ruff (0.8.0)
10-10: `program.services.downloaders.models.VIDEO_EXTENSIONS` imported but unused. Remove unused import: `program.services.downloaders.models.VIDEO_EXTENSIONS` (F401)
@@ -1,10 +1,17 @@
 import time
 from datetime import datetime
 from typing import Dict, Iterator, List, Optional, Tuple
🛠️ Refactor suggestion
Remove unused imports from the `typing` module
The imports `Dict`, `Iterator`, and `Tuple` from the `typing` module are not used in this file. Removing unused imports helps keep the code clean and reduces clutter.
Apply this diff to remove the unused imports:
-from typing import Dict, Iterator, List, Optional, Tuple
+from typing import List, Optional
🧰 Tools
🪛 Ruff (0.8.0)
3-3: `typing.Dict` imported but unused. Remove unused import (F401)
3-3: `typing.Iterator` imported but unused. Remove unused import (F401)
3-3: `typing.Tuple` imported but unused. Remove unused import (F401)
Pull Request Check List
Resolves: #issue-number-here
Description:
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Refactor
Chores