
Adding logging for Hydrography scripts #31

Open · wants to merge 8 commits into main

Conversation
@varun-andhra-mapup (Contributor) commented Jul 30, 2024

PR Type

enhancement, other


Description

  • Added logger parameter to multiple functions across various scripts to replace print statements with logging.
  • Initialized logging configuration in run-hydrography-pipeline.py.
  • Enhanced logging for better traceability and debugging.
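
The pattern described above can be sketched as follows; the signature is illustrative (the repository's actual filter_ways takes different arguments), showing only how a passed-in logger replaces print:

```python
import io
import logging

# Sketch of the PR's pattern: functions accept a `logger` parameter and call
# logger.info(...) where they previously used print(). The signature below is
# illustrative, not the repository's actual filter_ways signature.
def filter_ways(input_fp: str, logger: logging.Logger) -> None:
    logger.info(f"Filtering ways from {input_fp}")

# Wire a logger to an in-memory buffer so the output can be inspected here;
# run-hydrography-pipeline.py would use logging.basicConfig with a file instead.
buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
logger = logging.getLogger("sketch")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

filter_ways("filtered-ways.osm.pbf", logger)
print(buf.getvalue().strip())  # → [INFO] Filtering ways from filtered-ways.osm.pbf
```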

Changes walkthrough 📝

Relevant files (Enhancement)

determine_final_osm_id.py: Add logging to association functions
hydrography-approach/processing_scripts/associate_data/determine_final_osm_id.py
  • Added logger parameter to multiple functions.
  • Replaced print statements with logger.info for logging.
  • +12/-11

exclude_nearby_bridges.py: Add logging to exclude nearby bridges
hydrography-approach/processing_scripts/associate_data/exclude_nearby_bridges.py
  • Added logger parameter to functions.
  • Replaced print statements with logger.info.
  • +6/-7

join_all_data.py: Add logging to join all data script
hydrography-approach/processing_scripts/associate_data/join_all_data.py
  • Added logger parameter to process_all_join function.
  • Replaced print statements with logger.info.
  • +2/-2

filter_osm_ways.py: Add logging to filter OSM ways
hydrography-approach/processing_scripts/filter_data/filter_osm_ways.py
  • Added logger parameter to filter_ways function.
  • Replaced print statements with logger.info.
  • +2/-2

process_filter_nbi_bridges.py: Add logging to filter NBI bridges
hydrography-approach/processing_scripts/filter_data/process_filter_nbi_bridges.py
  • Added logger parameter to functions.
  • Replaced print statements with logger.info.
  • +4/-4

tag_nbi_and_osm_data.py: Add logging to tag NBI and OSM data
hydrography-approach/processing_scripts/tag_data/tag_nbi_and_osm_data.py
  • Added logger parameter to multiple functions.
  • Replaced print statements with logger.info and logger.error.
  • +31/-20

run-hydrography-pipeline.py: Add logging to hydrography pipeline script
hydrography-approach/run-hydrography-pipeline.py
  • Initialized logging configuration.
  • Replaced print statements with logging.info.
  • Passed logger to various processing functions.
  • +27/-15

    💡 PR-Agent usage: Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions.

    github-actions bot commented Jul 30, 2024

    PR Reviewer Guide 🔍

    (Review updated until commit c221f25)

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Key issues to review

    Error Handling
    The function load_layers uses sys.exit(1) for error handling, which may not be the best approach in a larger application context. Consider raising exceptions that can be caught and handled at a higher level to give more control over the application's flow.

    Exception Handling
    The main function catches general exceptions and logs them, which is good. However, handling specific exceptions more granularly would provide clearer error messages and recovery options.
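
A minimal sketch of what that more granular handling could look like; run_with_config and missing_key are hypothetical stand-ins, not functions from this repository:

```python
import logging

logger = logging.getLogger("hydrography_pipeline")

def run_with_config(load_config):
    """Handle specific, recoverable failures before the generic catch-all.

    `load_config` is a hypothetical callable standing in for the pipeline's
    config-loading step.
    """
    try:
        return load_config()
    except FileNotFoundError as e:
        logger.error(f"Configuration file missing: {e}")
        raise
    except KeyError as e:
        logger.error(f"Configuration key missing: {e}")
        raise
    except Exception:
        logger.exception("Unexpected failure while loading configuration")
        raise

def missing_key():
    raise KeyError("state_name")

caught = None
try:
    run_with_config(missing_key)
except KeyError as e:
    caught = e
print(type(caught).__name__)  # → KeyError
```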

    github-actions bot commented Jul 30, 2024

    PR Code Suggestions ✨

    Enhancement
    Add error handling for CSV writing operations to improve robustness

    Consider adding error handling for the CSV writing operation to ensure the program
    handles potential I/O errors gracefully.

    hydrography-approach/processing_scripts/associate_data/determine_final_osm_id.py [202-203]

    -df.to_csv(intermediate_association)
    -logger.info(f"{intermediate_association} file has been created successfully!")
    +try:
    +    df.to_csv(intermediate_association)
    +    logger.info(f"{intermediate_association} file has been created successfully!")
    +except IOError as e:
    +    logger.error(f"Failed to write {intermediate_association}: {str(e)}")
     
    Suggestion importance[1-10]: 9

    Why: Adding error handling for I/O operations is crucial for robustness, especially when dealing with file operations that can fail due to various reasons such as permission issues or disk space limitations.

    Add exception handling for robust error management

    For better error handling and debugging, consider adding exception handling around
    critical operations such as file operations and data processing.

    hydrography-approach/run-hydrography-pipeline.py [48]

    -config = load_config(state_name)
    +try:
    +    config = load_config(state_name)
    +except Exception as e:
    +    logger.error(f"Failed to load configuration: {e}")
    +    raise
     
    Suggestion importance[1-10]: 9

    Why: Adding exception handling around critical operations improves the robustness and debuggability of the code, making it easier to identify and handle errors.

    Use a more descriptive logger name

    Consider using a more specific logger name instead of __name__ to help differentiate
    logs from different parts of the application more easily.

    hydrography-approach/run-hydrography-pipeline.py [23]

    -logger = logging.getLogger(__name__)
    +logger = logging.getLogger("hydrography_pipeline")
     
    Suggestion importance[1-10]: 7

    Why: Using a more descriptive logger name can help differentiate logs from different parts of the application, improving maintainability and debugging.

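
As background for this suggestion: Python's dotted logger names form a hierarchy, so a descriptive name also allows per-subsystem configuration; the names below are illustrative, not the repository's:

```python
import logging

# Dotted logger names form a hierarchy: children inherit level and handlers
# from their parent unless they override them. Names here are illustrative.
parent = logging.getLogger("hydrography")
pipeline = logging.getLogger("hydrography.pipeline")
filters = logging.getLogger("hydrography.filter_data")

parent.setLevel(logging.WARNING)
pipeline.setLevel(logging.DEBUG)  # override for one subsystem only

print(pipeline.getEffectiveLevel() == logging.DEBUG)   # → True
print(filters.getEffectiveLevel() == logging.WARNING)  # → True (inherited)
```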
    Best practice
    Use exceptions instead of sys.exit for error handling

    Replace the direct sys.exit calls with exceptions to allow for better error handling
    and testing.

    hydrography-approach/processing_scripts/associate_data/determine_final_osm_id.py [208-209]

     if not nbi_points_gl.isValid():
         logger.error("NBI points layer failed to load!")
    -    sys.exit(1)
    +    raise Exception("Failed to load NBI points layer")
     
    Suggestion importance[1-10]: 8

    Why: Replacing sys.exit with exceptions allows for better error handling and makes the code more testable. This change improves the overall robustness and maintainability of the code.

    Avoid hardcoding file paths in the code

    It is recommended to avoid hardcoding file paths in the logging configuration.
    Consider using a configuration file or environment variable to manage the log file
    path.

    hydrography-approach/run-hydrography-pipeline.py [19]

    -filename="hydrography-pipeline.log",
    +filename=os.getenv("LOG_FILE_PATH", "hydrography-pipeline.log"),
     
    Suggestion importance[1-10]: 8

    Why: Using environment variables for file paths enhances flexibility and security, making the code more adaptable to different environments.

    Use structured logging for consistent log format

    Ensure consistent logging format by using structured logging for better traceability
    and analysis.

    hydrography-approach/processing_scripts/associate_data/determine_final_osm_id.py [203]

    -logger.info(f"{intermediate_association} file has been created successfully!")
    +logger.info("File creation successful", extra={"file_name": intermediate_association})
     
    Suggestion importance[1-10]: 7

    Why: Structured logging improves log readability and traceability, which is beneficial for debugging and monitoring. However, it is a minor enhancement compared to error handling.

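
One caveat worth noting for this suggestion: a field passed via extra only appears in output if the formatter references it. A small self-contained sketch (logger name and file name are illustrative):

```python
import io
import logging

# The formatter must reference the custom field for `extra` to show up.
buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s file=%(file_name)s"))

log = logging.getLogger("structured_demo")
log.setLevel(logging.INFO)
log.addHandler(handler)

# The file name here is illustrative; in the suggestion it is the
# intermediate_association path variable.
log.info("File creation successful", extra={"file_name": "intermediate_association.csv"})
print(buf.getvalue().strip())
# → INFO File creation successful file=intermediate_association.csv
```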
    Ensure proper file handling by using a context manager

    Use a context manager to ensure the CSV file is closed properly after operations,
    which is especially important in a loop or when handling large data.

    hydrography-approach/processing_scripts/associate_data/determine_final_osm_id.py [202]

    -df.to_csv(intermediate_association)
    +with open(intermediate_association, 'w', newline='') as file:
    +    df.to_csv(file)
     
    Suggestion importance[1-10]: 6

    Why: Using a context manager ensures that the file is properly closed after the operation, which is good practice. However, in the context of using pandas' to_csv method, this is less critical as pandas handles file opening and closing internally.

    Possible issue
    Ensure directories exist before use

    To ensure that directories are created before they are used, consider checking and
    creating necessary directories at the start of the script or function.

    hydrography-approach/run-hydrography-pipeline.py [52-54]

    -os.makedirs(
    -    config["output_data_folders"]["state_folder"],
    -    exist_ok=True,
    +if not os.path.exists(config["output_data_folders"]["state_folder"]):
    +    os.makedirs(config["output_data_folders"]["state_folder"])
     
    Suggestion importance[1-10]: 3

    Why: The existing code already ensures the directory exists using exist_ok=True, making this suggestion redundant and less impactful.


    @varun-andhra-mapup (Contributor, Author)

    Fixed logging configuration

    Updated the logging configuration to use a path specified in the config.yml file, replacing the hardcoded path. This enhances flexibility and ensures consistent logging across different environments.

    Commit: e05fbc2

    Code Improvement

    def main() -> None:
        ...
        # Configure logging after loading config
        log_file_path = config["logging"].get("log_file_path", "hydrography-pipeline.log")
        logging.basicConfig(
            filename=log_file_path,  # Use the path from the configuration
            level=logging.INFO,
            format="%(asctime)s - [%(levelname)s] - (%(filename)s).%(funcName)s - %(message)s",
        )
        ...
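
Assuming the lookup above, config.yml would need a matching section; this fragment is a hypothetical illustration of that shape, not the repository's actual file:

```yaml
# Hypothetical config.yml fragment matching config["logging"]["log_file_path"]
logging:
  log_file_path: logs/hydrography-pipeline.log
```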

    @varun-andhra-mapup (Contributor, Author) commented Aug 1, 2024

    Error Handling for Layer Loading

    Improved error handling for layer loading by adding detailed logging and maintaining process integrity. The function now logs specific errors if layers fail to load and exits immediately, as retries are not practical given the likely file path issues. Continuing without these layers is not possible since subsequent processes depend on their successful loading.

    Commit: 16da213

    Code Improvement

    import sys
    from typing import Optional, Tuple

    from qgis.core import QgsVectorLayer
    
    def load_layers(nbi_points_fp: str, osm_fp: str, logger) -> Tuple[Optional[QgsVectorLayer], Optional[QgsVectorLayer]]:
        """
        Load required layers with improved error handling.
    
        Args:
            nbi_points_fp (str): File path to the NBI points layer.
            osm_fp (str): File path to the OSM ways layer.
            logger: Logger instance for logging errors.
    
        Returns:
            Tuple[Optional[QgsVectorLayer], Optional[QgsVectorLayer]]:
                A tuple containing the NBI points layer and the OSM ways layer.
        """
        def load_layer(fp: str, layer_name: str) -> Optional[QgsVectorLayer]:
            """
            Load a layer and log errors if the loading fails.
    
            Args:
                fp (str): File path to the layer.
                layer_name (str): Name of the layer for logging purposes.
    
            Returns:
                Optional[QgsVectorLayer]: Loaded layer or None if failed.
            """
            layer = QgsVectorLayer(fp, layer_name, "ogr")
            if not layer.isValid():
                logger.error(f"{layer_name} layer failed to load. Check the file path and ensure the file exists.")
                return None
            return layer
    
        nbi_points_gl = load_layer(nbi_points_fp, "nbi-points")
        osm_gl = load_layer(osm_fp, "filtered")
    
        if nbi_points_gl is None:
            logger.error("NBI points layer is critical and could not be loaded. Exiting.")
            sys.exit(1)
    
        if osm_gl is None:
            logger.error("OSM ways layer could not be loaded. Exiting.")
            sys.exit(1)
    
        return nbi_points_gl, osm_gl

    @varun-andhra-mapup (Contributor, Author) commented Aug 1, 2024

    Improved Error Handling for CSV Writing

    Enhanced the robustness of the CSV file creation process by adding error handling for potential I/O errors. Previously, if there was an issue writing the CSV file, the application would fail silently without providing any indication of the failure.

    Commit: 16da213

    Code Improvement

    • Added a try-except block around the df.to_csv method to catch and log IOError exceptions.

    try:
        df.to_csv(intermediate_association)
        logger.info(f"{intermediate_association} file has been created successfully!")
    except IOError as e:
        logger.error(f"Failed to write {intermediate_association}: {str(e)}")

    @varun-andhra-mapup (Contributor, Author) commented Aug 1, 2024

    Ensure directory existence before use

    This change adds a check to see if directories already exist before attempting to create them. It also includes logging for both successful directory creation and pre-existing directories.

    Commit: c221f25

    Code Improvement

    • Implemented os.path.exists() to check directory existence.
    • Added logging for directory creation and pre-existing directories.
    • Improved error handling with try-except blocks.

    import logging
    import os
    from typing import Dict

    # Module-level logger assumed by this snippet
    logger = logging.getLogger(__name__)


    def create_directories(config: Dict[str, str]) -> None:
        """
        Create directories for output data as specified in the configuration.
        
        :param config: Dictionary containing paths for directories to be created.
        """
        output_folders = config.get("output_data_folders", {})
    
        # Define required folder keys
        required_folders = ["state_folder", "csv_files", "gpkg_files", "pbf_files"]
    
        for folder in required_folders:
            folder_path = output_folders.get(folder)
            if folder_path:
                if not os.path.exists(folder_path):
                    try:
                        os.makedirs(folder_path)
                        logger.info(f"Directory created: {folder_path}")
                    except Exception as e:
                        logger.error(f"Failed to create directory {folder_path}: {e}")
                else:
                    logger.info(f"Directory already exists: {folder_path}")
            else:
                logger.warning(f"Path for {folder} not specified in configuration.")

    This ensures directories are only created if they do not already exist and provides better feedback during execution.

    @varun-andhra-mapup (Contributor, Author)

    /review

    github-actions bot commented Aug 1, 2024

    Persistent review updated to latest commit c221f25

    @varun-andhra-mapup (Contributor, Author)

    Improved Error Handling in load_layers

    The load_layers function now incorporates a custom LayerLoadError exception to handle errors more gracefully. This change replaces the use of sys.exit(1) with exceptions that can be caught and managed at a higher level, offering better control over the application flow.

    Commit: d8fc364

    Code Improvement

    import logging
    from typing import Optional, Tuple

    from qgis.core import QgsVectorLayer


    class LayerLoadError(Exception):
        """Custom exception for errors in loading vector layers."""
    
        def __init__(self, layer_name: str, message: str):
            super().__init__(f"{layer_name} layer: {message}")
            self.layer_name = layer_name
    
    
    def load_layers(
        nbi_points_fp: str, osm_fp: str, logger: logging.Logger
    ) -> Tuple[Optional[QgsVectorLayer], Optional[QgsVectorLayer]]:
        """
        Load required layers with improved error handling.

        Returns:
            Tuple[Optional[QgsVectorLayer], Optional[QgsVectorLayer]]:
                A tuple containing the NBI points layer and the OSM ways layer.
        """
    
        def load_layer(fp: str, layer_name: str) -> Optional[QgsVectorLayer]:
            """
            Load a layer and log errors if the loading fails.
            """
            layer = QgsVectorLayer(fp, layer_name, "ogr")
            if not layer.isValid():
                logger.error(
                    f"{layer_name} layer failed to load. Check the file path and ensure the file exists."
                )
                raise LayerLoadError(layer_name, "Failed to load layer.")
            return layer
    
        try:
            nbi_points_gl = load_layer(nbi_points_fp, "nbi-points")
            osm_gl = load_layer(osm_fp, "filtered")
        except LayerLoadError as e:
            logger.error(e)
            raise  # Re-raise the exception to be handled at a higher level

        return nbi_points_gl, osm_gl
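
With sys.exit removed, the caller decides what a failure means. A sketch of that caller-side handling, using a stand-in copy of LayerLoadError so the control flow runs without QGIS; run_association_step is hypothetical:

```python
class LayerLoadError(Exception):
    """Stand-in matching the custom exception above (no QGIS required here)."""
    def __init__(self, layer_name: str, message: str):
        super().__init__(f"{layer_name} layer: {message}")
        self.layer_name = layer_name

def run_association_step(load_layers):
    """Hypothetical caller: turn a layer failure into a clean stage abort."""
    try:
        return load_layers()
    except LayerLoadError as e:
        # The caller, not the loading code, decides: skip, retry, or abort.
        print(f"Skipping association step: {e}")
        return None

def failing_loader():
    raise LayerLoadError("nbi-points", "Failed to load layer.")

print(run_association_step(failing_loader))           # → None (after the log message)
print(run_association_step(lambda: "layers-loaded"))  # → layers-loaded
```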
