LCFS - Migrate Schedule C Records from TFRS to Other Uses in LCFS with Version Chaining #1556

Open
21 tasks
AlexZorkin opened this issue Dec 21, 2024 · 0 comments
Labels
Medium (Medium priority), Task (Work that does not directly impact the user)

Comments

@AlexZorkin
Collaborator

Description:
Develop a Groovy ETL script to migrate Schedule C records from TFRS to the other_uses table in LCFS. Each TFRS compliance report includes a full set of Schedule C records, and supplemental reports may contain modified versions of those records. The script will process compliance reports, identify supplemental chains, and compare Schedule C records between consecutive reports in a chain to build accurate version chains in LCFS using group_uuid and an incrementing version.

Key tasks include:

  • Loop through compliance reports in TFRS and extract Schedule C records.
  • Identify supplemental chains and compare Schedule C records for changes (added, updated, or removed).
  • Map TFRS fields to LCFS fields, including handling expected_use_id, rationale, and quantity_supplied.
  • Populate the other_uses table in LCFS, linking records to their respective compliance reports using the legacy_id field.
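
The comparison step in the tasks above could be sketched as follows. This is a minimal illustration in Python, written for brevity; the Groovy script could follow the same logic. The issue does not say how a record is matched between an original report and its supplemental, so the key used here (`fuel_type_id`, `fuel_class_id`, `expected_use_id`) is an assumption, as are the names `diff_schedule_c`, `KEY_FIELDS`, and `VALUE_FIELDS`.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape: a plain dict keyed by the TFRS column names in this issue.
Record = Dict[str, object]

# ASSUMPTION: these fields identify "the same" Schedule C record across reports.
KEY_FIELDS = ("fuel_type_id", "fuel_class_id", "expected_use_id")
# ASSUMPTION: a difference in any of these fields counts as an update.
VALUE_FIELDS = ("quantity", "rationale", "provision_of_the_act_id")


def record_key(rec: Record) -> Tuple:
    """Build the matching key for one Schedule C record."""
    return tuple(rec.get(f) for f in KEY_FIELDS)


def diff_schedule_c(previous: List[Record], current: List[Record]) -> Dict[str, List[Record]]:
    """Classify records in `current` relative to `previous`."""
    prev_by_key = {record_key(r): r for r in previous}
    curr_by_key = {record_key(r): r for r in current}

    added = [r for k, r in curr_by_key.items() if k not in prev_by_key]
    removed = [r for k, r in prev_by_key.items() if k not in curr_by_key]
    updated = [
        r
        for k, r in curr_by_key.items()
        if k in prev_by_key
        and any(r.get(f) != prev_by_key[k].get(f) for f in VALUE_FIELDS)
    ]
    return {"added": added, "updated": updated, "removed": removed}
```

If two records in one report can share the same key, the dictionaries above would silently keep only the last one, so the real key may need an extra discriminator.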

Purpose and Benefit to User:
Accurately migrates "other uses" data, preserving versioning and tracking changes across supplemental compliance reports. This ensures a complete and auditable record of non-credit-earning fuel uses, aligned with compliance requirements in LCFS.

Acceptance Criteria:

  • Given I am a developer, when the ETL process runs, all Schedule C records from TFRS are migrated to LCFS with correct associations to their compliance reports using legacy_id.
  • Given I am a developer, when the process runs, other_uses records are grouped by group_uuid and incremental versions are assigned for supplemental reports.
  • Given I am a developer, when the process runs, changes between supplemental reports are accurately identified and reflected in the version chains.
  • Given I am a developer, when the process runs, all required fields (quantity_supplied, rationale, expected_use_id, etc.) are correctly mapped to LCFS.
  • Given I am running the ETL process, when an error occurs while processing one compliance report, the error is logged and processing continues for the remaining compliance reports.
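
The last criterion (log the error, keep going) might look like the sketch below. It is in Python for illustration only; `process_reports`, `migrate_one`, and the logger name are hypothetical, and the Groovy script would use its own logging facility.

```python
import logging
from typing import Callable, Iterable, List, Tuple

logger = logging.getLogger("schedule_c_etl")  # hypothetical logger name


def process_reports(
    report_ids: Iterable[int],
    migrate_one: Callable[[int], int],
) -> Tuple[int, List[int]]:
    """Run `migrate_one` per TFRS report; a failure on one report never stops the rest.

    `migrate_one` is a hypothetical callable that migrates one report's
    Schedule C records and returns the number of rows it wrote.
    """
    migrated = 0
    failed: List[int] = []
    for report_id in report_ids:
        try:
            migrated += migrate_one(report_id)
        except Exception:
            # Log with traceback, remember the report, and move on.
            logger.exception("Schedule C migration failed for TFRS report %s", report_id)
            failed.append(report_id)
    return migrated, failed
```

Returning the list of failed report IDs makes it easy to print a summary at the end of the run and to retry only those reports.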

Development Checklist:

  1. Data Extraction:

    • Query TFRS compliance reports to extract Schedule C records grouped by compliance report ID.
    • Identify supplemental chains for each compliance report.
  2. Mapping and Comparison:

    • Compare Schedule C records within each chain to detect changes (e.g., added, modified, or removed records).
    • Map TFRS fields to LCFS other_uses fields:
      • quantity → quantity_supplied
      • expected_use_id → expected_use_id
      • rationale → rationale
      • fuel_type_id → fuel_type_id
      • fuel_class_id → fuel_category_id
      • provision_of_the_act_id → provision_of_the_act_id
  3. Version Chaining:

    • Assign a group_uuid to all records within a compliance report chain.
    • Increment version for each supplemental report in the chain.
    • Set action_type based on the type of change (e.g., CREATE, UPDATE, or DELETE).
  4. Data Insertion:

    • Insert mapped records into the other_uses table with the correct version and compliance report associations via legacy_id.
    • Ensure other_uses records link to the correct compliance report in LCFS.
  5. Error Handling and Logging:

    • Implement robust error handling to log issues without interrupting the ETL process for other records.
    • Log comparisons and detected changes between supplemental reports.
  6. Testing and Validation:

    • Verify that all records are migrated accurately, including version chaining.
    • Validate associations with compliance reports in LCFS.
    • Test edge cases, such as compliance reports with multiple supplemental versions or significant changes in Schedule C records.
  7. Documentation:

    • Document the script’s purpose, logic, and field mappings for future reference and maintenance.
    • Provide detailed guidance for running and verifying the ETL process.
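
Steps 2 and 3 of the checklist (field mapping and version chaining) can be sketched together. This is a Python illustration under several stated assumptions, not the intended implementation: records are matched by the hypothetical `KEY_FIELDS`, each logical record gets its own `group_uuid` shared across the chain, `version` starts at 0 for the original report, and unchanged records produce no new row. The sketch only renames columns; it does not translate ID values between the two systems.

```python
import uuid
from typing import Dict, List, Tuple

# TFRS column -> LCFS other_uses column, as listed in the checklist.
FIELD_MAP = {
    "quantity": "quantity_supplied",
    "expected_use_id": "expected_use_id",
    "rationale": "rationale",
    "fuel_type_id": "fuel_type_id",
    "fuel_class_id": "fuel_category_id",
    "provision_of_the_act_id": "provision_of_the_act_id",
}
# ASSUMPTION: fields that identify the same record across reports in a chain.
KEY_FIELDS = ("fuel_type_id", "fuel_class_id", "expected_use_id")


def map_fields(tfrs_record: Dict[str, object]) -> Dict[str, object]:
    """Rename TFRS columns to their LCFS other_uses names."""
    return {lcfs: tfrs_record.get(tfrs) for tfrs, lcfs in FIELD_MAP.items()}


def build_version_chain(chain: List[Tuple[int, List[Dict[str, object]]]]) -> List[Dict[str, object]]:
    """`chain` is [(tfrs_report_id, schedule_c_records), ...], original report first."""
    group_uuids: Dict[Tuple, str] = {}
    previous: Dict[Tuple, Dict[str, object]] = {}
    rows: List[Dict[str, object]] = []

    for version, (report_id, records) in enumerate(chain):
        current = {tuple(r.get(f) for f in KEY_FIELDS): map_fields(r) for r in records}
        meta = {"version": version, "tfrs_report_id": report_id}

        for key, mapped in current.items():
            if key not in previous:
                action = "CREATE"
            elif mapped != previous[key]:
                action = "UPDATE"
            else:
                continue  # unchanged since the previous report: no new row
            group_uuids.setdefault(key, str(uuid.uuid4()))
            rows.append({**mapped, **meta, "group_uuid": group_uuids[key], "action_type": action})

        for key, mapped in previous.items():
            if key not in current:
                rows.append({**mapped, **meta, "group_uuid": group_uuids[key], "action_type": "DELETE"})

        previous = current
    return rows
```

The `tfrs_report_id` on each row is a placeholder to be resolved to the LCFS compliance report through `legacy_id` before insertion.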

Notes:

  • The legacy_id field in LCFS compliance reports will be used to match the TFRS compliance report ID for associations.
  • Ensure expected_use_id is correctly mapped to LCFS expected use types.
  • Leverage existing ETL scripts for compliance reports and fuel supplies as references for structuring and implementing version chaining.
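
The legacy_id association in the first note could be resolved with a simple lookup, sketched here in Python. The row shape and the `compliance_report_id` column name are assumptions for illustration; only `legacy_id` comes from this issue.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def resolve_report_ids(
    lcfs_reports: Iterable[Dict[str, Optional[int]]],
    tfrs_report_ids: Iterable[int],
) -> Tuple[Dict[int, int], List[int]]:
    """Map TFRS report IDs to LCFS report IDs through legacy_id.

    `lcfs_reports` rows are assumed to look like
    {"compliance_report_id": ..., "legacy_id": ...}.
    Returns the resolved mapping and the TFRS IDs with no LCFS match.
    """
    by_legacy = {
        r["legacy_id"]: r["compliance_report_id"]
        for r in lcfs_reports
        if r.get("legacy_id") is not None
    }
    resolved: Dict[int, int] = {}
    missing: List[int] = []
    for tfrs_id in tfrs_report_ids:
        if tfrs_id in by_legacy:
            resolved[tfrs_id] = by_legacy[tfrs_id]
        else:
            missing.append(tfrs_id)
    return resolved, missing
```

Logging the `missing` list before inserting anything gives an early signal that the compliance report migration needs to run first.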
@AlexZorkin added the Medium and Task labels on Dec 21, 2024