LCFS - Insert Legacy Fuel Types and Related Data #1551

AlexZorkin · 2024-12-20T23:40:28Z

Description:
Add new entries for legacy fuel types derived from TFRS-approved fuels. For each legacy variant, insert a fuel_type record with is_legacy = true and all related defaults (energy density, energy effectiveness ratios, carbon intensities, and associated units of measure), ensuring legacy data remains distinct from current records. This migration will transfer real values from the production TFRS lookup tables into the LCFS system through Alembic migrations, maintaining data integrity and consistency.

Purpose and benefit to user:
Allows the system to accurately represent historical or legacy fuels, enabling comparisons and analysis across multiple time periods or regulatory regimes. Users can perform comprehensive historical reporting and ensure consistency between TFRS and LCFS datasets without data duplication issues.

Acceptance Criteria:

Given I am a data migration engineer, when I run the Alembic migration, then all specified legacy fuel types from TFRS appear in the fuel_type table with is_legacy = true.
Given I am a data analyst, when I review the energy_density and energy_effectiveness_ratio tables, then corresponding legacy entries exist and correctly link to their respective legacy fuel_type IDs.
Given the migration is complete, when validating data integrity, then all legacy fuel types have their associated default values accurately reflecting the production TFRS lookup table values.
Given legacy fuel types share names with current fuels, when querying the fuel_type table, then duplicates are correctly identified and distinguished by the is_legacy flag.

Development Checklist:

Preparation:
- Review and extract legacy fuel types from the production TFRS approved_fuel_type table.
- Identify which legacy fuel types already exist in LCFS and determine if they require duplication with is_legacy = true.
- Gather corresponding default values for each legacy fuel type, including energy_density, energy_effectiveness_ratio, carbon_intensity, and unit_of_measure.
Alembic Migration Script Creation:
- Create a new Alembic revision for inserting legacy fuel types.
- Define the upgrade function to:
  - Insert new rows into the fuel_type table with is_legacy = true for each legacy fuel type.
  - Insert corresponding entries into related tables (energy_density, energy_effectiveness_ratio, etc.) linked to the newly inserted legacy fuel_type IDs.
- Define the downgrade function to:
  - Remove the inserted legacy fuel types and their related entries, ensuring rollback capability.

Example Alembic Migration Script:

from alembic import op
import sqlalchemy as sa
from sqlalchemy.sql import table, column
from sqlalchemy import Integer, Text, Boolean, Numeric, Enum

# Define tables for insertions
fuel_type_table = table('fuel_type',
    column('fuel_type_id', Integer),
    column('fuel_type', Text),
    column('fossil_derived', Boolean),
    column('other_uses_fossil_derived', Boolean),
    column('default_carbon_intensity', Numeric),
    column('units', Enum),
    column('unrecognized', Boolean),
    column('is_legacy', Boolean)
)

energy_density_table = table('energy_density',
    column('fuel_type_id', Integer),
    column('density', Numeric),
    column('uom_id', Integer)
)

energy_effectiveness_ratio_table = table('energy_effectiveness_ratio',
    column('fuel_category_id', Integer),
    column('fuel_type_id', Integer),
    column('end_use_type_id', Integer),
    column('ratio', Numeric)
)

def upgrade():
    # Insert legacy fuel types
    op.bulk_insert(fuel_type_table,
        [
            {
                'fuel_type': 'Biodiesel',
                'fossil_derived': False,
                'other_uses_fossil_derived': False,
                'default_carbon_intensity': 100.21,
                'units': 'Litres',
                'unrecognized': False,
                'is_legacy': True
            },
            {
                'fuel_type': 'CNG',
                'fossil_derived': False,
                'other_uses_fossil_derived': True,
                'default_carbon_intensity': 63.91,
                'units': 'Cubic_metres',
                'unrecognized': False,
                'is_legacy': True
            },
            {
                'fuel_type': 'Electricity',
                'fossil_derived': False,
                'other_uses_fossil_derived': True,
                'default_carbon_intensity': 12.14,
                'units': 'Kilowatt_hour',
                'unrecognized': False,
                'is_legacy': True
            },
            # Add additional legacy fuel types here...
        ]
    )

    # Retrieve the newly inserted fuel_type_ids
    # This typically requires selecting based on unique constraints or using RETURNING in databases that support it.
    # For simplicity, assuming fuel_type_id is auto-incremented and follows the order of insertion.

    # Example fuel_type_id assignments (these IDs should be dynamically retrieved in a real scenario)
    legacy_fuel_ids = {
        'Biodiesel': 101,
        'CNG': 102,
        'Electricity': 103,
        # Add additional mappings here...
    }

    # Insert into energy_density
    op.bulk_insert(energy_density_table,
        [
            {'fuel_type_id': legacy_fuel_ids['Biodiesel'], 'density': 35.40, 'uom_id': 1},
            {'fuel_type_id': legacy_fuel_ids['CNG'], 'density': 38.27, 'uom_id': 3},
            {'fuel_type_id': legacy_fuel_ids['Electricity'], 'density': 3.60, 'uom_id': 2},
            # Add additional energy_density entries here...
        ]
    )

    # Insert into energy_effectiveness_ratio
    op.bulk_insert(energy_effectiveness_ratio_table,
        [
            {'fuel_category_id': 2, 'fuel_type_id': legacy_fuel_ids['Biodiesel'], 'end_use_type_id': None, 'ratio': 1.0},
            {'fuel_category_id': 3, 'fuel_type_id': legacy_fuel_ids['CNG'], 'end_use_type_id': 4, 'ratio': 3.8},
            {'fuel_category_id': 3, 'fuel_type_id': legacy_fuel_ids['Electricity'], 'end_use_type_id': 5, 'ratio': 3.2},
            # Add additional energy_effectiveness_ratio entries here...
        ]
    )

def downgrade():
    # Delete inserted energy_effectiveness_ratio entries
    op.execute("""
        DELETE FROM energy_effectiveness_ratio
        WHERE fuel_type_id IN (101, 102, 103)
    """)

    # Delete inserted energy_density entries
    op.execute("""
        DELETE FROM energy_density
        WHERE fuel_type_id IN (101, 102, 103)
    """)

    # Delete inserted fuel_type entries
    op.execute("""
        DELETE FROM fuel_type
        WHERE is_legacy = true AND fuel_type IN ('Biodiesel', 'CNG', 'Electricity')
    """)

Notes:

The above script assumes specific fuel_type_id values for legacy entries (101, 102, 103, etc.). In a real-world scenario, retrieve these IDs dynamically after insertion using RETURNING or other database-specific features.
Ensure that all related tables (energy_density, energy_effectiveness_ratio, etc.) are correctly referenced with the appropriate fuel_type_id.
Extend the bulk_insert lists with all necessary legacy fuel types and their associated default values as per the production TFRS data.
Validate that the units field corresponds to the correct QuantityUnitsEnum values defined in the ORM model.

Data Transfer Validation and Analysis:
- After running the migration, execute queries to ensure all legacy fuel types are present with is_legacy = true.
- Cross-verify the inserted energy_density and energy_effectiveness_ratio records against the production TFRS lookup tables to ensure accuracy.
- Perform sample checks on several legacy fuel types to confirm that all related defaults are correctly linked and reflect the intended legacy values.
- Utilize data validation scripts or tools to automate the comparison between TFRS source data and the migrated LCFS data.
- Document any discrepancies found during validation and address them promptly.

Notes:

Legacy fuels may share names with current fuels; duplicates are allowed as long as the is_legacy flag distinguishes them.
Ensure that all unit_of_measure references align with the QuantityUnitsEnum defined in the ORM models.
Coordinate with the business area team to verify that all legacy default values are accurately captured and inserted.
Consider implementing logging within the migration script to track inserted records and facilitate troubleshooting if needed.

The text was updated successfully, but these errors were encountered:

AlexZorkin added Medium Medium priority Task Work that does not directly impact the user labels Dec 20, 2024

dhaselhan self-assigned this Jan 6, 2025

dhaselhan added the Added Banana 4 Scale Label for tickets added mid-sprint label Jan 6, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LCFS - Insert Legacy Fuel Types and Related Data #1551

LCFS - Insert Legacy Fuel Types and Related Data #1551

AlexZorkin commented Dec 20, 2024

LCFS - Insert Legacy Fuel Types and Related Data #1551

LCFS - Insert Legacy Fuel Types and Related Data #1551

Comments

AlexZorkin commented Dec 20, 2024