Feature/issue 235 - Track ingest table can be populated with granules that aren't loaded into Hydrocron #245

Draft: 5 commits to be merged into base: develop

@nikki-t (Collaborator) commented Oct 9, 2024

GitHub Issue: #235

Description

The track ingest table can be populated with granules that aren't loaded into Hydrocron, and the wrong collection can be searched if the collection shortname and table names do not match.

Overview of work done

  • Added a conditional check in track_ingest.py to verify that the collection shortname, Hydrocron SWOT table, and Hydrocron track ingest table match.
  • Added a conditional check in load_data.py to prevent the Load Granule Lambda function from loading Observed and Unassigned lakes.

Overview of verification done

  • Added two unit tests that assert the errors raised by the above modifications.
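
As an illustration of what such a test can look like, here is a minimal sketch using the standard library's unittest. The `TableMisMatch` class and the `load_guard` function are hypothetical stand-ins defined inline so the example runs on its own; the actual tests in this PR exercise the real handlers and may be structured differently.

```python
import unittest


class TableMisMatch(Exception):
    """Stand-in for the exception raised by the new checks."""


def load_guard(granule_name: str, table_name: str) -> None:
    """Hypothetical stand-in for the check added to load_data.py."""
    if "LakeSP_Obs" in granule_name or "LakeSP_Unassigned" in granule_name:
        raise TableMisMatch(
            f"Error: Cannot load Observed or Unassigned Lake data into table: '{table_name}'"
        )


class TestLoadGuard(unittest.TestCase):
    def test_observed_lake_raises(self):
        granule = "SWOT_L2_HR_LakeSP_Obs_020_150_EU_20240825T234434_20240825T235245_PIC0_01.zip"
        with self.assertRaises(TableMisMatch) as ctx:
            load_guard(granule, "hydrocron-swot-prior-lake-table")
        # The message should name the table that was rejected.
        self.assertIn("hydrocron-swot-prior-lake-table", str(ctx.exception))

    def test_prior_lake_passes(self):
        granule = "SWOT_L2_HR_LakeSP_Prior_010_221_AF_20240201T211344_20240201T212928_PIC0_01.zip"
        # No exception is expected for a Prior lake granule.
        load_guard(granule, "hydrocron-swot-prior-lake-table")
```

Saved to a file, the test case can be run with `python -m unittest`.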

Overview of integration done

Deployed feature branch to SIT and ran tests on reaches, nodes, and prior lakes.

LOAD DATA TESTS

Observed Lake Test Event (Raise error)

{
  "body": {
    "granule_path": "s3://podaac-swot-sit-cumulus-protected/SWOT_L2_HR_LakeSP_2.0/SWOT_L2_HR_LakeSP_Obs_020_150_EU_20240825T234434_20240825T235245_PIC0_01.zip",
    "table_name": "hydrocron-swot-prior-lake-table",
    "load_benchmarking_data": "False",
    "track_table": "hydrocron-swot-prior-lake-track-ingest-table"
  }
}

Logs

2024-10-09T20:04:22.008Z [ERROR] TableMisMatch: Error: Cannot load Observed or Unassigned Lake data into table: 'hydrocron-swot-prior-lake-table'
Traceback (most recent call last):
  File "/var/task/hydrocron/db/load_data.py", line 137, in granule_handler
    raise TableMisMatch(f"Error: Cannot load Observed or Unassigned Lake data into table: '{table_name}'")

Prior Lake Test Event (Normal operations)

{
  "body": {
    "granule_path": "s3://podaac-swot-sit-cumulus-protected/SWOT_L2_HR_LakeSP_2.0/SWOT_L2_HR_LakeSP_Prior_010_221_AF_20240201T211344_20240201T212928_PIC0_01.zip",
    "table_name": "hydrocron-swot-prior-lake-table",
    "load_benchmarking_data": "False",
    "track_table": "hydrocron-swot-prior-lake-track-ingest-table"
  }
}

Logs

2024-10-09T20:10:10.625Z [INFO] 2024-10-09T20:10:10.625Z Adding track ingest prior lakes items to table individually 
2024-10-09T20:10:10.625Z [INFO] 2024-10-09T20:10:10.625Z Item granuleUR: SWOT_L2_HR_LakeSP_Prior_010_221_AF_20240201T211344_20240201T212928_PIC0_01.zip 
2024-10-09T20:10:10.638Z [INFO] 2024-10-09T20:10:10.638Z Begin loading data from granule: SWOT_L2_HR_LakeSP_Prior_010_221_AF_20240201T211344_20240201T212928_PIC0_01.zip 
2024-10-09T20:10:10.638Z [INFO] 2024-10-09T20:10:10.638Z Set up dynamo table connection 
2024-10-09T20:10:10.669Z [INFO] 2024-10-09T20:10:10.669Z Batch adding 6872 prior_lake items. First 5 feature ids in batch: 
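
The behavior shown by the two load data events above can be sketched as follows. This is a rough illustration under stated assumptions, not the code in load_data.py: the function name `validate_load_event`, the locally defined `TableMisMatch` class, and the idea of reading the lake type from the granule file name are all hypothetical. Only the event shape and the error message come from the tests above.

```python
class TableMisMatch(Exception):
    """Stand-in for the exception named in the logs, defined here so the sketch runs."""


def validate_load_event(event: dict) -> str:
    """Return the granule file name, or raise TableMisMatch for Observed/Unassigned lakes.

    Hypothetical helper: assumes the lake type can be read from the granule file name.
    """
    body = event["body"]
    table_name = body["table_name"]
    # Take the last path component of the S3 URI as the granule file name.
    granule_name = body["granule_path"].rsplit("/", 1)[-1]
    if "LakeSP_Obs" in granule_name or "LakeSP_Unassigned" in granule_name:
        raise TableMisMatch(
            f"Error: Cannot load Observed or Unassigned Lake data into table: '{table_name}'"
        )
    return granule_name


prior_event = {
    "body": {
        "granule_path": "s3://podaac-swot-sit-cumulus-protected/SWOT_L2_HR_LakeSP_2.0/"
        "SWOT_L2_HR_LakeSP_Prior_010_221_AF_20240201T211344_20240201T212928_PIC0_01.zip",
        "table_name": "hydrocron-swot-prior-lake-table",
    }
}
print(validate_load_event(prior_event))
```

Running the check before any DynamoDB work means a rejected granule fails fast and leaves both tables untouched, which is the property the Observed Lake test exercises.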

Track Ingest Mismatch Event (Raise error)

{
  "collection_shortname": "SWOT_L2_HR_RiverSP_reach_2.0",
  "hydrocron_table": "hydrocron-swot-reach-table",
  "hydrocron_track_table": "hydrocron-swot-node-track-ingest-table",
  "temporal": "",
  "query_start": "2024-09-13T01:00:00",
  "query_end": "2024-09-13T04:00:00"
}

Logs

2024-10-09T20:26:36.459Z [ERROR] TableMisMatch: Error: Cannot query reach data for tables: 'hydrocron-swot-reach-table' and 'hydrocron-swot-node-track-ingest-table'
Traceback (most recent call last):
  File "/var/task/hydrocron/db/track_ingest.py", line 365, in track_ingest_handler
    raise TableMisMatch(f"Error: Cannot query reach data for tables: '{hydrocron_table}' and '{hydrocron_track_table}'")
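
A consistency check of the kind that raised the error above might look like the sketch below. It is an assumption-laden illustration rather than the logic in track_ingest.py: the `FEATURE_KEYWORDS` mapping, the `check_tables` function, the keyword matching on names, and the "unrecognized collection" message are all hypothetical. The reach collection shortname, the table names, and the mismatch message format are taken from the event and log above.

```python
class TableMisMatch(Exception):
    """Stand-in for the exception named in the logs, defined here so the sketch runs."""


# Assumed mapping: feature type -> (keyword in collection shortname, keyword in table names).
FEATURE_KEYWORDS = {
    "reach": ("reach", "-reach-"),
    "node": ("node", "-node-"),
    "prior lake": ("prior", "-prior-lake-"),
}


def check_tables(collection_shortname: str, hydrocron_table: str, hydrocron_track_table: str) -> str:
    """Return the feature type if all three names agree, otherwise raise TableMisMatch."""
    shortname = collection_shortname.lower()
    for feature, (collection_key, table_key) in FEATURE_KEYWORDS.items():
        if collection_key in shortname:
            if table_key not in hydrocron_table or table_key not in hydrocron_track_table:
                raise TableMisMatch(
                    f"Error: Cannot query {feature} data for tables: "
                    f"'{hydrocron_table}' and '{hydrocron_track_table}'"
                )
            return feature
    # Hypothetical fallback message, not taken from the logs.
    raise TableMisMatch(f"Error: Unrecognized collection shortname: '{collection_shortname}'")


print(check_tables(
    "SWOT_L2_HR_RiverSP_reach_2.0",
    "hydrocron-swot-reach-table",
    "hydrocron-swot-reach-track-ingest-table",
))
```

With the mismatched event above (reach collection, reach table, node track ingest table), this sketch raises `TableMisMatch` before CMR is queried.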

Track Ingest Event (Normal operations)

{
  "collection_shortname": "SWOT_L2_HR_RiverSP_reach_2.0",
  "hydrocron_table": "hydrocron-swot-reach-table",
  "hydrocron_track_table": "hydrocron-swot-reach-track-ingest-table",
  "temporal": "",
  "query_start": "2024-09-13T01:00:00",
  "query_end": "2024-09-13T04:00:00"
}

Logs

2024-10-09T20:42:30.982Z [INFO] 2024-10-09T20:42:30.982Z Querying CMR temporal range: 2024-09-13 01:00:00+00:00 to 2024-09-13 04:00:00+00:00. 
2024-10-09T20:42:33.780Z [INFO] 2024-10-09T20:42:33.780Z Located 5 granules in CMR. 
2024-10-09T20:42:33.966Z [INFO] 2024-10-09T20:42:33.965Z Located 5 granules NOT in Hydrocron. 
2024-10-09T20:42:33.980Z [INFO] 2024-10-09T20:42:33.980Z Located 0 granules with 'to_ingest' status. 
2024-10-09T20:42:33.980Z [INFO] 2024-10-09T20:42:33.980Z Located 5 granules that require ingestion. 
2024-10-09T20:42:33.980Z [INFO] 2024-10-09T20:42:33.980Z Located 0 granules that are already ingested.
...
2024-10-09T20:42:38.080Z [INFO] 2024-10-09T20:42:38.080Z To Ingest: [{'granuleUR': 'SWOT_L2_HR_RiverSP_Reach_021_071_AR_20240913T012106_20240913T012111_PIC0_01.zip', 'revision_date': '2024-09-17T22:05:42.612Z', 'checksum': 'e31692ed41a407435d20c100e05c4b83', 'expected_feature_count': -1, 'actual_feature_count': 0, 'status': 'to_ingest'}, {'granuleUR': 'SWOT_L2_HR_RiverSP_Reach_021_072_GR_20240913T013154_20240913T013158_PIC0_01.zip', 'revision_date': '2024-09-17T22:05:41.867Z', 'checksum': 'e31692ed41a407435d20c100e05c4b83', 'expected_feature_count': -1, 'actual_feature_count': 0, 'status': 'to_ingest'}, {'granuleUR': 'SWOT_L2_HR_RiverSP_Reach_021_074_AR_20240913T031404_20240913T031411_PIC0_01.zip', 'revision_date': '2024-09-17T22:05:42.420Z', 'checksum': 'e31692ed41a407435d20c100e05c4b83', 'expected_feature_count': -1, 'actual_feature_count': 0, 'status': 'to_ingest'}, {'granuleUR': 'SWOT_L2_HR_RiverSP_Reach_021_074_NA_20240913T032024_20240913T032032_PIC0_01.zip', 'revision_date': '2024-09-17T22:05:43.438Z', 'checksum': 'e31692ed41a407435d20c100e05c4b83', 'expected_feature_count': -1, 'actual_feature_count': 0, 'status': 'to_ingest'}, {'granuleUR': 'SWOT_L2_HR_RiverSP_Reach_021_074_SA_20240913T033713_20240913T033722_PIC0_01.zip', 'revision_date': '2024-09-17T22:05:42.838Z', 'checksum': 'e31692ed41a407435d20c100e05c4b83', 'expected_feature_count': -1, 'actual_feature_count': 0, 'status': 'to_ingest'}]
2024-10-09T20:42:38.080Z [INFO] 2024-10-09T20:42:38.080Z Ingested: [] 
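
The temporal range printed in the log above corresponds to the event's `query_start` and `query_end` values interpreted as UTC. A minimal sketch of that conversion, assuming the naive ISO 8601 strings in the event are treated as UTC (the `to_utc` helper is hypothetical and not taken from track_ingest.py):

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> datetime:
    """Parse a naive ISO 8601 timestamp and mark it as UTC (assumed convention)."""
    return datetime.fromisoformat(timestamp).replace(tzinfo=timezone.utc)


query_start = to_utc("2024-09-13T01:00:00")
query_end = to_utc("2024-09-13T04:00:00")
print(f"Querying CMR temporal range: {query_start} to {query_end}.")
```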

PR checklist:

  • Linted
  • Updated unit tests
  • Updated changelog
  • Integration testing

See Pull Request Review Checklist for pointers on reviewing this pull request
