Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Error Handling and Cleanup during Insert Bulk Process #2897

Conversation

KushaalShroff
Copy link
Contributor

Description

This commit fixes an issue with the error handling and cleanup phase of the Insert Bulk Process.

  1. For Error Handling there was a scenario where bulk_load_callback(0, 0, NULL, NULL) call for cleanup would flush the remaining rows during EndBulkCopy and could result in an error. To fix this, we move the transaction rollback logic from TSQL extension to TDS. We also improved it by using Savepoints in case of active transaction which is aligned with TSQL Behaviour
  2. In this case, if we have a reset-connection after the Bulk Load TDS packet we werent cleaning up the Bulk Load state. To do so we reset the offset.
  3. During Reset Connection TDS is not resetting any TSQL transaction semantic. To resolve this we introduce a wrapper of AbortOutOfAnyTransaction to reset NestedTranCount.

Issues Resolved
BABEL-5200, BABEL-5199, BABEL-5220

Authored-by: Kushaal Shroff [email protected]
Signed-off-by: Kushaal Shroff [email protected]

Test Scenarios Covered

  1. Testing explicit transaction (error case handled in 5.)
    a. Commit without error
    b. Rollback without error
  2. Index with without transaction
  3. Primary Key error case
  4. Unique constraint with error case
  5. Check constraint with error case
    a. transaction testing during error scenarios
    b. @@trancount test - error should not terminate transaction
    c. Test CheckConstraint BCP Option Enabled
    d. Test Reusing the same connection for BCP even after error scenarios
  6. Reset-connection testing with Primary Key error
    The above tests test the seq and index.

For Reset-connection, Although the tests in 6. do reset but for validation through logs I added a temp log to debug TdsResetConnection and with this log its evident that pid 21384 is being reset and reused for Bulk Load

2024-08-27 10:52:28.688 UTC [21384] CONTEXT:  TDS Protocol: Message Type: SQL BATCH, Phase: TDS_REQUEST_PHASE_FETCH. Resetting the TDS connection
2024-08-27 10:52:28.696 UTC [21384] ERROR:  duplicate key value violates unique constraint "destinationtable_pkey"
2024-08-27 10:52:28.696 UTC [21384] DETAIL:  Key (c1)=(2) already exists.
2024-08-27 10:52:28.696 UTC [21384] CONTEXT:  TDS Protocol: Message Type: Bulk Load, Phase: TDS_REQUEST_PHASE_PROCESS. Processing Bulk Load Request
2024-08-27 10:52:28.697 UTC [21384] LOG:  BULK LOAD RESET NOW
2024-08-27 10:52:28.697 UTC [21384] CONTEXT:  TDS Protocol: Message Type: SQL BATCH, Phase: TDS_REQUEST_PHASE_FETCH. Resetting the TDS connection
2024-08-27 10:52:28.706 UTC [21384] ERROR:  duplicate key value violates unique constraint "destinationtable_pkey"
2024-08-27 10:52:28.706 UTC [21384] DETAIL:  Key (c1)=(2) already exists.
2024-08-27 10:52:28.706 UTC [21384] CONTEXT:  TDS Protocol: Message Type: Bulk Load, Phase: TDS_REQUEST_PHASE_PROCESS. Processing Bulk Load Request
2024-08-27 10:52:28.707 UTC [21384] LOG:  BULK LOAD RESET NOW
2024-08-27 10:52:28.707 UTC [21384] CONTEXT:  TDS Protocol: Message Type: SQL BATCH, Phase: TDS_REQUEST_PHASE_FETCH. Resetting the TDS connection
2024-08-27 10:52:28.712 UTC [21384] ERROR:  duplicate key value violates unique constraint "destinationtable_pkey"
2024-08-27 10:52:28.712 UTC [21384] DETAIL:  Key (c1)=(2) already exists.
2024-08-27 10:52:28.712 UTC [21384] CONTEXT:  TDS Protocol: Message Type: Bulk Load, Phase: TDS_REQUEST_PHASE_PROCESS. Processing Bulk Load Request
2024-08-27 10:52:28.714 UTC [21384] LOG:  BULK LOAD RESET NOW
2024-08-27 10:52:28.714 UTC [21384] CONTEXT:  TDS Protocol: Message Type: SQL BATCH, Phase: TDS_REQUEST_PHASE_FETCH. Resetting the TDS connection
2024-08-27 10:52:28.719 UTC [21384] ERROR:  duplicate key value violates unique constraint "destinationtable_pkey"
2024-08-27 10:52:28.719 UTC [21384] DETAIL:  Key (c1)=(2) already exists.
2024-08-27 10:52:28.719 UTC [21384] CONTEXT:  TDS Protocol: Message Type: Bulk Load, Phase: TDS_REQUEST_PHASE_PROCESS. Processing Bulk Load Request
2024-08-27 10:52:28.720 UTC [21384] LOG:  BULK LOAD RESET NOW
2024-08-27 10:52:28.720 UTC [21384] CONTEXT:  TDS Protocol: Message Type: SQL BATCH, Phase: TDS_REQUEST_PHASE_FETCH. Resetting the TDS connection

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

KushaalShroff and others added 3 commits August 30, 2024 10:09
…for-postgresql#2887)

This commit fixes an issue with the error handling and cleanup phase of the Insert Bulk Process.

1. For Error Handling there was a scenario where bulk_load_callback(0, 0, NULL, NULL) call for cleanup would flush the remaining rows during EndBulkCopy and could result in an error. To fix this, we move the transaction rollback logic from TSQL extension to TDS. We also improved it by using Savepoints in case of active transaction which is aligned with TSQL behaviour.
2. In this case, if we have a reset-connection after the Bulk Load TDS packet we werent cleaning up the Bulk Load state. To do so we reset the offset.
3. During Reset Connection TDS is not resetting any TSQL transaction semantic. To resolve this we introduce a wrapper of AbortOutOfAnyTransaction to reset NestedTranCount.
Issues Resolved
BABEL-5200, BABEL-5199, BABEL-5220

Authored-by: Kushaal Shroff [email protected]
Signed-off-by: Kushaal Shroff [email protected]
Deepesh125
Deepesh125 previously approved these changes Aug 30, 2024
@KushaalShroff KushaalShroff merged commit 0c2878d into babelfish-for-postgresql:BABEL_2_X_DEV Aug 30, 2024
30 checks passed
@KushaalShroff KushaalShroff deleted the bcp_fixes_2x branch August 30, 2024 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants