Pgd/durabiltity/docs-942 #6011

Merged
merged 7 commits into from
Oct 8, 2024
Original file line number Diff line number Diff line change
Expand Up @@ -231,7 +231,7 @@ Using `insert_or_error` (or in some cases the `insert_or_skip` conflict resolver

If these are problems, we recommend tuning freezing settings for a table or database so that they're correctly detected as `update_recently_deleted`.

Another alternative is to use [Eager Replication](../eager) to prevent these conflicts.
Another alternative is to use [Eager Replication](../../durability/group-commit#eager-conflict-resolution) to prevent these conflicts.

`INSERT`/`DELETE` conflicts can also occur with three or more nodes. Such a conflict is identical to `INSERT`/`UPDATE` except with the `UPDATE` replaced by a `DELETE`. This can result in a `delete_missing`
conflict.
Original file line number Diff line number Diff line change
Expand Up @@ -35,5 +35,5 @@ UPDATE tab SET counter = !counter WHERE ...;

"Reliably" means the values don't have the two issues of multiple concurrent resets and divergence.

Operation-based CRDT types can be reset reliably only using [Eager Replication](../eager), since this avoids multiple concurrent resets. You can also use Eager Replication to set either kind of CRDT to a specific
Operation-based CRDT types can be reset reliably only using [Eager Replication](../../durability/group-commit#eager-conflict-resolution), since this avoids multiple concurrent resets. You can also use Eager Replication to set either kind of CRDT to a specific
value.
82 changes: 0 additions & 82 deletions product_docs/docs/pgd/5/consistency/eager.mdx

This file was deleted.

2 changes: 1 addition & 1 deletion product_docs/docs/pgd/5/consistency/index.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -18,4 +18,4 @@ By default, conflicts are resolved at the row level. When changes from two nodes

Column-level conflict detection and resolution is available with PGD, described in [CLCD](column-level-conflicts).

If you want to avoid conflicts, you can use [Group Commit](/pgd/latest/durability/group-commit) with [Eager conflict resolution](eager) or conflict-free data types (CRDTs), described in [CRDT](crdt). You can also use PGD Proxy and route all writes to one write-leader, eliminating the chance for inter-nodal conflicts.
If you want to avoid conflicts, you can use [Group Commit](/pgd/latest/durability/group-commit) with [Eager conflict resolution](../durability/group-commit#eager-conflict-resolution) or conflict-free data types (CRDTs), described in [CRDT](crdt). You can also use PGD Proxy and route all writes to one write-leader, eliminating the chance for inter-nodal conflicts.
81 changes: 80 additions & 1 deletion product_docs/docs/pgd/5/durability/group-commit.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,7 @@ decision](#commit-decisions) must be set to `raft` to avoid reconciliation
issues.

For details about how Eager conflict resolution works,
see [Eager conflict resolution](../consistency/eager).
see [Eager conflict resolution](../durability/group-commit#eager-conflict-resolution).

### Aborts

Expand Down Expand Up @@ -165,3 +165,82 @@ reconciliation.

This process happens in the background. There's no command for you to use to
control or issue this.

## Eager conflict resolution

Eager conflict resolution (also known as Eager Replication) prevents conflicts by aborting conflicting transactions with serialization errors during the COMMIT decision process.

You configure it using [commit scopes](../durability/commit-scopes) as one of the conflict resolution options for [Group Commit](../durability/group-commit).

### Usage

To enable Eager conflict resolution, the client needs to switch to a commit scope that uses it, either at session level or for individual transactions as shown here:

```sql
BEGIN;

SET LOCAL bdr.commit_scope = 'eager_scope';

-- ... other commands possible ...
```

The client can continue to issue a `COMMIT` at the end of the transaction and let PGD manage the two phases:

```sql
COMMIT;
```

In this case, the `eager_scope` commit scope is defined something like this:

```sql
SELECT bdr.add_commit_scope(
    commit_scope_name := 'eager_scope',
    origin_node_group := 'top_group',
    rule := 'ALL (top_group) GROUP COMMIT (conflict_resolution = eager, commit_decision = raft) ABORT ON (timeout = 60s)',
    wait_for_ready := true
);
```

!!! note Upgrading?
The old `global` commit scope doesn't exist anymore. The above command creates a scope that's the same as the old `global` scope with `bdr.global_commit_timeout` set to `60s`.

The commit scope group for the Eager conflict resolution rule can only be `ALL` or `MAJORITY`. Where `ALL` is used, the `commit_decision` setting must also be set to `raft`.

### Error handling

Given that PGD manages the transaction, the client needs to check only the result of the `COMMIT`. This is advisable in any case, including single-node Postgres.

In case of an origin node failure, the remaining nodes eventually (after at least `ABORT ON timeout`) decide to roll back the globally prepared transaction. Raft prevents inconsistent commit versus rollback decisions. However, this requires a majority of connected nodes. Disconnected nodes keep the transactions prepared to eventually commit them (or roll back) as needed to reconcile with the majority of nodes that might have decided and made further progress.

### Effects of Eager Replication in general

#### Increased abort rate

With single-node Postgres, or even with PGD in its default asynchronous
replication mode, errors at `COMMIT` time are rare. The added synchronization
step due to the use of a commit scope using `eager`
for conflict resolution also adds a source of errors. Applications need to be
prepared to properly handle such errors, usually by applying a retry loop.

The rate of aborts depends solely on the workload. Large transactions changing many rows are much more likely to conflict with other concurrent transactions.
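
The retry loop mentioned above can be sketched on the client side roughly as follows. This is a minimal illustration, not part of PGD or of any driver: `SerializationFailure`, `run_with_retry`, and the backoff parameters are hypothetical names chosen for the example, and which error a real driver raises when an Eager commit is aborted is something to verify for your client library.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for the driver error raised when COMMIT is aborted."""


def run_with_retry(run_transaction, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call run_transaction() until it commits or the attempts run out.

    run_transaction must execute the whole transaction (BEGIN through COMMIT)
    and raise SerializationFailure when the commit is aborted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the last abort to the caller
            # Exponential backoff with jitter so concurrent retries are
            # less likely to conflict with each other again.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
```

The whole transaction is retried, not just the `COMMIT`, because an aborted transaction has to be reissued from its first statement. Only the abort error is caught, so other failures still propagate immediately.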

### Effects of MAJORITY and ALL node replication in general

#### Increased commit latency

Adding a synchronization step due to the use of a commit scope means more
communication between the nodes, resulting in more latency at commit time. When
`ALL` is used in the commit scope, this also means that the availability of the
system is reduced, since any node going down causes transactions to fail.

If one or more nodes are lagging behind, the round-trip delay in getting
confirmations can be large, causing high latencies. `ALL` or `MAJORITY` node replication adds
roughly two network round trips (to the furthest peer node in the worst case).
Logical standby nodes and nodes still in the process of joining or catching up
aren't included but eventually receive changes.

Before a peer node can confirm its local preparation of the transaction, it also
needs to apply it locally. This further adds to the commit latency, depending on
the size of the transaction. This setting is independent of the
`synchronous_commit` setting.
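
As a rough back-of-the-envelope illustration of the latency described above, the worst case can be estimated as two round trips to the furthest peer plus the time that peer needs to apply the transaction. The function name, the sample round-trip times, and the apply time below are assumptions made up for the example, not measurements or PGD settings.

```python
def estimated_added_commit_latency_ms(peer_rtts_ms, apply_ms=0.0):
    """Worst-case estimate: two round trips to the furthest peer plus apply time."""
    if not peer_rtts_ms:
        return 0.0  # no peers to wait for, so nothing is added
    return 2 * max(peer_rtts_ms) + apply_ms


# Three peers at 1.5 ms, 12 ms, and 40 ms round-trip time, 5 ms to apply:
print(estimated_added_commit_latency_ms([1.5, 12.0, 40.0], apply_ms=5.0))  # 85.0
```

The estimate ignores lagging nodes, which can push real latencies well above this figure.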

2 changes: 1 addition & 1 deletion product_docs/docs/pgd/5/durability/limitations.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ nodes in a group. If you use this feature, take the following limitations into a

## Eager

[Eager](../consistency/eager) is available through Group Commit. It avoids conflicts by eagerly aborting transactions that might clash. It's subject to the same limitations as Group Commit.
[Eager](../durability/group-commit#eager-conflict-resolution) is available through Group Commit. It avoids conflicts by eagerly aborting transactions that might clash. It's subject to the same limitations as Group Commit.

Eager doesn't allow the `NOTIFY` SQL command or the `pg_notify()` function. It
also doesn't allow `LISTEN` or `UNLISTEN`.
2 changes: 1 addition & 1 deletion product_docs/docs/pgd/5/index.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,7 @@ Read about why PostgreSQL is better when it’s distributed with EDB Postgres Di
By default, EDB Postgres Distributed uses asynchronous replication, applying changes on
the peer nodes only after the local commit. You can configure additional levels of synchronicity between different nodes, groups of nodes, or all nodes by configuring
[Group Commit](durability/group-commit), [CAMO](durability/camo), or
[Eager Replication](consistency/eager).
[Eager Replication](durability/group-commit#eager-conflict-resolution).

## Compatibility

2 changes: 1 addition & 1 deletion product_docs/docs/pgd/5/planning/choosing_server.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ The following table lists features of EDB Postgres Distributed that are dependen
| [Legacy synchronous replication](/pgd/latest/durability/legacy-sync) | Y | Y | Y |
| [Group Commit](/pgd/latest/durability/group-commit/) | N | Y | 14+ |
| [Commit At Most Once (CAMO)](/pgd/latest/durability/camo/) | N | Y | 14+ |
| [Eager Conflict Resolution](/pgd/latest/consistency/eager/) | N | Y | 14+ |
| [Eager Conflict Resolution](/pgd/latest/durability/group-commit#eager-conflict-resolution) | N | Y | 14+ |
| [Lag Control](/pgd/latest/durability/lag-control/) | N | Y | 14+ |
| [Decoding Worker](/pgd/latest/node_management/decoding_worker) | N | 13+ | 14+ |
| [Lag tracker](/pgd/latest/monitoring/sql/#monitoring-outgoing-replication) | N | Y | 14+ |