Commit

Update README.md (#338)
* move assets, update README

* fix

* fix typo

* add target-session-attrs

* read-only mode

* better

* typo

* statistics

* Update README.md

Co-authored-by: Yury Frolov <[email protected]>

---------

Co-authored-by: Yury Frolov <[email protected]>
Denchick and EinKrebs authored Dec 1, 2023
1 parent f624a58 commit 70ef54f
Showing 5 changed files with 46 additions and 12 deletions.
37 changes: 25 additions & 12 deletions README.md
@@ -2,24 +2,33 @@
[![Go](https://github.com/pg-sharding/spqr/actions/workflows/tests.yaml/badge.svg)](https://github.com/pg-sharding/spqr/actions/workflows/tests.yaml)
![GitHub go.mod Go version](https://img.shields.io/github/go-mod/go-version/pg-sharding/spqr)
![Go Report](https://goreportcard.com/badge/github.com/pg-sharding/spqr)
[![Telegram Chat](https://img.shields.io/badge/telegram-SPQR_dev-blue)](https://t.me/+jMGhyjwicpI3ZWQy)

# Stateless Postgres Query Router

SPQR is a system for horizontal scaling of PostgreSQL via sharding. We appreciate any kind of feedback and contribution to the project.
PostgreSQL is awesome, but it's hard to manage a single database with some terabytes of data and 10<sup>5</sup>+ queries per second. Existing sharding solutions focus on analytical and hybrid workloads (OLAP, HTAP). Moreover, most of those solutions do not provide a simple, painless path for the monolith<->sharded transitions. That's why the Data Platform team of Yandex.Cloud designed SPQR.

For more about SPQR, please see [docs/](docs/) and [benchmarks/](benchmarks/).
SPQR is a production-ready system for horizontal scaling of PostgreSQL via sharding. We appreciate any kind of feedback and contribution to the project.

For more about SPQR, please see [docs/](docs/).

## Main features

- Transaction and session pooling
- Multiple routers for fault tolerance
- Sharding
- Liquid data migrations
- Limited multi-shard queries
- Works over PostgreSQL protocol
- Falling unrouted queries to the world shard
- [Minor overhead](https://gitlab.com/postgres-ai/postgresql-consulting/tests-and-benchmarks/-/issues/30) for query execution
- and, of course, TLS support
SPQR works best when most of your queries can be executed strictly on one shard.

- **Sharding**. When possible, the router determines from the first statement of a transaction which shard the transaction should be sent to. You can also explicitly specify a shard or a [sharding key](https://github.com/pg-sharding/spqr/blob/master/test/regress/tests/router/expected/routing_hint.out#L30) in a query comment.
- **Transaction and session pooling**. Just like in your favorite connection pooler (Odyssey or PgBouncer).
- **Multiple routers for fault tolerance**. The router stores the sharding rules only as a cache. Information about the entire installation is stored in the QDB service, so the number of routers running simultaneously is unlimited.
- **Liquid data migrations**. Data migration between shards aims to balance the workload across shards proportionally. The main idea is to minimize any locking impact during these migrations, which is accomplished by reducing the size of the data ranges being transferred.
- **Limited cross-shard queries**. The SPQR router supports limited cross-shard queries. They are executed on a best-effort basis in a non-disruptive but non-consistent way and are used mainly for testing purposes. Please do not use this in production.
- **Multiple servers and failover**. In the router configuration, you can specify multiple servers for one shard; the router will then distribute read-only queries among the replicas. In addition to this automatic routing, you can explicitly define the destination for a specific query by using the [target-session-attrs](https://github.com/pg-sharding/spqr/blob/master/test/regress/tests/router/expected/target_session_attrs.out#L32) parameter within the query.
- **Works over PostgreSQL protocol**. It means you can connect to the router and the coordinator via psql.
- **Dedicated read-only mode**. Once enabled, the router will respond to a SHOW transaction_read_only command with "true" and handle only read-only queries, similar to a standard PostgreSQL replica.
- **Minor overhead for query execution**. See benchmarks [here](docs/Benchmarks.md) and [here](https://gitlab.com/postgres-ai/postgresql-consulting/tests-and-benchmarks/-/issues/30).
- **Various authentication types**. From basic OK and plain text to MD5 and SCRAM, see [Authentication.md](docs/Authentication.md).
- **Live configuration reloading**. You can send a SIGHUP signal to the router's process. This will trigger the router to reload its configuration file and apply any changes without interrupting its operation.
- **Statistics**. You can access statistics in the router's administrative console via the [SHOW command](https://github.com/pg-sharding/spqr/blob/master/yacc/console/gram.y#L319).
- *Falling unrouted queries to the world shard*. SPQR is optimized for single-shard OLTP queries. But we have long-term plans to support routing queries for 2 or more shards.
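
To make the routing idea from the **Sharding** and **Liquid data migrations** items concrete, here is a minimal sketch of mapping a sharding key to a shard through key ranges. It is only an illustration, not SPQR's implementation: the `KEY_RANGES` table, the shard names, and the `route` function are hypothetical, and the real router parses SQL and keeps its rules in the QDB service.

```python
from bisect import bisect_right

# Hypothetical key ranges (an assumption for this sketch): each entry is
# (lower bound, shard name). A range runs from its lower bound up to, but not
# including, the next entry's lower bound; the last range is open-ended.
KEY_RANGES = [
    (0, "shard1"),
    (1000, "shard2"),
    (2000, "shard3"),
]

LOWER_BOUNDS = [lower for lower, _ in KEY_RANGES]


def route(sharding_key: int) -> str:
    """Return the name of the shard whose key range contains sharding_key."""
    # bisect_right counts the lower bounds that are <= sharding_key, so the
    # matching range sits one position to the left of the returned index.
    index = bisect_right(LOWER_BOUNDS, sharding_key) - 1
    if index < 0:
        raise KeyError(f"no key range covers {sharding_key}")
    return KEY_RANGES[index][1]


if __name__ == "__main__":
    for key in (30, 1000, 2500):
        print(key, "->", route(key))
```

Under this model, splitting a range into smaller ones only adds entries to the table, which fits the idea of moving small data ranges between shards to keep locking short.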

## Development

@@ -38,8 +47,12 @@ spqr-router run --config path-to-router-config.yaml

## Tests

SPQR has regression tests. These tests require Docker Compose, and can be run using `make regress`. Also, there are stress tests, but it's a work in progress. For more information on testing, please see `test` and `stress` sections in [Makefile](./Makefile).
SPQR has all types of tests: unit, regression, and end-to-end. These tests require Docker Compose and can be run using `make`. For more information on testing, please see the `unittest`, `regress`, and `feature` sections in the [Makefile](./Makefile).

## License

The SPQR source code is distributed under the PostgreSQL Global Development Group License.

## Chat

We have a Telegram chat to discuss SPQR usage and development. To join, use the [invite link](https://t.me/+jMGhyjwicpI3ZWQy).
21 changes: 21 additions & 0 deletions docs/Benchmarks.md
@@ -0,0 +1,21 @@
# SPQR benchmarks

TPC-C (Transaction Processing Performance Council - C) is a standardized benchmark used to measure the performance of database systems under high load and a large number of transactions. It simulates a wholesale supplier operating a number of warehouses, with many simultaneous users who enter new orders, record payments, check order status, process deliveries, and monitor stock levels.

There are many implementations of the TPC-C test. In our experiments we use the [Percona TPC-C Variant](https://github.com/Percona-Lab/sysbench-tpcc).

We ran PostgreSQL on s3.medium instances (8 vCPU, 100% vCPU rate, 32 GB RAM) with 300 GB of memory and default Managed PostgreSQL Cluster settings. In each test, we increased only the shard count.

### Results

| Warehouses | Shards | CPU | TPS | TpmC | TpmC per CPU |
| ---------- | --------- | --- | ---- | ----- | ------------ |
| 1000 | no router | 8 | 433 | 26010 | 3251.25 |
| 1000 | 2 | 16 | 664 | 39840 | 2490 |
| 1000 | 4 | 32 | 875 | 52500 | 1640.625 |
| 1000 | 8 | 64 | 1303 | 78180 | 1221.5625 |
| 1000 | 16 | 128 | 1543 | 92580 | 723.28125 |

![TPC-C test results](resources/tpcc.png)
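
The derived column in the table can be rechecked with a few lines of Python. This is a sketch for readers, not part of the benchmark tooling: the rows are copied from the table above, while the function names and the "speedup over the run without a router" metric are additions for illustration. TpmC appears to be TPS × 60; this holds for every row except the first, where 433 × 60 = 25980 rather than 26010, possibly because the TPS figure is rounded.

```python
# Rows copied from the results table: (shards, CPUs, TPS, TpmC).
RESULTS = [
    ("no router", 8, 433, 26010),
    ("2", 16, 664, 39840),
    ("4", 32, 875, 52500),
    ("8", 64, 1303, 78180),
    ("16", 128, 1543, 92580),
]


def tpmc_per_cpu(tpmc: int, cpus: int) -> float:
    """Throughput normalized by the total number of CPUs in the run."""
    return tpmc / cpus


def speedup(tpmc: int, baseline_tpmc: int) -> float:
    """Throughput relative to the baseline run (here: the run without a router)."""
    return tpmc / baseline_tpmc


if __name__ == "__main__":
    baseline = RESULTS[0][3]
    for shards, cpus, tps, tpmc in RESULTS:
        print(
            f"{shards:>9}: {tpmc_per_cpu(tpmc, cpus):10.2f} TpmC/CPU, "
            f"{speedup(tpmc, baseline):.2f}x baseline"
        )
```

Reading the two metrics together shows the trade-off in the table: total throughput grows with the shard count, while throughput per CPU declines.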

You can compare these results with the Performance Results for [Vitess and Aurora](https://www.amazon.science/publications/amazon-aurora-on-avoiding-distributed-consensus-for-i-os-commits-and-membership-changes).
File renamed without changes.
File renamed without changes.
Binary file added docs/resources/tpcc.png
