-
Notifications
You must be signed in to change notification settings - Fork 2
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
add sharding, search, sorting, soft removal
- Loading branch information
1 parent
7324c68
commit a13f22a
Showing
92 changed files
with
2,181 additions
and
3,867 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,16 @@ | ||
name: test | ||
|
||
on: [ push ] | ||
on: push | ||
|
||
jobs: | ||
test: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2.3.4 | ||
- uses: actions/checkout@v4.1.1 | ||
with: | ||
clean: false | ||
fetch-depth: 0 # with tags | ||
submodules: 'recursive' | ||
- uses: actions/[email protected] | ||
submodules: recursive | ||
- uses: actions/[email protected] | ||
with: | ||
java-version: 11 | ||
- env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
run: sbt +compile +test | ||
java-version: 22 | ||
distribution: temurin | ||
- run: sbt test |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,5 @@ | ||
.DS_Store | ||
target/ | ||
/tmp | ||
/.bsp/ | ||
.bloop | ||
.metals | ||
.vscode | ||
project/project | ||
project/metals.sbt | ||
/data/example-*/ | ||
/data/test-*/ | ||
target/ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
[submodule "proto"] | ||
path = deps/proto | ||
url = https://github.com/zero-deps/proto | ||
url = git@github.com:zero-deps/proto.git | ||
branch = main |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
-J-XX:MaxMetaspaceSize=512m | ||
-J-XX:MaxMetaspaceSize=1g |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,49 +1,141 @@ | ||
# Abstract scala type database | ||
# KVS (Key-Value Storage) | ||
|
||
![ci](https://github.com/zero-deps/kvs/workflows/ci/badge.svg) | ||
## Scala Abstract Type Database | ||
|
||
Abstract Scala storage framework with high-level API for handling linked lists of polymorphic data (feeds). | ||
![Production Ready](https://img.shields.io/badge/Project%20Stage-Production%20Ready%20(main)-brightgreen.svg) | ||
![Development](https://img.shields.io/badge/Project%20Stage-Development%20(next)-yellowgreen.svg) | ||
|
||
KVS is highly available distributed (AP) strong eventual consistent (SEC) and sequentially consistent (via cluster sharding) storage. It is used for data from sport and games events. In some configurations used as distributed network file system. Also can be a generic storage for application. | ||
[![Paper](https://img.shields.io/badge/paper-pdf-lightgrey)](https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf) | ||
|
||
Designed with various backends in mind and to work in pure JVM environment. Implementation based on top of KAI (implementation of Amazon DynamoDB in Erlang) port with modification to use akka-cluster infrastructure. | ||
This open-source project presents an abstract storage framework in Scala, offering a high-level API tailored for managing linked lists of polymorphic data, referred to as 'feeds.' The system, known as KVS (Key-Value Storage), boasts attributes such as high availability, distributed architecture (AP), strong eventual consistency (SEC), and sequential consistency achieved through cluster sharding. Its primary application involves handling data from sports and gaming events, but it can also serve as a distributed network file system or a versatile general-purpose storage solution for various applications. | ||
|
||
Currently main backend is RocksDB to support embedded setup alongside application. Feed API (add/entries/remove) is built on top of Key-Value API (put/get/delete). | ||
The design philosophy behind KVS encompasses versatility, with support for multiple backend implementations and compatibility within a pure JVM environment. The implementation is grounded in the KAI framework (an Erlang-based Amazon DynamoDB implementation), adapted to utilize the pekko-cluster infrastructure. | ||
|
||
At its core, KVS relies on RocksDB as the primary backend, enabling seamless integration in embedded setups alongside applications. The central Feed API, facilitating operations like addition, entry retrieval, and removal, is constructed upon the foundation of the Key-Value API, which includes functions for putting, getting, and deleting data. | ||
|
||
## Usage | ||
|
||
Add project as a git module. | ||
Add the project as a git submodule or publish in repository of own choice. | ||
|
||
## Backend | ||
## Components | ||
|
||
* Ring | ||
* RocksDB | ||
* Memory | ||
* FS | ||
* SQL | ||
* etc. | ||
* `rng`: Establishes a Ring structure using Pekko Cluster | ||
* `search`: Offers Search over KVS functionality | ||
* `consistency`: Sequential consistency for complex inserts (e.g. feeds) | ||
* `feed`: Introduces the Feed over Ring concept | ||
* `sharding`: Pekko's Cluster Sharding abstraction for sequential consistency | ||
* `sort`: Implements a Sorted Set on Ring | ||
|
||
## Test | ||
## Test and Demo | ||
|
||
```bash | ||
sbt> test | ||
```sh | ||
sbt test | ||
sbt run | ||
``` | ||
|
||
## Resources | ||
# Documentation | ||
|
||
## Abstract | ||
|
||
KVS is an abstract **Scala Types** database that enables the construction of storage schemes centered around a **linked list of entities** (data feeds). It is supported by multiple backend storage engines, making it suitable for various needs. When used with the **RING backend**, it becomes a powerful tool for managing distributed data while ensuring **sequential consistency** when integrated with **FeedServer**. | ||
|
||
## Core Components | ||
|
||
### Services Handlers | ||
|
||
- **KVS**: An abstract data types **Key-Value Storage** for Scala value types, featuring a simple `put/get` API, as well as extended operations for managing data feeds. | ||
- **Backend Storage Engines**: | ||
- **RING**: A distributed, scalable, and fault-tolerant key-value store. | ||
- **LevelDB**: For non-clustered, single-node environments. | ||
- **Memory**: In-memory storage, useful for caching or testing. | ||
- **Filesystem**: Uses the filesystem for storage. | ||
|
||
### Services Handler Specificity | ||
|
||
Each service can operate on specific data types, requiring logic for serializing, pickling, or marshalling the data. Handlers ensure compatibility with particular data types during the serialization process. | ||
|
||
## Datastore JMX Interface | ||
|
||
The **Java Management Extensions (JMX)** interface allows for remote management of the KVS system. KVS registers two **MBeans**: `Kvs` and `Ring`. | ||
|
||
- **MBean Access**: Tools like `jconsole` can be used for interacting with these resources, typically located in `$JDKHOME/bin`. | ||
- **Core JMX Operations**: | ||
- **allStr(fid: String): String**: Returns a string representation of all entities within a specified feed. | ||
- **Kvs:save**: Initiates a save operation to create a zip archive of distributed data across the nodes. | ||
- **Kvs:load(path)**: Loads data from a previously saved archive, ensuring quorum configuration is satisfied during the write process. | ||
- **Ring:get(key: String): String**: Retrieves a value by its key. | ||
- **Ring:put(key: String, data: String): String**: Inserts a value associated with a key (primarily for testing purposes). | ||
- **Ring:delete(key: String)**: Deletes the value associated with a key. | ||
|
||
> **Note**: RNG becomes readonly during the save and load processes to maintain consistency. | ||
## Features and Capabilities | ||
|
||
- **Sequential Consistency**: Managed through the **FeedServer**, KVS ensures sequential consistency in operations. | ||
- **Linked Lists**: Data entries in KVS are stored as a doubly-linked list, enabling easy navigation and manipulation of entities. | ||
- **Scala Pickling**: Serialization is managed using **Scala Pickling**, which requires defining picklers at compile-time for your data. | ||
- **Backend Flexibility**: The system supports multiple backends (LevelDB, RING, Memory), allowing it to be tailored for different use cases. | ||
- **Scalaz Tagged Types**: Leverages **Scalaz** tagged types for enhanced type safety at compile-time, ensuring operations on incompatible data types are caught early. | ||
|
||
## Distributed Data Handling | ||
|
||
### RING Datastore | ||
|
||
**RING** (or RNG) is a distributed **key-value store** inspired by Amazon’s **Dynamo**. It is implemented using **Akka** for high availability, fault tolerance, and scalability. | ||
|
||
#### Configuration Options | ||
|
||
- **Quorum Configuration**: Defined by the parameters `N`, `W`, and `R`: | ||
- **N**: Total number of replicas. | ||
- **R**: Number of replicas required for a successful read. | ||
- **W**: Number of replicas required for a successful write. | ||
|
||
**Rules for Consistency**: | ||
- \( R + W > N \) | ||
- \( W > \frac{N}{2} \) | ||
|
||
#### Consistent Hashing | ||
|
||
RING uses **consistent hashing** to distribute data across the cluster, ensuring minimal data movement when nodes join or leave. | ||
|
||
#### Vector Clocks | ||
|
||
**Vector clocks** are employed to maintain a partial ordering of events across the cluster, detecting conflicts during write operations and resolving them during reads. | ||
|
||
#### Fault Tolerance and Availability | ||
|
||
**Gossip protocols** are utilized for membership and failure detection, ensuring fault tolerance and resilience across the cluster. | ||
|
||
#### Default Configuration | ||
|
||
- **Quorum**: `[1, 1, 1]` | ||
- **Buckets**: `1024` | ||
- **Virtual Nodes**: `128` | ||
- **Hash Length**: `32` | ||
- **Gather Timeout**: `3 seconds` | ||
|
||
> This configuration ensures that the system works out-of-the-box for a single-node deployment. | ||
## Metrics and Monitoring | ||
|
||
### Chain Replication | ||
KVS/RING exposes various metrics to track system health and load: | ||
|
||
[Chain Replication in Theory and in Practice](http://www.snookles.com/scott/publications/erlang2010-slf.pdf) | ||
- **Disk/Memory Usage**: Tracks available disk space, file descriptors, swap usage, and IO wait times. | ||
- **Read/Write Operations**: Monitors consistent reads and writes coordinated by each node. | ||
- **Network Throughput**: Tracks latency and general network health. | ||
- **Search Metrics**: Provides insights into indexing errors and search query performance. | ||
|
||
[Chain Replication for Supporting High Throughput and Availability](http://www.cs.cornell.edu/home/rvr/papers/OSDI04.pdf) | ||
These metrics can be monitored through the **JMX interface**, providing detailed insights into the system's performance. | ||
|
||
[High-throughput chain replication for read-mostly workload](https://www.cs.princeton.edu/courses/archive/fall15/cos518/papers/craq.pdf) | ||
## Future Improvements | ||
|
||
[Leveraging Sharding in the Design of Scalable Replication Protocols](https://ymsir.com/papers/sharding-socc.pdf) | ||
To further enhance KVS, consider implementing the following: | ||
|
||
[Byzantine Chain Replication](http://www.cs.cornell.edu/home/rvr/newpapers/opodis2012.pdf) | ||
1. **Secondary Indexes**: Support for secondary indexing to improve the performance of complex queries. | ||
2. **Schema Versioning**: Implementing schema versioning would enable backward compatibility during upgrades. | ||
3. **Better Error Handling**: Provide more detailed feedback during quorum failures and node crashes. | ||
4. **Improved Serialization**: Explore alternatives to Scala Pickling for more efficient serialization, especially in cross-version compatibility scenarios. | ||
|
||
### Consensus Algorithm | ||
## Conclusion | ||
|
||
[RAFT](https://raft.github.io/raft.pdf) | ||
[SWIM](https://www.cs.cornell.edu/projects/Quicksilver/public_pdfs/SWIM.pdf) | ||
KVS is a flexible, distributed, and fault-tolerant key-value storage system designed for high scalability and sequential consistency. Its modular architecture and support for various backends make it a versatile choice for different data storage requirements. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,53 +1,53 @@ | ||
val scalav = "3.2.2" | ||
val zio = "2.0.10" | ||
val akka = "2.6.20" | ||
val rocks = "7.10.2" | ||
val protoj = "3.22.2" | ||
val lucene = "8.11.2" | ||
val scalav = "3.3.3" | ||
val zio = "2.1.9" | ||
val pekko = "1.1.1" | ||
val rocks = "9.6.1" | ||
val protoj = "4.28.2" | ||
val lucene = "9.11.1" | ||
|
||
lazy val root = project.in(file(".") ).aggregate(kvs) | ||
lazy val `kvs-root` = project.in(file(".")).settings( | ||
scalaVersion := scalav | ||
, libraryDependencies ++= Seq( | ||
"dev.zio" %% "zio-test-sbt" % zio % Test | ||
, "org.apache.pekko" %% "pekko-cluster-sharding" % pekko | ||
) | ||
, scalacOptions ++= Seq( | ||
"-language:strictEquality" | ||
, "-Wunused:imports" | ||
, "-Xfatal-warnings" | ||
, "-Yexplicit-nulls" | ||
) | ||
, run / fork := true | ||
, run / javaOptions += "--add-modules=jdk.incubator.vector" | ||
, run / connectInput := true | ||
).dependsOn(kvs).aggregate(kvs) | ||
|
||
lazy val kvs = project.in(file("kvs")).settings( | ||
scalaVersion := scalav | ||
, libraryDependencies ++= Seq( | ||
"com.typesafe.akka" % "akka-cluster-sharding_2.13" % akka | ||
, "com.typesafe.akka" % "akka-slf4j_2.13" % akka | ||
, "ch.qos.logback" % "logback-classic" % "1.4.5" | ||
, "com.github.jnr" % "jnr-ffi" % "2.2.2" | ||
, "org.apache.lucene" % "lucene-analyzers-common" % lucene | ||
, "dev.zio" %% "zio" % zio | ||
, "dev.zio" %% "zio-nio" % "2.0.0" | ||
"dev.zio" %% "zio-streams" % zio | ||
, "dev.zio" %% "zio-test-sbt" % zio % Test | ||
, "org.apache.lucene" % "lucene-analysis-common" % lucene | ||
, "org.apache.pekko" %% "pekko-cluster-sharding" % pekko | ||
, "org.rocksdb" % "rocksdbjni" % rocks | ||
, "org.scalatest" %% "scalatest" % "3.2.14" % Test | ||
, "com.typesafe.akka" % "akka-testkit_2.13" % akka % Test | ||
) | ||
, scalacOptions ++= scalacOptions3 | ||
, testFrameworks += new TestFramework("zio.test.sbt.ZTestFramework") | ||
, Test / fork := true | ||
, scalacOptions ++= Seq( | ||
"-language:strictEquality" | ||
, "-Wunused:imports" | ||
, "-Xfatal-warnings" | ||
, "-Yexplicit-nulls" | ||
) | ||
).dependsOn(proto) | ||
|
||
lazy val proto = project.in(file("deps/proto/proto")).settings( | ||
scalaVersion := scalav | ||
, crossScalaVersions := scalav :: Nil | ||
, libraryDependencies += "com.google.protobuf" % "protobuf-java" % protoj | ||
).dependsOn(protoops) | ||
).dependsOn(`proto-syntax`) | ||
|
||
lazy val protoops = project.in(file("deps/proto/ops")).settings( | ||
lazy val `proto-syntax` = project.in(file("deps/proto/syntax")).settings( | ||
scalaVersion := scalav | ||
, crossScalaVersions := scalav :: Nil | ||
).dependsOn(protosyntax) | ||
|
||
lazy val protosyntax = project.in(file("deps/proto/syntax")).settings( | ||
scalaVersion := scalav | ||
, crossScalaVersions := scalav :: Nil | ||
) | ||
|
||
val scalacOptions3 = Seq( | ||
"-source:future", "-nowarn" | ||
, "-language:strictEquality" | ||
, "-language:postfixOps" | ||
, "-Yexplicit-nulls" | ||
, "-encoding", "UTF-8" | ||
) | ||
|
||
turbo := true | ||
useCoursier := true | ||
Global / onChangedBuildSource := ReloadOnSourceChanges |
Submodule proto
updated
56 files
This file was deleted.
Oops, something went wrong.
This file was deleted.
Oops, something went wrong.
This file was deleted.
Oops, something went wrong.
Oops, something went wrong.