add sharding, search, sorting, soft removal
tellnobody1 committed Sep 28, 2024
1 parent 7324c68 commit a13f22a
Showing 92 changed files with 2,181 additions and 3,867 deletions.
17 changes: 7 additions & 10 deletions .github/workflows/test.yml
@@ -1,19 +1,16 @@
name: test

on: [ push ]
on: push

jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2.3.4
- uses: actions/checkout@v4.1.1
with:
clean: false
fetch-depth: 0 # with tags
submodules: 'recursive'
- uses: actions/[email protected]
submodules: recursive
- uses: actions/[email protected]
with:
java-version: 11
- env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: sbt +compile +test
java-version: 22
distribution: temurin
- run: sbt test
10 changes: 3 additions & 7 deletions .gitignore
@@ -1,9 +1,5 @@
.DS_Store
target/
/tmp
/.bsp/
.bloop
.metals
.vscode
project/project
project/metals.sbt
/data/example-*/
/data/test-*/
target/
2 changes: 1 addition & 1 deletion .gitmodules
@@ -1,4 +1,4 @@
[submodule "proto"]
path = deps/proto
url = https://github.com/zero-deps/proto
url = git@github.com:zero-deps/proto.git
branch = main
2 changes: 1 addition & 1 deletion .sbtopts
@@ -1 +1 @@
-J-XX:MaxMetaspaceSize=512m
-J-XX:MaxMetaspaceSize=1g
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2021 zero-deps
Copyright (c) 2015–2024 zero-deps

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
146 changes: 119 additions & 27 deletions README.md
@@ -1,49 +1,141 @@
# Abstract scala type database
# KVS (Key-Value Storage)

![ci](https://github.com/zero-deps/kvs/workflows/ci/badge.svg)
## Scala Abstract Type Database

Abstract Scala storage framework with high-level API for handling linked lists of polymorphic data (feeds).
![Production Ready](https://img.shields.io/badge/Project%20Stage-Production%20Ready%20(main)-brightgreen.svg)
![Development](https://img.shields.io/badge/Project%20Stage-Development%20(next)-yellowgreen.svg)

KVS is highly available distributed (AP) strong eventual consistent (SEC) and sequentially consistent (via cluster sharding) storage. It is used for data from sport and games events. In some configurations used as distributed network file system. Also can be a generic storage for application.
[![Paper](https://img.shields.io/badge/paper-pdf-lightgrey)](https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf)

Designed with various backends in mind and to work in pure JVM environment. Implementation based on top of KAI (implementation of Amazon DynamoDB in Erlang) port with modification to use akka-cluster infrastructure.
This open-source project is an abstract storage framework in Scala with a high-level API for managing linked lists of polymorphic data, called feeds. KVS (Key-Value Storage) is a highly available, distributed (AP) store with strong eventual consistency (SEC), plus sequential consistency through cluster sharding. It is used mainly for data from sports and gaming events, but it can also serve as a distributed network file system or as general-purpose application storage.

Currently main backend is RocksDB to support embedded setup alongside application. Feed API (add/entries/remove) is built on top of Key-Value API (put/get/delete).
KVS is designed to support multiple backend implementations and to run in a pure JVM environment. The implementation is based on a port of KAI (an Erlang implementation of Amazon Dynamo), adapted to use the pekko-cluster infrastructure.

At its core, KVS relies on RocksDB as the primary backend, enabling seamless integration in embedded setups alongside applications. The central Feed API, facilitating operations like addition, entry retrieval, and removal, is constructed upon the foundation of the Key-Value API, which includes functions for putting, getting, and deleting data.
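
The layering described above (a Feed API built on put/get/delete) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `MemKv`, `Feed`, the key layout, and the `prev|payload` encoding are hypothetical and are not the KVS implementation. Only `add` and `entries` are shown, and the list is singly linked to keep the example short.

```scala
import scala.collection.mutable

// Hypothetical in-memory stand-in for a Key-Value API (put/get/delete).
final class MemKv {
  private val data = mutable.Map.empty[String, String]
  def put(k: String, v: String): Unit = data.update(k, v)
  def get(k: String): Option[String] = data.get(k)
  def delete(k: String): Unit = { data -= k; () }
}

// Hypothetical feed: a linked list stored as ordinary key-value pairs.
// Each entry holds the id of the previous entry followed by its payload.
final class Feed(kv: MemKv, fid: String) {
  private val headKey = s"$fid/head"
  private val seqKey  = s"$fid/seq"

  // Appends a value and returns its id; the new entry becomes the head.
  def add(value: String): Long = {
    val id   = kv.get(seqKey).map(_.toLong).getOrElse(0L) + 1
    val prev = kv.get(headKey).getOrElse("")
    kv.put(s"$fid/$id", s"$prev|$value")
    kv.put(seqKey, id.toString)
    kv.put(headKey, id.toString)
    id
  }

  // Walks the links from the head, so the newest entry comes first.
  def entries: List[String] = {
    @annotation.tailrec
    def loop(id: String, acc: List[String]): List[String] =
      if (id.isEmpty) acc.reverse
      else kv.get(s"$fid/$id") match {
        case Some(raw) =>
          val parts = raw.split("\\|", 2) // parts(0) = previous id, parts(1) = payload
          loop(parts(0), parts(1) :: acc)
        case None => acc.reverse
      }
    loop(kv.get(headKey).getOrElse(""), Nil)
  }
}
```

In this sketch every feed operation reduces to a handful of `get` and `put` calls, which is why a feed can run on any backend that offers the basic key-value operations.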

## Usage

Add project as a git module.
Add the project as a git submodule, or publish it to a repository of your choice.

## Backend
## Components

* Ring
* RocksDB
* Memory
* FS
* SQL
* etc.
* `rng`: Establishes a Ring structure using Pekko Cluster
* `search`: Offers Search over KVS functionality
* `consistency`: Sequential consistency for complex inserts (e.g. feeds)
* `feed`: Introduces the Feed over Ring concept
* `sharding`: Pekko's Cluster Sharding abstraction for sequential consistency
* `sort`: Implements a Sorted Set on Ring

## Test
## Test and Demo

```bash
sbt> test
```sh
sbt test
sbt run
```

## Resources
# Documentation

## Abstract

KVS is an abstract **Scala Types** database that enables the construction of storage schemes centered around a **linked list of entities** (data feeds). It is supported by multiple backend storage engines, making it suitable for various needs. When used with the **RING backend**, it becomes a powerful tool for managing distributed data while ensuring **sequential consistency** when integrated with **FeedServer**.

## Core Components

### Service Handlers

- **KVS**: A **Key-Value Storage** for abstract Scala data types, with a simple `put/get` API and extended operations for managing data feeds.
- **Backend Storage Engines**:
  - **RING**: A distributed, scalable, and fault-tolerant key-value store.
  - **LevelDB**: For non-clustered, single-node environments.
  - **Memory**: In-memory storage, useful for caching or testing.
  - **Filesystem**: Uses the filesystem for storage.

### Service Handler Specificity

Each service can operate on specific data types, requiring logic for serializing, pickling, or marshalling the data. Handlers ensure compatibility with particular data types during the serialization process.

## Datastore JMX Interface

The **Java Management Extensions (JMX)** interface allows for remote management of the KVS system. KVS registers two **MBeans**: `Kvs` and `Ring`.

- **MBean Access**: Tools like `jconsole` can be used for interacting with these resources, typically located in `$JDKHOME/bin`.
- **Core JMX Operations**:
  - **allStr(fid: String): String**: Returns a string representation of all entities within a specified feed.
  - **Kvs:save**: Initiates a save operation to create a zip archive of distributed data across the nodes.
  - **Kvs:load(path)**: Loads data from a previously saved archive, ensuring quorum configuration is satisfied during the write process.
  - **Ring:get(key: String): String**: Retrieves a value by its key.
  - **Ring:put(key: String, data: String): String**: Inserts a value associated with a key (primarily for testing purposes).
  - **Ring:delete(key: String)**: Deletes the value associated with a key.

> **Note**: RNG becomes readonly during the save and load processes to maintain consistency.
## Features and Capabilities

- **Sequential Consistency**: Managed through the **FeedServer**, KVS ensures sequential consistency in operations.
- **Linked Lists**: Data entries in KVS are stored as a doubly-linked list, enabling easy navigation and manipulation of entities.
- **Scala Pickling**: Serialization is managed using **Scala Pickling**, which requires defining picklers at compile-time for your data.
- **Backend Flexibility**: The system supports multiple backends (LevelDB, RING, Memory), allowing it to be tailored for different use cases.
- **Scalaz Tagged Types**: Leverages **Scalaz** tagged types for enhanced type safety at compile-time, ensuring operations on incompatible data types are caught early.

## Distributed Data Handling

### RING Datastore

**RING** (or RNG) is a distributed **key-value store** inspired by Amazon’s **Dynamo**. It is implemented using **Pekko** for high availability, fault tolerance, and scalability.

#### Configuration Options

- **Quorum Configuration**: Defined by the parameters `N`, `W`, and `R`:
  - **N**: Total number of replicas.
  - **R**: Number of replicas required for a successful read.
  - **W**: Number of replicas required for a successful write.

**Rules for Consistency**:
- \( R + W > N \)
- \( W > \frac{N}{2} \)
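
The two rules above can be checked mechanically. The first says that every read set shares at least one replica with every write set; the second says that any two write sets share a replica. The following is a minimal sketch, assuming a hypothetical `Quorum` case class that is not part of the KVS API.

```scala
// Hypothetical helper for illustration; not part of the KVS API.
final case class Quorum(n: Int, w: Int, r: Int) {
  // R + W > N: every read set overlaps every write set in at least one replica.
  def readOverlapsWrite: Boolean = r + w > n

  // W > N/2, written as 2W > N to avoid integer division.
  def writesOverlap: Boolean = 2 * w > n

  // Both rules hold and every count lies within 1..N.
  def isConsistent: Boolean =
    n >= 1 && (1 to n).contains(w) && (1 to n).contains(r) &&
      readOverlapsWrite && writesOverlap
}
```

For example, `Quorum(3, 2, 2).isConsistent` is true, while `Quorum(3, 1, 1).isConsistent` is false because a read of one replica may miss the latest write.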

#### Consistent Hashing

RING uses **consistent hashing** to distribute data across the cluster, ensuring minimal data movement when nodes join or leave.
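
To make the "minimal data movement" property concrete, here is a small sketch of a consistent-hash ring with virtual nodes. It is an illustration under assumptions, not RING's implementation: the `HashRing` type, the use of `MurmurHash3`, and the `node#i` naming of virtual nodes are all hypothetical, and hash collisions between virtual nodes are ignored.

```scala
import scala.collection.immutable.TreeMap
import scala.util.hashing.MurmurHash3

// Hypothetical ring: maps positions on a hash circle to node names.
final case class HashRing(ring: TreeMap[Int, String], vnodes: Int) {
  // Places `vnodes` virtual nodes for `node` on the circle.
  def add(node: String): HashRing =
    copy(ring = ring ++ (0 until vnodes).map(i => MurmurHash3.stringHash(s"$node#$i") -> node))

  // Removes every virtual node that belongs to `node`.
  def remove(node: String): HashRing =
    copy(ring = ring.filter { case (_, n) => n != node })

  // A key belongs to the first virtual node at or after its hash,
  // wrapping around to the start of the circle if there is none.
  def nodeFor(key: String): Option[String] = {
    val h = MurmurHash3.stringHash(key)
    ring.rangeFrom(h).headOption.orElse(ring.headOption).map(_._2)
  }
}

object HashRing {
  def empty(vnodes: Int): HashRing = HashRing(TreeMap.empty[Int, String], vnodes)
}
```

When a node is removed from this ring, only the keys that were assigned to that node change owner; keys owned by the remaining nodes stay where they were.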

#### Vector Clocks

**Vector clocks** are employed to maintain a partial ordering of events across the cluster, detecting conflicts during write operations and resolving them during reads.
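
The partial ordering mentioned above can be sketched in a few lines. This is a generic vector clock written for illustration, assuming a hypothetical `VectorClock` type keyed by node name; it is not the data structure used inside RING.

```scala
// Hypothetical vector clock: one counter per node name.
final case class VectorClock(counters: Map[String, Long] = Map.empty) {
  // Records a local event on `node`.
  def tick(node: String): VectorClock =
    VectorClock(counters.updated(node, counters.getOrElse(node, 0L) + 1))

  // Pointwise maximum: the smallest clock that dominates both inputs.
  def merge(that: VectorClock): VectorClock =
    VectorClock((counters.keySet ++ that.counters.keySet).map { n =>
      n -> math.max(counters.getOrElse(n, 0L), that.counters.getOrElse(n, 0L))
    }.toMap)

  // True when every counter here is at most the matching counter in `that`.
  def <=(that: VectorClock): Boolean =
    counters.forall { case (n, c) => c <= that.counters.getOrElse(n, 0L) }

  // Neither clock dominates the other: the two versions conflict.
  def concurrentWith(that: VectorClock): Boolean =
    !(this <= that) && !(that <= this)
}
```

Two writes that start from the same version but are accepted by different nodes produce clocks for which `concurrentWith` is true; a reader can detect this and resolve the conflict, after which `merge` yields a clock that dominates both versions.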

#### Fault Tolerance and Availability

**Gossip protocols** are utilized for membership and failure detection, ensuring fault tolerance and resilience across the cluster.

#### Default Configuration

- **Quorum**: `[1, 1, 1]`
- **Buckets**: `1024`
- **Virtual Nodes**: `128`
- **Hash Length**: `32`
- **Gather Timeout**: `3 seconds`

> This configuration ensures that the system works out-of-the-box for a single-node deployment.
## Metrics and Monitoring

### Chain Replication
KVS/RING exposes various metrics to track system health and load:

[Chain Replication in Theory and in Practice](http://www.snookles.com/scott/publications/erlang2010-slf.pdf)
- **Disk/Memory Usage**: Tracks available disk space, file descriptors, swap usage, and IO wait times.
- **Read/Write Operations**: Monitors consistent reads and writes coordinated by each node.
- **Network Throughput**: Tracks latency and general network health.
- **Search Metrics**: Provides insights into indexing errors and search query performance.

[Chain Replication for Supporting High Throughput and Availability](http://www.cs.cornell.edu/home/rvr/papers/OSDI04.pdf)
These metrics can be monitored through the **JMX interface**, providing detailed insights into the system's performance.

[High-throughput chain replication for read-mostly workload](https://www.cs.princeton.edu/courses/archive/fall15/cos518/papers/craq.pdf)
## Future Improvements

[Leveraging Sharding in the Design of Scalable Replication Protocols](https://ymsir.com/papers/sharding-socc.pdf)
To further enhance KVS, consider implementing the following:

[Byzantine Chain Replication](http://www.cs.cornell.edu/home/rvr/newpapers/opodis2012.pdf)
1. **Secondary Indexes**: Support for secondary indexing to improve the performance of complex queries.
2. **Schema Versioning**: Implementing schema versioning would enable backward compatibility during upgrades.
3. **Better Error Handling**: Provide more detailed feedback during quorum failures and node crashes.
4. **Improved Serialization**: Explore alternatives to Scala Pickling for more efficient serialization, especially in cross-version compatibility scenarios.

### Consensus Algorithm
## Conclusion

[RAFT](https://raft.github.io/raft.pdf)
[SWIM](https://www.cs.cornell.edu/projects/Quicksilver/public_pdfs/SWIM.pdf)
KVS is a flexible, distributed, and fault-tolerant key-value storage system designed for high scalability and sequential consistency. Its modular architecture and support for various backends make it a versatile choice for different data storage requirements.
72 changes: 36 additions & 36 deletions build.sbt
@@ -1,53 +1,53 @@
val scalav = "3.2.2"
val zio = "2.0.10"
val akka = "2.6.20"
val rocks = "7.10.2"
val protoj = "3.22.2"
val lucene = "8.11.2"
val scalav = "3.3.3"
val zio = "2.1.9"
val pekko = "1.1.1"
val rocks = "9.6.1"
val protoj = "4.28.2"
val lucene = "9.11.1"

lazy val root = project.in(file(".") ).aggregate(kvs)
lazy val `kvs-root` = project.in(file(".")).settings(
scalaVersion := scalav
, libraryDependencies ++= Seq(
"dev.zio" %% "zio-test-sbt" % zio % Test
, "org.apache.pekko" %% "pekko-cluster-sharding" % pekko
)
, scalacOptions ++= Seq(
"-language:strictEquality"
, "-Wunused:imports"
, "-Xfatal-warnings"
, "-Yexplicit-nulls"
)
, run / fork := true
, run / javaOptions += "--add-modules=jdk.incubator.vector"
, run / connectInput := true
).dependsOn(kvs).aggregate(kvs)

lazy val kvs = project.in(file("kvs")).settings(
scalaVersion := scalav
, libraryDependencies ++= Seq(
"com.typesafe.akka" % "akka-cluster-sharding_2.13" % akka
, "com.typesafe.akka" % "akka-slf4j_2.13" % akka
, "ch.qos.logback" % "logback-classic" % "1.4.5"
, "com.github.jnr" % "jnr-ffi" % "2.2.2"
, "org.apache.lucene" % "lucene-analyzers-common" % lucene
, "dev.zio" %% "zio" % zio
, "dev.zio" %% "zio-nio" % "2.0.0"
"dev.zio" %% "zio-streams" % zio
, "dev.zio" %% "zio-test-sbt" % zio % Test
, "org.apache.lucene" % "lucene-analysis-common" % lucene
, "org.apache.pekko" %% "pekko-cluster-sharding" % pekko
, "org.rocksdb" % "rocksdbjni" % rocks
, "org.scalatest" %% "scalatest" % "3.2.14" % Test
, "com.typesafe.akka" % "akka-testkit_2.13" % akka % Test
)
, scalacOptions ++= scalacOptions3
, testFrameworks += new TestFramework("zio.test.sbt.ZTestFramework")
, Test / fork := true
, scalacOptions ++= Seq(
"-language:strictEquality"
, "-Wunused:imports"
, "-Xfatal-warnings"
, "-Yexplicit-nulls"
)
).dependsOn(proto)

lazy val proto = project.in(file("deps/proto/proto")).settings(
scalaVersion := scalav
, crossScalaVersions := scalav :: Nil
, libraryDependencies += "com.google.protobuf" % "protobuf-java" % protoj
).dependsOn(protoops)
).dependsOn(`proto-syntax`)

lazy val protoops = project.in(file("deps/proto/ops")).settings(
lazy val `proto-syntax` = project.in(file("deps/proto/syntax")).settings(
scalaVersion := scalav
, crossScalaVersions := scalav :: Nil
).dependsOn(protosyntax)

lazy val protosyntax = project.in(file("deps/proto/syntax")).settings(
scalaVersion := scalav
, crossScalaVersions := scalav :: Nil
)

val scalacOptions3 = Seq(
"-source:future", "-nowarn"
, "-language:strictEquality"
, "-language:postfixOps"
, "-Yexplicit-nulls"
, "-encoding", "UTF-8"
)

turbo := true
useCoursier := true
Global / onChangedBuildSource := ReloadOnSourceChanges
2 changes: 1 addition & 1 deletion deps/proto
Submodule proto updated 56 files
+9 −17 .github/workflows/test.yml
+1 −6 .gitignore
+2 −1 .sbtopts
+5 −0 .sdkmanrc
+1 −1 LICENSE
+42 −12 README.md
+0 −214 bench/src/main/scala-2.13/data.scala
+0 −72 bench/src/main/scala-2.13/msg.scala
+95 −0 bench/src/main/scala-3/data.scala
+21 −0 bench/src/main/scala-3/msg.scala
+100 −54 build.sbt
+1 −1 project/build.properties
+10 −3 project/plugins.sbt
+303 −0 proto/src/main/scala-2.11/BuildCodec.scala
+158 −0 proto/src/main/scala-2.11/Common.scala
+30 −0 proto/src/main/scala-2.11/api.scala
+14 −0 proto/src/main/scala-2.11/codec.scala
+233 −0 proto/src/main/scala-2.11/macrosapi.scala
+1 −1 proto/src/main/scala-2.12/BuildCodec.scala
+4 −0 proto/src/main/scala-2.12/Common.scala
+1 −1 proto/src/main/scala-2.13/BuildCodec.scala
+4 −0 proto/src/main/scala-2.13/Common.scala
+9 −11 proto/src/main/scala-3/BuildCodec.scala
+4 −0 proto/src/main/scala-3/Common.scala
+17 −19 proto/src/main/scala-3/macrosapi.scala
+520 −0 proto/src/test/scala-2.11/testing.scala
+27 −27 proto/src/test/scala-2.13/testing.scala
+0 −0 purs/src/main/scala-2.13/ops.scala
+0 −11 purs/src/main/scala-3.0/io.scala
+0 −252 purs/src/main/scala-3.0/purs.scala
+82 −0 purs/src/main/scala-3/Common.scala
+308 −0 purs/src/main/scala-3/decoders.scala
+154 −0 purs/src/main/scala-3/encoders.scala
+13 −0 purs/src/main/scala-3/io.scala
+172 −0 purs/src/main/scala-3/main.scala
+224 −0 purs/src/main/scala-3/ops.scala
+85 −0 purs/src/main/scala-3/purs.scala
+6 −0 purs/src/main/scala-3/tpe.scala
+26 −0 purs/src/test/scala-3/default.test.scala
+25 −0 purs/src/test/scala-3/enum.test.scala
+28 −0 purs/src/test/scala-3/eq.test.scala
+135 −0 purs/src/test/scala-3/purs.test.scala
+26 −0 purs/src/test/scala-3/setmap.test.scala
+499 −463 purs/test/bin/compiled.js
+2,247 −420 purs/test/package-lock.json
+2 −3 purs/test/package.json
+2 −107 purs/test/packages.dhall
+13 −5 purs/test/spago.dhall
+9 −4 tex/src/main/scala-2.13/main.scala
+216 −0 tex/src/main/scala-2.13/ops.scala
+203 −2 tex/src/main/scala-2.13/tex.scala
+82 −0 tex/src/main/scala-3/Common.scala
+7 −0 tex/src/main/scala-3/ext.scala
+94 −0 tex/src/main/scala-3/main.scala
+224 −0 tex/src/main/scala-3/ops.scala
+326 −0 tex/src/main/scala-3/tex.scala
1 change: 0 additions & 1 deletion docs/.gitignore

This file was deleted.

6 changes: 0 additions & 6 deletions docs/about.tex

This file was deleted.

18 changes: 0 additions & 18 deletions docs/info.tex

This file was deleted.
