Skip to content
/ Jungle Public
forked from eBay/Jungle

An embedded key-value store library specialized for building state machine and log store

License

Notifications You must be signed in to change notification settings

ZexiLiu/Jungle

 
 

Repository files navigation

Jungle

build codecov

Embedded key-value storage library, based on a combined index of LSM-tree and copy-on-write (append-only) B+tree. Please refer to our paper.

Jungle is specialized for building replicated state machine of consensus protocols such as Paxos or Raft, by providing chronological ordering and lightweight persistent snapshot. It can be also used for building log store.

Features

  • Ordered mapping of key and its value on disk (file system). Both key and value are arbitrary length binary.
  • Monotonically increasing sequence number for each key-value modification.
  • Point lookup on both key and sequence number.
  • Range lookup on both key and sequence number, by using iterator:
    • Snapshot isolation: each individual iterator is a snapshot.
    • Bi-directional traversal and jump: prev, next, gotoBegin, gotoEnd, and seek.
  • Lightweight persistent snapshot, based on sequence number:
    • Nearly no overhead for the creation of a snapshot.
    • Snapshots are durable; preserved even after process restart.
  • Tunable configurations:
    • The number of threads for log flushing and compaction.
    • Custom size ratio between LSM levels.
    • Compaction factor (please refer to the paper).
  • Log store mode:
    • Ordered mapping of sequence number and value, eliminating key indexing.
    • Lightweight log truncation based on sequence number.

Things we DO NOT (and also WILL NOT) support

  • Secondary indexing, or SQL-like query:
    • Jungle will not understand the contents of value. Value is just a binary from Jungle's point of view.
  • Server-client style service, or all other network-involving tasks such as replication:
    • Jungle is a library that should be embedded into your process.

Benefits

Compared to other widely used LSM-based key-value storage libraries, benefits of Jungle are as follows:

  • Smaller write amplification.
    • Jungle will have 4-5 times less write amplification, while providing the similar level of write performance.
  • Chronological ordering of key-value pairs
    • Along with persistent logical snapshot, this feature is very useful when you use it as a replicated state machine for Paxos or Raft.

How to Build

1. Install cmake:

  • Ubuntu
$ sudo apt-get install cmake
  • OSX
$ brew install cmake

2. Build

jungle$ ./prepare.sh -j8
jungle$ mkdir build
jungle$ cd build
jungle/build$ cmake ../
jungle/build$ make

Run unit tests:

jungle/build$ ./runtests.sh

How to Use

Please refer to this document.

Example Implementation

Please refer to examples.

Supported Platforms

  • Ubuntu (tested on 14.04, 16.04, and 18.04)
  • Centos (tested on 7)
  • OSX (tested on 10.13 and 10.14)

Platforms will be supported in the future

  • Windows

Contributing to This Project

We welcome contributions. If you find any bugs, potential flaws and edge cases, improvements, new feature suggestions or discussions, please submit issues or pull requests.

Contact

Coding Convention

  • Recommended not to exceed 90 characters per line.
  • Indent: 4 spaces, K&R (1TBS).
  • Class & struct name: UpperCamelCase.
  • Member function and member variable name: lowerCamelCase.
  • Local variable, helper function, and parameter name: snake_case.
class MyClass {
public:
    void myFunction(int my_parameter) {
        int local_var = my_parameter + 1;
        if (local_var < myVariable) {
            // ...
        } else {
            // ...
        }
    }
private:
    int myVariable;
};

int helper_function() {
    return 0;
}
  • Header include order: local to global.
    1. Header file corresponding to this source file (if applicable).
    2. Header files in the same project (i.e., Jungle).
    3. Header files from the other projects.
    4. C++ system header files.
    5. C system header files.
    • Note: alphabetical order within the same category.
    • Example (my_file.cc):
#include "my_file.h"            // Corresponding header file.

#include "table_file.h"         // Header files in the same project.
#include "table_helper.h"

#include "forestdb.h"           // Header files from the other projects.

#include <cassert>              // C++ header files.
#include <iostream>
#include <vector>

#include <sys/stat.h>           // C header files.
#include <sys/types.h>
#include <unistd.h>

License Information

Copyright 2017-present eBay Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

3rd Party Code

  1. URL: https://github.com/couchbase/forestdb
    License: https://github.com/couchbase/forestdb/blob/master/LICENSE
    Originally licensed under the Apache 2.0 license.

  2. URL: https://github.com/stbrumme/crc32
    Original Copyright 2011-2016 Stephan Brumme
    See Original ZLib License: https://github.com/stbrumme/crc32/blob/master/LICENSE

  3. URL: https://github.com/greensky00/simple_logger
    License: https://github.com/greensky00/simple_logger/blob/master/LICENSE
    Originally licensed under the MIT license.

  4. URL: https://github.com/greensky00/testsuite
    License: https://github.com/greensky00/testsuite/blob/master/LICENSE
    Originally licensed under the MIT license.

  5. URL: https://github.com/greensky00/latency-collector
    License: https://github.com/greensky00/latency-collector/blob/master/LICENSE
    Originally licensed under the MIT license.

  6. URL: https://github.com/eriwen/lcov-to-cobertura-xml/blob/master/lcov_cobertura/lcov_cobertura.py
    License: https://github.com/eriwen/lcov-to-cobertura-xml/blob/master/LICENSE
    Copyright 2011-2012 Eric Wendelin
    Originally licensed under the Apache 2.0 license.

  7. URL: https://github.com/bilke/cmake-modules
    License: https://github.com/bilke/cmake-modules/blob/master/LICENSE_1_0.txt
    Copyright 2012-2017 Lars Bilke
    Originally licensed under the BSD license.

  8. URL: https://github.com/aappleby/smhasher/tree/master/src
    Copyright 2016 Austin Appleby
    Originally licensed under the MIT license.

About

An embedded key-value store library specialized for building state machine and log store

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 96.4%
  • CMake 1.4%
  • C 1.1%
  • Other 1.1%