Leasing implementation & source code refactorizations (#23)
* staging progress on adaptability exper

* staging progress on adaptability exper

* staging progress on adaptability exper

* using ifb; finish unbalanced experiment

* finish runtime adaptability exper

* staging progress on perf breakdown

* staging progress on perf breakdown

* finish breakdown & space exper

* staging progress on rs coding eval

* finish rs coding overhead exper

* minor updates to plotting scripts

* update perf over time figures

* big update to adaptability figure

* minor updates to setup scripts

* minor updates to plotting scripts

* minor updates to plotting scripts

* minor updates to plotting scripts

* add pdf cropper helper script

* add pdf cropper helper script

* update pdf cropping script

* minor updates to plotting scripts

* minor updates to plotting scripts

* staging progress on critical path expers

* minor updates to plotting scripts

* minor updates to plotting scripts

* staging progress on critical path expers

* remove unnecessary FillHoles & Commit messages

* almost finished critical path expers

* finished critical path expers & plotting

* finished critical path expers & plotting

* minor updates to TLA+ spec

* minor updates to plotting scripts

* minor updates to plotting scripts

* fixed sim read lease bug

* minor updates to plotting scripts

* minor updates to plotting scripts

* minor updates to plotting scripts

* add distributed machines helper scripts

* minor updates to plotting scripts

* make toml dependency optional for workflow

* staging progress on physical net expers

* add tidb workload profile results

* staging progress on changed bench parameters

* update gitignore

* minor updates to tla+ formatting

* minor updates to plotting scripts

* minor updates to plotting scripts

* minor updates to plotting scripts

* minor updates to plotting scripts

* add open_tcp_ports.sh script

* add auto iperf & ping script

* staging progress on physical net expers

* staging progress on physical net expers

* staging progress on physical net expers

* finished physical network exper =)

* finished physical network exper =)

* update motivation profiling plots

* add banal formal dump script

* minor updates to plotting scripts

* add WAN delays modeling calculation

* minor updates to WAN perf modeling script

* reorganize scripts into paper-specific folders

* reorganize scripts into paper-specific folders

* mv artifact memos to separate file

* adding SMR-style TLA+ spec

* adding command line TLC helper scripts

* optimizing SMR-style TLA+ spec & linearizability constraint

* completing SMR-style TLA+ spec

* completing SMR-style TLA+ spec

* complete SMR-style MultiPaxos TLA+ spec

* add README to MultiPaxos spec folder

* add SMR-style MultiPaxos spec blog

* minor updates to MultiPaxos spec README

* update SMR-style MultiPaxos spec and TLC script

* add node failure injection to MultiPaxos spec

* update Crossword TLA+ spec

* update Crossword TLA+ spec

* update Crossword TLA+ spec .gitignore

* extending SMR-style MultiPaxos spec

* extending SMR-style MultiPaxos spec

* finish extended SMR-style MultiPaxos spec

* staging progress on Bodega TLA+ spec

* staging progress on Bodega TLA+ spec

* finish Bodega TLA+ spec

* finish Bodega TLA+ spec

* finish Bodega TLA+ spec

* finish Bodega TLA+ spec

* test change

* scanning through the codebase

* scanning through the codebase

* add back committed condition check in Bodega spec

* cleaning up bench client

* add comments on port usage mess

* remove useless perf sim parameters

* minor updates to README

* add server module sync APIs

* fix wal_offset interference bug

* refactor MultiPaxos code to fix Prepare phase bug

* comment out the snapshotting test

* fixing the Prepare phase for RSPaxos

* fixing the Prepare phase for RSPaxos

* fixed the Prepare phase for RSPaxos

* fixed the Prepare phase for Crossword

* fix formatting issues

* updating CloudLab experiments support

* updating CloudLab experiments support

* updating CloudLab experiments support

* updating CloudLab experiments support

* updating CloudLab experiments support

* updating CloudLab experiments support

* updating CloudLab experiments support

* updating CloudLab experiments support

* updating CloudLab experiments support

* add remote_hosts.toml grouping

* restructured scripts utils library

* making better benchmarking scripts

* making better benchmarking scripts

* making better benchmarking scripts

* making better benchmarking scripts

* making better benchmarking scripts

* making better benchmarking scripts

* making better benchmarking scripts

* finish follower read staleness exper

* enable keyspace partitioning run mode

* enable keyspace partitioning for ChainPaxos

* enable keyspace partitioning for ChainPaxos

* enable keyspace partitioning for ChainPaxos

* enable keyspace partitioning for ChainPaxos

* minor updates to artifact readme

* working on improved YCSB exper

* working on improved YCSB exper

* finished improved YCSB trace exper

* finished improved YCSB trace exper

* update Crossword publish crop scripts

* wrapping up the improved evaluation

* update results archiving script

* add wan quorums model plotting for Bodega

* use k reqs unit for ycsb plot

* staging progress on chain replication impl

* staging progress on chain rep impl

* finished chain replication implementation

* finished chain replication implementation

* finished chain replication implementation

* fixing CI workflow issues

* fixing CI workflow issues

* minor update to README

* finishing final plotting updates for Crossword

* finishing final plotting updates for Crossword

* minor updates to bench script comments

* use clone_from() according to clippy

* update Bodega TLA+ model

* minor updates to cargo metadata

* modifications for code quality improvement

* modifications for code quality improvement

* use static OnceLock for logging identifier

* let formatter reorder imports automatically

* minor updates to README

* minor fix for logging & script ports

* minor updates to README

* minor updates to README

* adding public repo publish script

* adding public repo publish script

* adding public repo publish script

* adding public repo publish script

* adding public repo publish script

* adding public repo publish script

* bump rust dependencies versions

* replace messagepack with bincode

* remove unnecessary bind ports requirements

* bump tokio version and minor plot changes

* update setup scripts

* update setup scripts for cockroach

* update archive_results.sh script

* working on extra expers for reviews

* working on gossiping batching exper

* finish bw utilization exper

* finish bw utilization exper

* add cockroach automation scripts

* enabling TPC-C workload bench for cockroach

* working on cockroach tpcc experiment

* minor updates to comments

* minor updates to mirror script

* working on cockroach tpcc experiment

* working on cockroach tpcc experiment

* correct cockroach tpcc payload profiling

* correct cockroach tpcc payload profiling

* add protocol env var to cockroach script

* improve plot dimensions and spacing

* minor modifications to rscoding util

* minor updates to mirror script

* add RS coding timing logging switch

* minor updates to cockroach script

* minor updates to cockroach script

* adding settings to cockroach scripts

* minor updates to cockroach script

* adding settings to cockroach scripts

* add cockroach auto benchmark script

* finalize Crossword benchmark scripts

* cleaning up helper scripts

* cleaning up helper scripts

* cleaning up helper scripts

* adding zookeeper scripts

* enable zookeeper cluster helper scripts

* remove unnecessary home dir copy in geni scripts

* remove unnecessary home dir copy in geni scripts

* minor updates to toml parsing helper

* minor updates to toml parsing helper

* finishing up cockroachdb experiment

* adding zookeeper client support

* added zookeeper clients script

* update and finish zookeeper clients

* adding etcd cluster scripts

* adding etcd cluster scripts

* adding etcd cluster scripts

* added etcd clients script

* update bodega tla+ gitignore

* staging progress on leasing module

* staging progress on leasing module

* staging progress on leasing module

* finish leasing module implementation

* staging progress on MultiPaxos leader leases

* staging progress on MultiPaxos leader leases

* better Summerset module tasks code structure

* better heartbeats management & fix Raft metadata serde

* staging progress on MultiPaxos leader leases

* staging progress on MultiPaxos leader leases

* staging progress on MultiPaxos leader leases

* proper mutual_leases test & improve timed unit tests

* minor updates to leaseman module printing

* typo fixes across the codebase

* typo fixes across the codebase

* fix github workflow dependencies issues

* trimming for public repo

* trimming for public repo

* trimming for public repo

---------

Co-authored-by: Guanzhou Hu <[email protected]>
josehu07 and Guanzhou Hu authored Oct 26, 2024
1 parent 855dc5a commit eaa8738
Showing 114 changed files with 6,208 additions and 1,853 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/build.yml
@@ -16,6 +16,8 @@ jobs:

  steps:
  - uses: actions/checkout@v3
+ - name: Get apt dependencies
+   run: sudo apt-get install -y protobuf-compiler
  - name: Build
    run: cargo build --workspace --verbose
  - name: Add clippy component
4 changes: 3 additions & 1 deletion .github/workflows/tests_proc.yml
@@ -16,7 +16,9 @@ jobs:

  steps:
  - uses: actions/checkout@v3
- - name: Install 'toml' package
+ - name: Get apt dependencies
+   run: sudo apt-get install -y protobuf-compiler
+ - name: Get pip dependencies
    run: pip3 install toml
  - name: Run proc tests (MultiPaxos)
    run: python3 .github/workflow_test.py -p MultiPaxos
4 changes: 3 additions & 1 deletion .github/workflows/tests_unit.yml
@@ -16,5 +16,7 @@ jobs:

  steps:
  - uses: actions/checkout@v3
+ - name: Get apt dependencies
+   run: sudo apt-get install -y protobuf-compiler
  - name: Run all unit tests
-   run: cargo test --workspace --verbose
+   run: cargo test --workspace --verbose --lib --bins
40 changes: 30 additions & 10 deletions Cargo.toml
@@ -8,26 +8,46 @@ version = "0.1.0"
edition = "2021"
authors = ["Guanzhou Hu <[email protected]>"]

[dependencies]
async-trait = "0.1"
fixedbitset = { version = "0.5", features = ["serde"] }
rangemap = "1.5"
flashmap = "0.1"
bytes = { version = "1.7", features = ["serde"] }
futures = "0.3"
tokio = { version = "1.39", features = ["full"] }
[workspace.dependencies]
tokio = { version = "1.40", features = ["full"] }
rand = "0.8"
rand_distr = "0.4"
rangemap = "1.5"
lazy_static = "1.5"
bytes = { version = "1.7", features = ["serde"] }
serde = { version = "1.0", features = ["derive"] }
bincode = "1.3"
toml = { version = "0.8", features = ["parse"] }
log = "0.4"
env_logger = "0.11"
reed-solomon-erasure = { version = "6.0" }
clap = { version = "4.0", features = ["derive"] }
ctrlc = { version = "3.4", features = ["termination"] }
color-print = { version = "0.3", features = ["terminfo"] }
zookeeper-client = "0.8"
etcd-client = "0.14"

[dependencies]
tokio = { workspace = true }
rand = { workspace = true }
rangemap = { workspace = true }
lazy_static = { workspace = true }
bytes = { workspace = true }
serde = { workspace = true }
toml = { workspace = true }
log = { workspace = true }
env_logger = { workspace = true }
async-trait = "0.1"
fixedbitset = { version = "0.5", features = ["serde"] }
flashmap = "0.1"
futures = "0.3"
bincode = "1.3"
reed-solomon-erasure = { version = "6.0" }
get-size = { version = "0.1", features = ["derive"] }
linreg = "0.2"
statistical = "1.0"
# these are just for error conversion; could do it in a better way
ctrlc = { workspace = true }
zookeeper-client = { workspace = true }
etcd-client = { workspace = true }

[dev-dependencies]
criterion = "0.5"
15 changes: 3 additions & 12 deletions scripts/distr_clients.py
@@ -128,13 +128,11 @@ def run_clients(
config,
capture_stdout,
pin_cores,
- base_idx,
timeout_ms,
):
if num_clients < 1:
raise ValueError(f"invalid num_clients: {num_clients}")

- # assuming I am the machine to run manager
manager_pub_ip = ipaddrs[man]

# if dist_machs set, put clients round-robinly across this many machines
@@ -200,7 +198,7 @@ def run_clients(
"-c", "--config", type=str, help="protocol-specific TOML config string"
)
parser.add_argument(
- "-g", "--group", type=str, default="1dc", help="hosts group to run on"
+ "-g", "--group", type=str, default="reg", help="hosts group to run on"
)
parser.add_argument(
"--me", type=str, default="host0", help="main script runner's host nickname"
@@ -211,12 +209,6 @@ def run_clients(
parser.add_argument(
"--pin_cores", type=float, default=0, help="if not 0, set CPU cores affinity"
)
- parser.add_argument(
- "--base_idx",
- type=int,
- default=0,
- help="idx of the first client for calculating ports",
- )
parser.add_argument(
"--timeout_ms", type=int, default=5000, help="client-side request timeout"
)
@@ -328,7 +320,7 @@ def run_clients(
raise ValueError(f"invalid manager oracle's host {args.man}")

# check that the partition index is valid
- partition_in_args, partition, file_midfix = False, 0, args.file_midfix
+ partition_in_args, partition, file_midfix = False, 0, ""
if args.utility == "bench":
partition_in_args = "partition" in args
if partition_in_args and (args.partition < 0 or args.partition >= 5):
@@ -380,7 +372,6 @@ def run_clients(
args.config,
capture_stdout,
args.pin_cores,
- args.base_idx,
args.timeout_ms,
)

@@ -407,7 +398,7 @@ def run_clients(
rcs.append(client_proc.returncode)
except subprocess.TimeoutExpired:
if args.expect_halt: # mainly for failover experiments
- print("WARN: getting expected halt, exitting...")
+ print("WARN: getting expected halt, exiting...")
sys.exit(0)
raise RuntimeError(f"some client(s) timed-out {timeout} secs")

3 changes: 2 additions & 1 deletion scripts/distr_cluster.py
@@ -285,7 +285,7 @@ def launch_servers(
"-c", "--config", type=str, help="protocol-specific TOML config string"
)
parser.add_argument(
- "-g", "--group", type=str, default="1dc", help="hosts group to run on"
+ "-g", "--group", type=str, default="reg", help="hosts group to run on"
)
parser.add_argument(
"--me", type=str, default="host0", help="main script runner's host nickname"
@@ -397,6 +397,7 @@ def launch_servers(
# print_manager_t.start()

# then launch server replicas
+ print("Launching server processes...")
server_procs = launch_servers(
remotes,
ipaddrs,
10 changes: 1 addition & 9 deletions scripts/local_clients.py
@@ -112,7 +112,6 @@ def run_clients(
capture_stdout,
pin_cores,
use_veth,
- base_idx,
timeout_ms,
):
if num_clients < 1:
@@ -158,12 +157,6 @@ def run_clients(
parser.add_argument(
"--use_veth", action="store_true", help="if set, use netns and veth setting"
)
- parser.add_argument(
- "--base_idx",
- type=int,
- default=0,
- help="idx of the first client for calculating ports",
- )
parser.add_argument(
"--timeout_ms", type=int, default=5000, help="client-side request timeout"
)
@@ -281,7 +274,6 @@ def run_clients(
capture_stdout,
args.pin_cores,
args.use_veth,
- args.base_idx,
args.timeout_ms,
)

@@ -310,7 +302,7 @@ def run_clients(
rcs.append(client_proc.returncode)
except subprocess.TimeoutExpired:
if args.expect_halt: # mainly for failover experiments
- print("WARN: getting expected halt, exitting...")
+ print("WARN: getting expected halt, exiting...")
sys.exit(0)
raise RuntimeError(f"some client(s) timed-out {timeout} secs")

33 changes: 14 additions & 19 deletions scripts/remote_hosts.toml
@@ -1,23 +1,18 @@
# (DON'T CHANGE) remote base path and project repo folder name
- base_path = "/eval"
+ base_path = "/home/smr"
repo_name = "summerset"

- # (SET PROPERLY) for each group, its username @ DNS domain names
- [1dc]
+ # (SET PROPERLY) for each group, its DNS domain names
+ [reg]
+ host0 = "domain.cloudlab.us"
+ host1 = "domain.cloudlab.us"
+ host2 = "domain.cloudlab.us"
+ host3 = "domain.cloudlab.us"
+ host4 = "domain.cloudlab.us"

- host0 = "[email protected]"
- host1 = "[email protected]"
- host2 = "[email protected]"
- # host3 = "[email protected]"
- # host4 = "[email protected]"
- # host5 = "[email protected]"
- # host6 = "[email protected]"
- # host7 = "[email protected]"
- # host8 = "[email protected]"

- # [wan]
- # host0 = "[email protected]"
- # host1 = "[email protected]"
- # host2 = "[email protected]"
- # host3 = "[email protected]"
- # host4 = "[email protected]"
+ [wan]
+ host0 = "domain.cloudlab.us"
+ host1 = "domain.cloudlab.us"
+ host2 = "domain.cloudlab.us"
+ host3 = "domain.cloudlab.us"
+ host4 = "domain.cloudlab.us"
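
For readers unfamiliar with the hosts file layout, here is a minimal sketch of how a script could look up the hosts for one `--group` value (for example the new default `reg`) in a file shaped like `remote_hosts.toml`. The helper name `hosts_in_group` and the inline `EXAMPLE_TOML` string are hypothetical and used only for illustration; the repository's own parser is `utils.config.parse_toml_file`, whose internals are not shown in this diff.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ImportError:  # older interpreters: fall back to the 'toml' pip package
    import toml as tomllib

# Hypothetical stand-in for scripts/remote_hosts.toml (assumption, shortened).
EXAMPLE_TOML = """
base_path = "/home/smr"
repo_name = "summerset"

[reg]
host0 = "domain.cloudlab.us"
host1 = "domain.cloudlab.us"

[wan]
host0 = "domain.cloudlab.us"
"""


def hosts_in_group(toml_text, group):
    """Return {nickname: domain} for one hosts group (hypothetical helper)."""
    config = tomllib.loads(toml_text)
    # A group is a TOML table; top-level strings such as base_path are not groups.
    if group not in config or not isinstance(config[group], dict):
        raise ValueError(f"invalid hosts group name '{group}'")
    return dict(config[group])


if __name__ == "__main__":
    print(hosts_in_group(EXAMPLE_TOML, "reg"))
```

With this layout, a stale `-g 1dc` argument would fail fast with a `ValueError` instead of silently matching nothing, which is one reason to validate the group name before launching anything remotely.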
2 changes: 1 addition & 1 deletion scripts/remote_iperf.py
@@ -70,7 +70,7 @@ def ping_test(remotes, domains, na, nb):

parser = argparse.ArgumentParser(allow_abbrev=False)
parser.add_argument(
- "-g", "--group", type=str, default="1dc", help="hosts group to run on"
+ "-g", "--group", type=str, default="reg", help="hosts group to run on"
)
args = parser.parse_args()

62 changes: 42 additions & 20 deletions scripts/remote_killall.py
@@ -9,34 +9,40 @@
TOML_FILENAME = "scripts/remote_hosts.toml"


- def killall_on_targets(destinations, cd_dir, chain=False):
- cmd = ["./scripts/kill_all_procs.sh", "incl_distr"]
- if chain:
- pass # placeholder line
+ def compose_kill_cmds(chain=False, cockroach=False, zookeeper=False, etcd=False):
+ cmds = [["./scripts/kill_all_procs.sh", "incl_distr"]]
+ return cmds

- print("Running kill commands in parallel...")
- procs = []
- for remote in destinations:
- procs.append(
- utils.proc.run_process_over_ssh(
- remote,
- cmd,
- cd_dir=cd_dir,
- capture_stdout=True,
- capture_stderr=True,
- )
- )

- print("Waiting for command results...")
- utils.proc.wait_parallel_procs(procs)
+ def killall_on_targets(
+ destinations, cd_dir, chain=False, cockroach=False, zookeeper=False, etcd=False
+ ):
+ cmds = compose_kill_cmds(
+ chain=chain, cockroach=cockroach, zookeeper=zookeeper, etcd=etcd
+ )
+ for cmd in cmds:
+ print("Running kill commands in parallel...")
+ procs = []
+ for remote in destinations:
+ procs.append(
+ utils.proc.run_process_over_ssh(
+ remote,
+ cmd,
+ cd_dir=cd_dir,
+ capture_stdout=True,
+ capture_stderr=True,
+ )
+ )
+ print("Waiting for command results...")
+ utils.proc.wait_parallel_procs(procs)


if __name__ == "__main__":
utils.file.check_proper_cwd()

parser = argparse.ArgumentParser(allow_abbrev=False)
parser.add_argument(
- "-g", "--group", type=str, default="1dc", help="hosts group to run on"
+ "-g", "--group", type=str, default="reg", help="hosts group to run on"
)
parser.add_argument(
"-t",
@@ -48,6 +54,15 @@ def killall_on_targets(destinations, cd_dir, chain=False):
parser.add_argument(
"--chain", action="store_true", help="if set, kill ChainPaxos processes"
)
+ parser.add_argument(
+ "--cockroach", action="store_true", help="if set, kill CockroachDB processes"
+ )
+ parser.add_argument(
+ "--zookeeper", action="store_true", help="if set, kill ZooKeeper processes"
+ )
+ parser.add_argument(
+ "--etcd", action="store_true", help="if set, kill etcd processes"
+ )
args = parser.parse_args()

base, repo, _, remotes, _, _ = utils.config.parse_toml_file(
@@ -66,4 +81,11 @@ def killall_on_targets(destinations, cd_dir, chain=False):
if len(destinations) == 0:
raise ValueError(f"targets list is empty")

- killall_on_targets(destinations, f"{base}/{repo}", args.chain)
+ killall_on_targets(
+ destinations,
+ f"{base}/{repo}",
+ args.chain,
+ args.cockroach,
+ args.zookeeper,
+ args.etcd,
+ )
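
Note that `compose_kill_cmds()` in this diff accepts the new flags but always returns the single `kill_all_procs.sh` command. As a rough sketch of how such a helper might grow one extra command per enabled flag, consider the variant below. Everything beyond the first command is an assumption for illustration: the function name `compose_kill_cmds_sketch`, the use of `pkill -9 -f`, and the process-name patterns are hypothetical and are not taken from the repository.

```python
def compose_kill_cmds_sketch(chain=False, cockroach=False, zookeeper=False, etcd=False):
    """Hypothetical variant: append one extra kill command per enabled flag."""
    # This first command is the one the diff above always issues.
    cmds = [["./scripts/kill_all_procs.sh", "incl_distr"]]

    # The patterns below are assumptions for illustration only; `chain` is
    # accepted for signature parity and has no effect in this sketch.
    extra = [
        (cockroach, "cockroach"),
        (zookeeper, "zookeeper"),
        (etcd, "etcd"),
    ]
    for enabled, pattern in extra:
        if enabled:
            # pkill -9 -f: send SIGKILL to processes whose command line matches
            cmds.append(["pkill", "-9", "-f", pattern])
    return cmds


if __name__ == "__main__":
    for cmd in compose_kill_cmds_sketch(etcd=True):
        print(" ".join(cmd))
```

Returning a list of commands keeps `killall_on_targets()` unchanged: it already loops over `cmds` and runs each one on all destinations in parallel before moving to the next.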