This repository has been archived by the owner on May 10, 2024. It is now read-only.

artifacts pipeline and sample plots #147

Open
wants to merge 44 commits into
base: delta
44 commits
4699ff4
k8s-scheduler: add test case that instantiates ddlog-sql
lalithsuresh Jul 29, 2021
6fc1b1b
k8s-scheduler: perform insert/update in test case
lalithsuresh Jul 29, 2021
99697fc
k8s-scheduler: fix checkstyle/spotbugs warnings
lalithsuresh Jul 30, 2021
d817727
k8s-scheduler: exclude transitive dependencies
lalithsuresh Jul 30, 2021
8b233f7
k8s-scheduler: add test case that instantiates ddlog-sql
lalithsuresh Jul 29, 2021
8c286b5
k8s-scheduler: rebase with master
lalithsuresh Aug 18, 2021
d08e052
Modify scheduler_tables.sql so that `create table` statements can be …
amytai Aug 19, 2021
8e7e5ec
DDlogDBViews and ddlog_scheduler_tables.sql updates to match the new …
amytai Sep 30, 2021
4534cef
Add identity views for each input table, so that we can query input t…
amytai Sep 30, 2021
679536b
Eliminate transitive imports from DDlog, so that we have 1 version of…
amytai Oct 7, 2021
4645809
Port topology tables and views to DDlog versions. Also add indexes
amytai Oct 7, 2021
86b351f
Changes to make DCM run reasonably with DDlog. WorkloadReplayTest tes…
amytai Nov 4, 2021
bae6e8e
Make DDlog work with Autoscope "_sorted" views
amytai Nov 9, 2021
0196d98
Preemption views compile successfully, AutoScope "_augmented" also seem…
amytai Nov 9, 2021
6a16263
Remove 2x rebuild of the DDlog program, as this causes the runtime to…
amytai Nov 10, 2021
e92067a
Some nits: timing, remove limit from Autoscope
amytai Nov 11, 2021
cf36222
Add hand-optimized DDlog programs
amytai Nov 11, 2021
03b6379
Add functionality in benchmarking code to optionally provide a hand-o…
amytai Nov 11, 2021
0b7ebe0
Move hand-optimized DDlog programs to src/main/resources
amytai Nov 11, 2021
57710c1
Modify Translator signature to match new DDlog Java API
amytai Nov 15, 2021
e01aca7
Add command-line flags to Scheduler::main to optionally use DDlog bac…
amytai Nov 15, 2021
0a33316
k8s-scheduler: prefix all DB dumps with debug_
lalithsuresh Nov 16, 2021
676c865
k8s-scheduler: produce a DB dump if we fallback to preemption
lalithsuresh Nov 16, 2021
5c9dc09
k8s-scheduler: fix debug-mode flag
lalithsuresh Nov 16, 2021
7595eca
k8s-scheduler: use select * form in DebugUtils.dbDump()
lalithsuresh Nov 16, 2021
3fe428b
k8s-scheduler: fix checkstyle and spotbugs issues
lalithsuresh Nov 16, 2021
8cb154a
k8s-scheduler: re-order test arguments to reduce compilation times
lalithsuresh Nov 16, 2021
2b0cc5b
At the beginning of SchedulerTest, compile the DDlog program w Scoped…
amytai Nov 17, 2021
3416c87
Updates to make most tests in SchedulerTest pass with DDlog backend
amytai Nov 17, 2021
c316205
k8-scheduler: checkpoint
lalithsuresh Nov 18, 2021
0d2b066
k8-scheduler: temporary workaround for spare capacity while ddlog-lef…
lalithsuresh Nov 18, 2021
5a02a96
k8s-scheduler: workaround until DDLJP.fetchTable() works correctly wi…
lalithsuresh Nov 18, 2021
d256534
k8s-scheduler: requeue() should update by pod.uid. testRequeue() now …
lalithsuresh Nov 18, 2021
c503641
Add LIMIT back into pod view
amytai Nov 18, 2021
040e788
k8s-scheduler: avoid some circular dependencies
lalithsuresh Nov 19, 2021
6750477
k8s-scheduler: simplify DDlog connection initialization
lalithsuresh Nov 19, 2021
60f3c23
k8s-scheduler: add findbugs exclusion because of false positive
lalithsuresh Nov 19, 2021
37999ec
k8s-scheduler: pass all tests in SchedulerTest except testPreempt()
lalithsuresh Nov 19, 2021
e640bbd
k8s-scheduler: make testPreempt() pass
lalithsuresh Nov 19, 2021
40fd3ec
k8s-scheduler: fix some checkstyle issues and remaining ScopeTest tests
lalithsuresh Nov 19, 2021
5e62be4
artifacts pipeline and sample plots
askiad Nov 17, 2021
160c200
removing sample artifact outputs
askiad Nov 18, 2021
7e16304
adding side-by-side barplots for different schedulers
askiad Nov 20, 2021
2d4fd1f
restore SchedulerTest
askiad Nov 20, 2021
24 changes: 24 additions & 0 deletions artifact/generate_trace.py
@@ -0,0 +1,24 @@
from random import randrange

GROUPS = 5
GROUP_SIZE = 20
TIME_STEP = 300

with open("generated-data.txt", "w") as f:
    start = 0
    end = 500

    for _ in range(GROUPS):
        for _ in range(GROUP_SIZE):
            ida = 1 + randrange(10000)
            idb = 1 + randrange(10000)
            cpu = 1 + randrange(8)
            mem = 1 + randrange(32)
            vms = 1 + randrange(100)

            parts = [ida, idb, start, end, cpu, mem, vms]
            line = " ".join(str(x) for x in parts)
            f.write(line + '\n')

        start += TIME_STEP
        end += TIME_STEP
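For reference, a minimal sketch of reading the generated trace back into Python. The field names are assumptions inferred from the variable names in `generate_trace.py` above (the file itself stores only unlabeled, space-separated integers):

```python
# Sketch: parse generated-data.txt back into dicts.
# Field order (ida, idb, start, end, cpu, mem, vms) is assumed from the writer above.
FIELDS = ["ida", "idb", "start", "end", "cpu", "mem", "vms"]

def read_trace(path="generated-data.txt"):
    rows = []
    with open(path) as f:
        for line in f:
            values = [int(x) for x in line.split()]
            rows.append(dict(zip(FIELDS, values)))
    return rows
```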
339 changes: 183 additions & 156 deletions artifact/plot.r

Large diffs are not rendered by default.

215 changes: 154 additions & 61 deletions artifact/process_trace.py
@@ -1,4 +1,4 @@
#!/bin/python
#!/usr/bin/python

import sys
import glob
@@ -12,17 +12,17 @@
sys.exit(1)

folderToOpen = sys.argv[1]
traceFolders = glob.glob(folderToOpen + "/*/")
database = sys.argv[2]
conn = sqlite3.connect(database)
traceFolders = glob.glob(folderToOpen + "/*/*/")

conn.execute("drop table if exists params")
conn.execute("drop table if exists pods_over_time")
conn.execute("drop table if exists pod_events")
conn.execute("drop table if exists scheduler_trace")
conn.execute("drop table if exists dcm_table_access_latency")
conn.execute("drop table if exists dcm_metrics")
conn.execute("drop table if exists scope_fraction")
conn.execute("drop table if exists event_trace")
conn.execute("drop table if exists problem_size")

conn.execute('''
create table params
@@ -32,25 +32,24 @@
scheduler varchar(100) not null,
solver varchar(100) not null,
kubeconfig integer not null,
dcmGitBranch integer not null,
dcmGitCommitId varchar(100) not null,
numNodes integer not null,
startTimeCutOff integer not null,
percentageOfNodesToScoreValue integer not null,
timeScaleDown integer not null,
scenario integer not null,
affinityProportion integer not null
)
''')

conn.execute('''
create table pods_over_time
create table pod_events
(
expId integer not null,
batchId integer not null,
podName varchar(100) not null,
status varchar(100) not null,
uid varchar(100) not null,
event varchar(100) not null,
eventTime integer not null,
nodeName varchar(100) not null,
foreign key(expId) references params(expId)
)
''')
@@ -66,7 +65,6 @@
)
''')


conn.execute('''
create table dcm_table_access_latency
(
@@ -79,6 +77,7 @@
)
''')

# FIXME: missing scope latency
conn.execute('''
create table dcm_metrics
(
@@ -102,6 +101,7 @@
)
''')

# FIXME: missing scope fraction
conn.execute('''
create table scope_fraction
(
@@ -112,6 +112,7 @@
)
''')


def convertRunningSinceToSeconds(runningSince):
minutes = 0
if ("m" in runningSince):
@@ -150,7 +151,7 @@ def convertK8sTimestamp(k8stimestamp):
paramFiles = glob.glob(trace + "metadata")
assert len(paramFiles) == 1

print("experiment ID: ", end='')
print("Trace processor running for experiment ID: ", end='')
print(expId)

# Add params file in addition
@@ -167,36 +168,21 @@ def convertK8sTimestamp(k8stimestamp):
expParams["schedulerName"],
expParams["solver"],
expParams["kubeconfig"],
expParams["dcmGitBranch"],
expParams["dcmGitCommitId"],
expParams["numNodes"],
expParams["startTimeCutOff"],
expParams["percentageOfNodesToScoreValue"],
expParams["timeScaleDown"],
expParams["scenario"],
expParams["affinityProportion"]))


# Get the list of pods
pods = {}
with open(trace + "/workload_output") as workloadOutput:
for line in workloadOutput:
if ("org.dcm.WorkloadGeneratorIT" in line and "PodName" in line):
splitLine = line.split("org.dcm.WorkloadGeneratorIT")[-1].split(" ")
eventTime = splitLine[3][:-1]
podName = splitLine[7][:-1]
nodeName = splitLine[9][:-1]
status = splitLine[11][:-1]
event = splitLine[13].strip()
conn.execute("insert into pods_over_time values (?, ?, ?, ?, ?, ?)",
(expParams["expId"], podName, status, event, eventTime, nodeName))

dcmSchedulerFile = glob.glob(trace + "/dcm_scheduler_trace")
if (len(dcmSchedulerFile) > 0):
with open(dcmSchedulerFile[0]) as traceFile:
trace = glob.glob(trace + "/trace")
if (len(trace) > 0):
with open(trace[0]) as traceFile:
batchId = 1
metrics = {}
metrics["scopeLatency"] = 0 # when scope is not used
variablesBeforePresolve = False
variablesBeforePresolve = True
for line in traceFile:
if ("Fetchcount is" in line):
metrics["fetchcount"] = line.split()[-1]
@@ -206,7 +192,6 @@ def convertK8sTimestamp(k8stimestamp):
tableName = split[8]
latencyToDb = split[10]
latencyToReflect = split[19]

conn.execute("insert into dcm_table_access_latency values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, tableName, latencyToDb, latencyToReflect))

@@ -233,7 +218,7 @@ def convertK8sTimestamp(k8stimestamp):
variablesBeforePresolve = True
metrics["variablesBeforePresolve"] = numVariablesTotal
metrics["variablesBeforePresolveObjective"] = numVariablesObjective

elif ("#Variables" in line and variablesBeforePresolve == True):
split = line.split()
numVariablesTotal = split[1]
@@ -295,38 +280,146 @@
metrics["databaseLatencyTotal"]
))

if ("Adding pod" in line):
split = line.split("Adding pod")[-1].split()
pod = split[0]
uid = split[2].split(',')[0]
event = "ADD"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Received stale event for pod that we already deleted:" in line):
split = line.split("Received stale event for pod that we already deleted")[-1].split()
pod = split[0]
uid = split[2].split(',')[0]
event = "IGNORE_ON_ADD_ALREADY_DELETED"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Deleting pod" in line):
split = line.split("Deleting pod")[-1].split()
pod = split[0]
uid = split[2].split(',')[0]
event = "DELETE"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("does not exist. Skipping" in line):
split = line.split("does not exist. Skipping")[0].split()
pod = split[-3]
uid = split[-1].split(')')[0]
event = "IGNORE_ON_UPDATE_NO_EXIST"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Received a stale pod event" in line):
split = line.split("Received a stale pod event")[-1].split()
pod = split[0]
uid = split[2].split(',')[0]
event = "IGNORE_ON_UPDATE_STALE"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Received a duplicate event for a node that we have already scheduled" in line):
split = line.split("have already scheduled")[-1].split()
pod = split[3].split(')')[0]
uid = ""
event = "IGNORE_ON_UPDATE_DUPLICATE"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Received stale event for pod that we already deleted:" in line):
split = line.split("we already deleted")[-1].split()
pod = split[0]
uid = split[2].split(',')[0]
event = "IGNORE_ON_UPDATE_ALREADY_DELETED"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Updating pod" in line):
split = line.split("Updating pod")[-1].split()
pod = split[0]
uid = split[2].split(',')[0]
event = "UPDATE"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Insert/Update pod" in line):
split = line.split("Insert/Update pod")[-1].split()
pod = split[1]
uid = split[0].split(',')[0]
event = "INSERT/UPDATE"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Attempting to bind" in line):
split = line.split(":")[-1].split()
pod = split[0]
uid = ""
event = "BIND_ATTEMPT"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Binding" in line):
split = line.split("pod:")[-2].split()
pod = split[0]
uid = split[2].split(')')[0]
event = "BIND"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Attempting to delete" in line):
split = line.split(":")[-1].split()
pod = split[0]
uid = ""
event = "DELETE_ATTEMPT"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Delete" in line):
split = line.split("pod:")[-2].split()
pod = split[0]
uid = split[2].split(')')[0]
event = "DELETE"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("Attempting Preemption" in line):
split = line.split("pod:")[-1].split()
pod = split[0]
uid = ""
event = "PREEMPT_ATTEMPT"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("will be preempted" in line):
split = line.split("pod:")[-1].split()
pod = split[0]
uid = ""
event = "PREEMPT"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))

if ("could not be assigned a node even with preemption" in line):
split = line.split("pod:")[-1].split()
pod = split[0]
uid = ""
event = "PREEMPT_FAIL"
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))



if ("Scheduling decision for pod" in line):
split = line.split("Scheduling decision for pod")[-1].split()
pod, batch, bindTime = split[0], split[5], split[-1]

pod = split[0]
batch = split[5]
bindTime = split[-1]
event = "SCHEDULE"
conn.execute("insert into scheduler_trace values (?, ?, ?, ?)",
(expParams["expId"], pod, bindTime, batch))
conn.execute("insert into pod_events values (?, ?, ?, ?, ?)",
(expParams["expId"], batchId, pod, uid, event))
batchId = int(batch) + 1


defaultSchedulerFile = glob.glob(trace + "/default_scheduler_trace")
batchId = 0
if (len(defaultSchedulerFile) > 0):
with open(defaultSchedulerFile[0]) as traceFile:
for line in traceFile:
if ("About to try and schedule pod" in line):
split = line.split()
startTime, pod = split[0], split[-1]
pod = pod.split("/")[-1] # pods here are named <namespace>/<podname>
if (pod not in pods):
pods[pod] = {}
pods[pod]["startTime"] = convertK8sTimestamp(startTime)
pods[pod]["startLine"] = split

if ("Attempting to bind" in line):
split = line.split()
bindTime, pod = split[0], split[-3]
assert pod in pods
pods[pod]["bindTime"] = convertK8sTimestamp(bindTime)
pods[pod]["endLine"] = split
assert pods[pod]["bindTime"] > pods[pod]["startTime"], pods[pod]
conn.execute("insert into scheduler_trace values (?, ?, ?, ?)",
(expParams["expId"], pod, pods[pod]["bindTime"] - pods[pod]["startTime"],
batchId))
batchId += 1
conn.commit()