Initial draft of scenario interface. #70
base: master
Changes from 1 commit
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
syntax = "proto3"; | ||
package proto; | ||
import "google/protobuf/any.proto"; | ||
|
||
message Scenario { | ||
Load load = 1; | ||
Cluster cluster = 2; | ||
Workload workload = 3; | ||
repeated Plugin plugin = 4; | ||
} | ||
|
||
message Load { | ||
oneof arrival_rate_oneof { | ||
LoadConstant constant = 1; | ||
LoadPeriodic periodic = 2; | ||
LoadCustom custom = 3; | ||
} | ||
oneof request_weight_oneof { | ||
RequestConstant request_constant = 4; | ||
} | ||
} | ||
|
||
message LoadConstant { | ||
int32 arrivals_per_second = 1; | ||
What's the difference between 'arrival' and 'load'? Would it make sense to stick to one term, e.g. load_rate / load_per_second? Alternatively, I'd go with query_per_second nomenclature, as QPS is a better-known term (I think).
So an "arrival" is a "request". But I wanted to avoid HTTP-oriented language to allow for pub-sub use cases. "load" is the pattern of arrival / request. How about sticking to the terms "load" and "operation"?
I've seen QPS being used in a pub-sub context, so I wouldn't worry about that. That said, I don't feel strongly; if you prefer 'arrivals', that sounds fine to me. |
||
} | ||
|
||
message LoadPeriodic { | ||
oneof shape_oneof { | ||
ShapeSine sine = 1; | ||
I wouldn't be sure how to interpret the semantics of Sine and Asdr, e.g. how exactly should I interpret width_seconds / amplitude for Sine, will they repeat, for how long, etc. I hope comments will clarify these things. P.S. Might be better to define Sine fields directly for clarity, rather than trying to reuse the LoadInterval structure.
Yeah, it seems awkward. I'll change sine to a frequency / amplitude struct. Or I could drop it entirely.
Sine is still a useful approximation for load patterns with seasonality (which is basically all of them). It has the nice property that you can more easily look for resonant frequencies at which an autoscaler misbehaves. |
||
ShapeAsdr asdr = 2; | ||
} | ||
} | ||
|
||
message ShapeSine { | ||
LoadInterval interval = 1; | ||
} | ||
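To make the frequency / amplitude idea from the thread above concrete, here is a minimal Python sketch of a sine-shaped arrival rate. It is not part of the PR: the function name `sine_rate`, the `baseline_aps` offset, and the clamp at zero are all assumptions made for illustration.

```python
import math


def sine_rate(t_seconds: float, baseline_aps: float,
              amplitude_aps: float, period_seconds: float) -> float:
    """Arrivals per second at time t for a sine-shaped load.

    Hypothetical helper: baseline_aps and the clamp at zero are
    assumptions, not fields defined in the proto under review.
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    phase = 2 * math.pi * t_seconds / period_seconds
    rate = baseline_aps + amplitude_aps * math.sin(phase)
    # A negative arrival rate has no meaning, so clamp it.
    return max(0.0, rate)
```

Sweeping `period_seconds` over a range is one way to look for the resonant frequencies mentioned in the thread.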
|
||
message ShapeAsdr { | ||
jchesterpivotal marked this conversation as resolved.
Is the ASDR case so common and helpful? Couldn't I express the same by putting a bunch of LoadInterval into LoadCustom? (That seems simpler and more generic to me, I wouldn't be constrained to the ASDR model, and I couldn't abuse the ASDR model by defining, for example, decay higher than attack.) P.S. Is it ASDR or ADSR? Wikipedia calls it ADSR per Jacques's link: https://en.wikipedia.org/wiki/Envelope_(music).
I was shooting for a middle ground between a constant value and fully custom values. ADSR is periodic, so you define the wave shape and it repeats. It's nice for describing something simple like "I have a spike every hour on the hour." With custom, you have to lay out as many points as the simulation is long. This is the escape hatch for all load patterns we can't express. ADSR subsumes the step and ramp patterns currently in Skenario.
I see. If it's periodic, I can see how it can be more convenient than custom. |
||
LoadInterval initial = 1; | ||
LoadInterval attack = 2; | ||
LoadInterval sustain = 3; | ||
LoadInterval decay = 4; | ||
LoadInterval release = 5; | ||
} | ||
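The periodic behaviour described in the thread above ("you define the wave shape and it repeats") could look roughly like the following Python sketch. It is an illustration only: the `periodic_load` helper is hypothetical, and it assumes each interval holds a flat amplitude for its whole width, which the PR does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoadInterval:
    # Mirrors the proto message of the same name.
    width_seconds: int
    amplitude_aps: int  # arrivals per second


def periodic_load(intervals: List[LoadInterval],
                  duration_seconds: int) -> List[int]:
    """Repeat one period of intervals for duration_seconds samples.

    Assumption: each interval is a flat step, not a ramp.
    """
    # Expand one period into one sample per second.
    period = [iv.amplitude_aps
              for iv in intervals
              for _ in range(iv.width_seconds)]
    if not period:
        return [0] * duration_seconds
    # Wrap around the period until the simulation ends.
    return [period[t % len(period)] for t in range(duration_seconds)]
```

With a custom load, by contrast, the caller would have to supply every interval for the full length of the simulation.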
|
||
message LoadCustom { | ||
repeated LoadInterval interval = 1; | ||
} | ||
|
||
message LoadInterval { | ||
int32 width_seconds = 1; | ||
int32 amplitude_aps = 2; | ||
APS? Is this a typo for QPS (the Google shorthand as I understand it)? If so, should it be in per-second units? I can see that it's easier to approach from how folks usually express load, but it also means that we'll be internally doing a I feel like it would be easier to give a total number of load units (requests, queries). That does put it back on the user to perform a calculation but it means that they don't have to perform any calculations to recover the total number.
I was calling it "arrivals per second" because I want to remain open to pub-sub use-cases. But "operations per second" would also work. Would that be more intuitive? (QPS, "queries per second", is probably not the best choice.)
No, I think you're right. Arrivals/sec is a better unit. Perhaps rename as |
||
} | ||
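The trade-off in the thread above, between giving a per-second rate and giving a total count, comes down to one multiplication or division per interval. A small Python sketch, assuming the rate is constant across the interval (both function names are hypothetical):

```python
def total_arrivals(width_seconds: int, amplitude_aps: int) -> int:
    """Total arrivals in an interval with a constant per-second rate."""
    return width_seconds * amplitude_aps


def rate_from_total(width_seconds: int, total: int) -> float:
    """Per-second rate recovered from a total count over an interval."""
    if width_seconds <= 0:
        raise ValueError("width_seconds must be positive")
    return total / width_seconds
```

Whichever unit the schema stores, the other can be recovered as long as `width_seconds` is known and non-zero.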
|
||
message RequestConstant { | ||
int32 cpu_milliseconds = 1; | ||
int32 io_milliseconds = 2; | ||
int32 memory_bytes = 3; | ||
} | ||
|
||
message Cluster { | ||
You could also skip for now, per https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it. We can always add later once we know what exactly we need.
Touché. I'll drop it like a hot potato. |
||
// infinite capacity | ||
} | ||
|
||
message Workload { | ||
int32 readiness_delay_seconds = 1; | ||
int32 termination_delay_seconds = 2; | ||
} | ||
|
||
message Plugin { | ||
string type = 1; | ||
int32 version = 2; | ||
PluginPoint point = 3; | ||
google.protobuf.Any configuration = 4; | ||
} | ||
|
||
enum PluginPoint { | ||
HORIZONTAL_AUTOSCALING = 0; | ||
VERTICAL_AUTOSCALING = 1; | ||
REQUEST_ROUTING = 2; | ||
} | ||
|
||
message Result { | ||
repeated Metric metric = 1; | ||
} | ||
|
||
message Metric { | ||
MetricType type = 1; | ||
MetricResource resource = 2; | ||
repeated MetricPoint point = 3; | ||
I'm confused by the combination of statistics (average, P50, P90) and time series ("repeated MetricPoint"). For example, what would it mean to return 10 MetricPoints with type P99? Also, how does the system know which statistics are requested? The way I'd see it is that you should only have a time series in the result, maybe with a requested resolution. The caller can then do whatever they want with these time series, e.g. draw, compute custom percentiles, etc.
The MetricPoints vary the time and value. For example, P50 latency might be 1 second at T0, 2 seconds at T1, etc. Metric is a typed metric stream covering the duration of the simulation. |
||
} | ||
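The reviewer's alternative above, returning only a raw time series and letting the caller compute statistics, can be sketched in Python as follows. This is an illustration rather than anything in the PR: the `percentile` helper is hypothetical, points are modelled as `(time_millis, value)` tuples, and the nearest-rank method is an assumed choice.

```python
import math
from typing import List, Tuple


def percentile(points: List[Tuple[int, float]], p: float) -> float:
    """Nearest-rank percentile over the values of a time series.

    points are (time_millis, value) pairs; timestamps are ignored.
    """
    if not points:
        raise ValueError("points must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    values = sorted(value for _, value in points)
    # Nearest-rank: smallest value with at least p% of samples at or below it.
    rank = math.ceil(p / 100 * len(values))
    return values[rank - 1]
```

The design in the PR instead emits one Metric stream per statistic, so each MetricPoint already carries, say, the P50 latency at that timestamp.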
|
||
enum MetricType { | ||
I'd suggest we call this |
||
COUNT = 0; | ||
AVERAGE = 1; | ||
jchesterpivotal marked this conversation as resolved.
|
||
P50 = 2; | ||
P90 = 3; | ||
P99 = 4; | ||
P100 = 5; | ||
} | ||
|
||
enum MetricResource { | ||
CPU_USAGE_MILLIS = 0; | ||
CPU_CAPACITY_MILLIS = 1; | ||
MEMORY_USAGE_BYTES = 2; | ||
MEMORY_CAPACITY_BYTES = 3; | ||
WORK_ARRIVAL = 4; | ||
WORK_SUCCESS = 5; | ||
WORK_FAILURE = 6; | ||
LATENCY_MILLIS = 7; | ||
POD_COUNT = 8; | ||
} | ||
|
||
message MetricPoint { | ||
int32 time_millis = 1; | ||
double value = 2; | ||
} |
Could we document each field? (there are no comments)
I'd suggest erring on the side of each field having a comment, even if some already seem clear. It will help readers follow what is what (e.g. 'attack', 'sustain', 'decay').
Yes, absolutely.