BAAS-28253: Add optional tick tracking at the function level #115

Merged: 18 commits, Feb 12, 2024. The diff below shows changes from 15 commits.
20 changes: 20 additions & 0 deletions runtime.go
@@ -204,6 +204,9 @@
limiter *rate.Limiter
limiterTicksLeft int
ticks uint64

tickMetricTrackingEnabled bool
tickMetrics map[string]uint64
}

func (self *Runtime) Ticks() uint64 {
@@ -489,7 +492,9 @@
}
r.vm.init()

r.tickMetrics = make(map[string]uint64)
@Calvinnix (Author), Feb 2, 2024:
I was going to initialize this map based on the functionTickTrackingEnabled setting, but that would actually cause a problem (as demonstrated by the test). We need the vm object in order to enable tick tracking, but we also check whether it is enabled when we create the vm object. So I think it's better to just always initialize this map.

			vm := New()
			if tc.functionTickTrackingEnabled {
				vm.EnableTickMetricTracking()
			}

(^ snippet from the unit test)

Reviewer:
[opt] I would document this in code
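For what it's worth, a possible shape for that inline note (the wording here is ours, not part of the diff):

	// Always allocate the tick-metrics map: tracking can only be enabled after
	// the Runtime exists (EnableTickMetricTracking needs the *Runtime), so gating
	// this allocation on tickMetricTrackingEnabled would never see it set here.
	// An empty map is cheap, so unconditional initialization is the simplest option.
	r.tickMetrics = make(map[string]uint64)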


r.SetRateLimiter(rate.NewLimiter(rate.Inf, maxInt))

Check failure on line 497 in runtime.go (GitHub Actions / test (1.16.x, ubuntu-latest, 386)): constant 9007199254740992 overflows int, i.e. 2^53 does not fit in a 32-bit int.
}

func (r *Runtime) typeErrorResult(throw bool, args ...interface{}) {
@@ -1528,6 +1533,7 @@
vm.clearStack()
} else {
vm.stack = nil
vm.profileTicks()
@Calvinnix (Author):
Without this, we would miss the ticks tracked for the vm.runTry call above:

	ex := vm.runTry(r.vm.ctx)
	if ex == nil {
		result = r.vm.result
	} else {
		err = ex
	}
	if recursive {
		vm.popCtx()
		vm.halt = false
		vm.clearStack()
	} else {
		vm.stack = nil
		vm.profileTicks()
		vm.prg = nil
		vm.setFuncName("")
		r.leave()
	}

It could turn out that this isn't necessary because it is already captured by pushCtx/restoreCtx, but I don't think it hurts to have this in place just to make sure we're capturing any lingering ticks for the program. Worst case, there are no new ticks and we add 0 for the program.

If we think this clutters the code I can dig more into this and see if it is necessary.

@Calvinnix (Author):
Yeah, without this we would miss a few ticks in a few edge cases; not the end of the world, but probably nice to keep around.


Reviewer:
Keeping it works for me.

vm.prg = nil
vm.setFuncName("")
r.leave()
@@ -3229,3 +3235,17 @@
}
return defProto
}

func (self *Runtime) EnableTickMetricTracking() {
self.tickMetricTrackingEnabled = true
}

func (self *Runtime) DisableTickMetricTracking() {
self.tickMetricTrackingEnabled = false
}

Reviewer:
should we add a comment for these exported functions?
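One possible phrasing for those doc comments (the wording here is ours, not part of the diff):

	// EnableTickMetricTracking turns on per-function tick accounting; totals are
	// reported by TickMetrics.
	func (self *Runtime) EnableTickMetricTracking() {
		self.tickMetricTrackingEnabled = true
	}

	// DisableTickMetricTracking turns per-function tick accounting back off.
	// Metrics already recorded are left in place.
	func (self *Runtime) DisableTickMetricTracking() {
		self.tickMetricTrackingEnabled = false
	}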


// TickMetrics returns a map of ticks used per function.
// This function is not thread-safe and should only be called at the end of the function execution.
func (self *Runtime) TickMetrics() map[string]uint64 {
return self.tickMetrics
}
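For context, a minimal usage sketch of the new API. It assumes the snippet lives in the same package as the tests below (so New and MustCompile are in scope) and that fmt is imported; the helper name and tick values are ours, purely illustrative:

	// dumpTickMetrics shows the intended call pattern: enable tracking, run a
	// program to completion, then read the per-function totals.
	func dumpTickMetrics() {
		vm := New()
		vm.EnableTickMetricTracking()

		prg := MustCompile("test.js", `
			function f() {
				for (var i = 0; i < 100; i++) {}
			}
			f()
		`, false)
		if _, err := vm.RunProgram(prg); err != nil {
			panic(err)
		}

		// Keys are "<source name>.<function name>"; top-level code has an empty
		// function name, so it shows up as "test.js.".
		for name, ticks := range vm.TickMetrics() {
			fmt.Printf("%s: %d ticks\n", name, ticks)
		}
	}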
20 changes: 20 additions & 0 deletions vm.go
@@ -306,12 +306,30 @@ type vm struct {
interrupted uint32
interruptVal interface{}
interruptLock sync.Mutex

lastFunctionTicks uint64
}

type instruction interface {
exec(*vm)
}

// profileTicks tracks the ticks for the current Program
func (vm *vm) profileTicks() {
if !vm.r.tickMetricTrackingEnabled {
return
}
currentTicks := vm.r.Ticks()
if vm.prg != nil {
function := string(vm.prg.funcName)
if vm.prg.src != nil {
function = vm.prg.src.Name() + "." + function
}
vm.r.tickMetrics[function] += currentTicks - vm.lastFunctionTicks
}
vm.lastFunctionTicks = currentTicks
}

func (vm *vm) getFuncName() unistring.String {
vm.funcNameLock.RLock()
s := vm.funcName
@@ -857,6 +875,7 @@ func (vm *vm) saveCtx(ctx *vmContext) {
}

func (vm *vm) pushCtx() {
vm.profileTicks()
if len(vm.callStack) > vm.maxCallStackSize {
panic(rangeError("Maximum call stack size exceeded"))
}
@@ -866,6 +885,7 @@
}

func (vm *vm) restoreCtx(ctx *vmContext) {
vm.profileTicks()
vm.prg, vm.stash, vm.privEnv, vm.newTarget, vm.result, vm.pc, vm.sb, vm.args =
ctx.prg, ctx.stash, ctx.privEnv, ctx.newTarget, ctx.result, ctx.pc, ctx.sb, ctx.args

85 changes: 85 additions & 0 deletions vm_test.go
@@ -594,3 +594,88 @@ func TestStashMemUsage(t *testing.T) {
})
}
}

func TestTickTracking(t *testing.T) {
Reviewer:
[nit] The one test I'm missing here is a function that throws an error. I was playing around with popCtx vs restoreCtx, and most of the time profiling from either yields the same results. The only exception was when profiling inside restoreCtx (like you are in this PR): I noticed different tick values when my function was throwing an error. Just thought it would be useful to have that case here. Realistically it will pass the test anyway and we won't see much difference, but profiling from popCtx only was missing out on some ticks.

@Calvinnix (Author):
Good point! I added a test to cover that we track ticks for a function that errors. 👍
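Something along these lines, as a sketch: the case actually added in the later commits may differ, and the test loop below would also need to tolerate the expected error instead of calling t.Fatal on it.

		{
			name: "should track ticks for a function that throws",
			script: `
				function boom() {
					for (var i = 0; i < 100; i++) {}
					throw new Error("boom");
				}
				boom()
			`,
			functionTickTrackingEnabled: true,
			expectedTickMetricsKeys:     []string{"test.js.", "test.js.boom"},
		},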

tests := []struct {
name string
script string
functionTickTrackingEnabled bool
expectedTickMetricsKeys []string
}{
{
name: "should track looping function ticks when tick tracking is enabled",
script: `
function f() {
for (var i = 0; i < 100; i++) {
}
}
f()
`,
functionTickTrackingEnabled: true,
expectedTickMetricsKeys: []string{"test.js.", "test.js.f"},
},
{
name: "should track larger looping function ticks when tick tracking is enabled",
script: `
function f() {
for (var i = 0; i < 1000; i++) {
}
}
f()
`,
functionTickTrackingEnabled: true,
expectedTickMetricsKeys: []string{"test.js.", "test.js.f"},
},
{
name: "should track fib function ticks when tick tracking is enabled",
script: `
function fib(n) {
if (n < 2) return n;
return fib(n - 2) + fib(n - 1);
}
fib(35);
`,
functionTickTrackingEnabled: true,
expectedTickMetricsKeys: []string{"test.js.", "test.js.fib"},
},
{
name: "should not track function ticks when tick tracking is disabled",
script: `
function f() {
for (var i = 0; i < 100; i++) {
}
}
f()
`,
functionTickTrackingEnabled: false,
expectedTickMetricsKeys: []string{},
},
}

for _, tc := range tests {

Reviewer:
[q1] Can we add some more tests, such as executing class functions and multiple nested functions?

[q2] Also, do you mind sharing the metrics that we would generate from BaaS when using a dependency such as the AWS SDK? The easiest way to do this might be to run our dependency tests in Evergreen with tracking enabled.

@Calvinnix (Author):

[q1] Can we add some more tests, such as executing class functions and multiple nested functions?

Yeah absolutely!

[q2] Also, do you mind sharing the metrics that we would generate from BaaS when using a dependency such as the AWS SDK? The easiest way to do this might be to run our dependency tests in Evergreen with tracking enabled.

Would that data be in Splunk? Or would we need to parse the output of the tests?

@Calvinnix (Author):
Evergreen task with tick metric tracking hardcoded to true. Will update this thread with the results.

Reviewer:
Sounds good. I'm curious to see if the current approach makes it easy enough for us to identify which function is consuming the most ticks. Let me know when this is ready.

@Calvinnix (Author), Feb 5, 2024:
Hmm, for some reason the logs aren't showing the "tick metrics" logs for that Evergreen run.

I think my local testing might alleviate some of your concerns though.

exports = async function(arg){
  const { Buffer } = require('buffer');
  const loremIpsumText = // long string
  return Buffer.from(loremIpsumText);
};

[Screenshot of the logged tick metrics. Note that the tickMetricsToLog array is sorted, so we are looking at the top 5 tick contributors.]

The majority of usage for this Buffer.from example is BufferJS::utf8ToBytes and BufferJS::blitBuffer.

@Calvinnix (Author):

I'll also call out that these tick metrics are flat (not cumulative), in that we are measuring only the ticks spent in that specific function itself.

So in this case we would know that we want a native Go implementation for utf8ToBytes, and probably blitBuffer as well, which is nice because it means we don't have to potentially waste time porting over the entire Buffer.from implementation.
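To make the flat-vs-cumulative point concrete, here is a small self-contained Go sketch of the same delta-accounting pattern that profileTicks uses; the type, names, and tick counts are ours and purely illustrative:

	package main

	import "fmt"

	// tickProfiler mimics the bookkeeping in vm.profileTicks: at every program
	// context switch, the ticks elapsed since the previous switch are credited to
	// the function that was running during that interval, so each function only
	// accumulates its own ticks, never its callees'.
	type tickProfiler struct {
		ticks     uint64            // global tick counter (vm.r.Ticks() in the PR)
		lastTicks uint64            // snapshot taken at the previous context switch
		metrics   map[string]uint64 // per-function totals, keyed "source.funcName"
	}

	func (p *tickProfiler) attribute(running string) {
		p.metrics[running] += p.ticks - p.lastTicks
		p.lastTicks = p.ticks
	}

	func main() {
		p := &tickProfiler{metrics: map[string]uint64{}}

		p.ticks = 100            // 100 ticks of top-level code before calling f
		p.attribute("test.js.")  // pushCtx: credit the interval to the caller
		p.ticks = 350            // 250 ticks spent inside f's own body
		p.attribute("test.js.f") // restoreCtx: credit the interval to f
		p.ticks = 360            // 10 more ticks of top-level code after f returns
		p.attribute("test.js.")

		fmt.Println(p.metrics) // map[test.js.:110 test.js.f:250]
	}

A cumulative profiler would instead charge f for everything that ran while f was anywhere on the call stack.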

@Calvinnix (Author), Feb 5, 2024:

Hmm, for some reason the logs aren't showing the "tick metrics" logs for that Evergreen run.

I have a feeling this is happening because of how tick metrics are aggregated and published asynchronously. It doesn't even look like cmd/server/main.go executes, which is where this asynchronous logic would start.

I went ahead and set this up manually, though. Here are the results when running against aws-sdk-v3/s3:

[Screenshot of tick metrics for aws-sdk-v3/s3.]

Reviewer:
hmac lol

Reviewer:
oh nice work, hello mr Buffer, not surprised to see you up there!

t.Run(tc.name, func(t *testing.T) {
vm := New()
if tc.functionTickTrackingEnabled {
vm.EnableTickMetricTracking()
}

prg := MustCompile("test.js", tc.script, false)
_, err := vm.RunProgram(prg)
if err != nil {
t.Fatal(err)
}

actualTickMetrics := vm.TickMetrics()

if len(actualTickMetrics) != len(tc.expectedTickMetricsKeys) {
t.Fatalf("Unexpected tickMetrics length: Actual: %v Expected length: %v", actualTickMetrics, len(tc.expectedTickMetricsKeys))
}

for _, key := range tc.expectedTickMetricsKeys {
if actualTickMetrics[key] <= 0 {
t.Fatalf("Unexpected tickMetrics for key: %v. Expected a positive tick value but received: %v", key, actualTickMetrics[key])
}
}
})
}
}