Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Improve speed and reduce allocations when adding fields #402

Closed
wants to merge 10 commits into from

Conversation

MikeGoldsmith
Copy link
Contributor

@MikeGoldsmith MikeGoldsmith commented Nov 27, 2023

Which problem is this PR solving?

The cost of adding the app. prefix to all field names is expensive and can account for a significant amount of allocations as observed by perf tools. This is down to always doing a string manipulation for every field that's being added to a span or trace.

Short description of the changes

  • Add map and read-write lock to managed the cached fields
  • Add getPrefixedFieldName func that checks does the following:
    • if the key already has the "app." prefix, return it
    • look in cache to see if a prefixed version of the key already exists and return it
    • before creating the new cache entry, if the cache is > 1000 entries, clear the cache
    • add a new entry to the cache and return it
  • Add new benchmarks for adding fields when using a key that already has a prefix and when the key is always unique

New benchmarks:

BenchmarkBeelineAddField_PrefixedKey-10      	25759432	46.06 ns/op	0 B/op		0 allocs/op
BenchmarkBeelineAddField_ConsistentKey-10    	21309750	55.63 ns/op	0 B/op		0 allocs/op
BenchmarkBeelineAddField_InconsistentKey-10   	2997980		433.1 ns/op	300 B/op	2 allocs/op

Previous benchmarks (equivalent to consistent key above):

BenchmarkBeelineAddField-10    	17576912	        70.22 ns/op	8 B/op	1 allocs/op

@MikeGoldsmith MikeGoldsmith added type: enhancement New feature or request version: bump minor A PR that adds behavior, but is backwards-compatible. labels Nov 27, 2023
@MikeGoldsmith MikeGoldsmith self-assigned this Nov 27, 2023
@MikeGoldsmith MikeGoldsmith requested review from a team and lizthegrey November 27, 2023 14:07
Copy link
Member

@lizthegrey lizthegrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we land this, I suggest we actually try this in hound via an override and make sure it has the allocations benefit we expect

@JamieDanielson JamieDanielson changed the title feat: Add LRU cache for field names feat: Reduce allocations by using LRU cache for field names Nov 27, 2023
@MikeGoldsmith
Copy link
Contributor Author

MikeGoldsmith commented Nov 27, 2023

When testing, a panic happened if AddField was called before Init (or never called). This isn't an expected under normal usage but happened in tests so it's possible and shouldn't cause a panic. I've moved the cache creation to a check within the func that does the lookup, creating the cache if nil.

@lizthegrey
Copy link
Member

Unfortunately, this appears to contend the RWMutex inside the LRU cache, making this operation much slower than just doing the RAM allocation :(

Copy link
Member

@robbkidd robbkidd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under load, lock contention updating the cache was worse than the memory allocations for a concatenated string. We need to reconsider this implementation.

@cartermp
Copy link
Member

Another approach could be to only add the prefix if it doesn't already exist, and then go through various places in shepherd and update the keys we pass in to include app.. Maybe we can avoid several instances of string allocations that way? I'd also be curious which callers are the ones showing up as contributing the most to memory use, and focusing just on those.

@MikeGoldsmith
Copy link
Contributor Author

Ah, the lock contention being an issue for high volume make sense.

I agree with @cartermp - likely the best solution is to check if the key already has the app. prefix and only add if not. Then update docs to say a perf. improvement would be to prefix your names to save the allocation.

@MikeGoldsmith MikeGoldsmith changed the title feat: Reduce allocations by using LRU cache for field names feat: Reduce allocations when adding fields by checking for app. prefix first Nov 28, 2023
@MikeGoldsmith
Copy link
Contributor Author

MikeGoldsmith commented Nov 28, 2023

Also, I found that we already had benchmarks for AddField so I've added the following benchmarks:

  • when the key already has the app. prefix (best case scenario)
  • when the key is always unique (worst case scenario)

Added results to PR description.

@MikeGoldsmith MikeGoldsmith changed the title feat: Reduce allocations when adding fields by checking for app. prefix first feat: Improve speed and reduce allocations when adding fields Nov 28, 2023
@kentquirk
Copy link
Contributor

kentquirk commented Nov 28, 2023

I did some experimentation here.

@kentquirk
Copy link
Contributor

kentquirk commented Nov 28, 2023

After further experimentation, it does not seem to be reasonable to have a shared cache that does not have a significant performance bottleneck when trying to limit the size of the cache. (It's probably possible, but not without a lot of tricky code.) And not limiting the cache size might cause a major memory hit in the case of a customer who accidentally uses a high-cardinality value for a column name.

I no longer think this approach is worthwhile; a small number of short-duration allocations seems preferable to the locking problems from a shared cache. I'd suggest that we close this issue.

@MikeGoldsmith
Copy link
Contributor Author

Introducing a cache is showing to have too much lock contention, closing this in favour of #406.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: blocked type: enhancement New feature or request version: bump minor A PR that adds behavior, but is backwards-compatible.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

perf: "app." + key is allocating new objects still
5 participants